[mrtg] MRTG/RRD Performance tests: preliminary results
s.shipway at auckland.ac.nz
Mon Sep 13 04:00:12 CEST 2010
Here are the preliminary results for my performance testing of MRTG/RRD. It is very interesting that some things which I had thought to be significant turned out not to be, but others (such as daemon mode) were much more important. As a result of this investigation, I expect to be making serious changes to our organisation's configuration in order to improve performance.
All tests were done with a dual 2.5GHz Xeon system, 4GB memory, RedHat Enterprise Linux 5.4.
3164 SNMP Targets over 32 devices, with a clean pass (no SNMP errors).
In all cases, a clean directory was used, and the RRD files were created by an initial pass before running the test.
Strangely, running MRTG in Daemon mode (as opposed to scheduled mode) moves the CPU usage from the 'Nice' to the 'User' classification.
Native mode vs. RRD mode http://oss.oetiker.ch/mrtg/doc/mrtg-rrd.en.html
As expected, running in RRD mode is much more efficient. In RRD mode, MRTG needs only about 35% of the CPU that Native mode requires, although disk I/O is not greatly changed.
There is no significant change in the CPU usage between RRDTool 1.2.12, 1.3.9 and 1.4.4. RRDTool 1.4.4 seems to require slightly more disk I/O when run in Scheduled mode, but less when run in Daemon mode - probably due to its more efficient memory mapped I/O.
MRTG Daemon mode
Running MRTG in daemon mode allows it to keep file descriptors open, cache the configuration file contents, and so on. The performance gain seems to be consistently in the region of 20%, no matter which version of RRDTool you are using.
RRDTool Caching Daemon http://oss.oetiker.ch/rrdtool/doc/rrdcached.en.html
The rrdcached was introduced with RRDTool 1.4, and allows a separate daemon to take control of disk writes. For testing, this was left with default values, but if configured with more aggressive (and risky) caching you could make serious gains in I/O.
This seems to have very little impact on overall CPU use (remember that we're counting the requirements of the caching daemon in addition to those of MRTG), although you could run the rrdcached process on a separate server to spread the load or achieve a distributed setup.
However, the gains in disk IO are significant, particularly if you take into account that any web front-end can also benefit from accessing the cache rather than the data files. I was experiencing a saving of 60%+ in disk writes, although this will probably lessen if a much larger sample window is taken (I'll be running more tests on this in the future).
To make MRTG use the rrdcached, you need to set the RRDCACHED_ADDRESS environment variable before starting.
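As a rough illustration of the step above, here is a minimal Python sketch that copies the current environment, adds RRDCACHED_ADDRESS, and starts MRTG with it. The helper names, the socket path and the config path are all hypothetical placeholders for illustration, not values from my test setup; the only thing taken from the tests is that the variable has to be set before MRTG starts.

```python
import os
import subprocess


def mrtg_environment(cache_address, base_env=None):
    """Return a copy of the environment with RRDCACHED_ADDRESS set.

    cache_address is whatever address your rrdcached listens on,
    e.g. a unix: socket path (the value used below is an assumption).
    """
    env = dict(os.environ if base_env is None else base_env)
    env["RRDCACHED_ADDRESS"] = cache_address
    return env


def mrtg_command(config_path, mrtg_binary="mrtg"):
    """Build the argument list for running MRTG against one config file."""
    return [mrtg_binary, config_path]


def launch_mrtg(config_path, cache_address):
    """Start MRTG with the cache address already in its environment."""
    return subprocess.run(
        mrtg_command(config_path),
        env=mrtg_environment(cache_address),
        check=True,
    )

# Hypothetical usage (paths are placeholders, adjust to your system):
#   launch_mrtg("/etc/mrtg/mrtg.cfg", "unix:/var/run/rrdcached.sock")
```

The same effect can of course be had by exporting the variable in whatever script starts MRTG; the point is only that it must be in place before MRTG runs.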
Filesystem mount options
Still to be tested, but not anticipated to have any significant impact.
If performance is your primary goal, then the best course of action appears to be:
1. Use RRDTool mode, not Native mode, for MRTG
2. Run MRTG in daemon mode rather than via cron or other scheduler
3. Use RRDTool 1.4.x in order to benefit from enhanced disk IO routines
4. Use the rrdcached, with as large a timeout (-w) as you dare (default is 5 minutes). Set the delay (-z) if you experience disk I/O going in bursts. Increase the threads (-t) if you have fast disks (default 4).
5. Consider running multiple MRTG processes on separate servers, with the rrdcached process (and web frontend) on a separate server. Since the majority of the load comes from MRTG this allows more processes before the disk IO is saturated.
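To make the rrdcached options in point 4 a little more concrete, here is a small Python sketch that assembles an rrdcached command line from a timeout, a delay and a thread count. The function name and the example numbers are assumptions chosen purely for illustration (they are not tuned or tested values); the flags themselves (-l, -w, -z, -t) are the ones rrdcached documents, with -w and -z given in seconds.

```python
def rrdcached_command(listen, write_timeout=300, write_delay=0,
                      write_threads=4, binary="rrdcached"):
    """Build an rrdcached argument list.

    listen        -- address to listen on (-l), e.g. a unix: socket
    write_timeout -- seconds before cached updates are written (-w)
    write_delay   -- upper bound in seconds of the random write delay (-z)
    write_threads -- number of threads writing RRD files (-t)
    """
    # The rrdcached documentation says the delay should not exceed the timeout.
    if write_delay > write_timeout:
        raise ValueError("write_delay (-z) should not exceed write_timeout (-w)")
    return [
        binary,
        "-l", listen,
        "-w", str(write_timeout),
        "-z", str(write_delay),
        "-t", str(write_threads),
    ]


# Illustrative values only: a 30 minute timeout, writes spread over
# 15 minutes, and 8 write threads. The socket path is a placeholder.
print(" ".join(rrdcached_command("unix:/var/run/rrdcached.sock",
                                 write_timeout=1800,
                                 write_delay=900,
                                 write_threads=8)))
```

A longer timeout means more updates are held in memory and lost if the daemon dies uncleanly, which is why I describe the more aggressive settings as risky.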
I will be doing more investigation on a distributed MRTG setup using the rrdcached, which I hope to be able to demonstrate at LISA10 this year. I will also be adding a section to the MRTG book on the new rrdcached features, once I've checked them through thoroughly.
All feedback welcome...
ITS Unix Services Design Lead
University of Auckland
Floor 2, 58 Symonds Street
09 3737599 ext 86487