[mrtg-developers] proposed mrtg performance improvements?
Dave Plonka
plonka at doit.wisc.edu
Thu Nov 20 18:49:42 MET 2003
MRTG developers,
I find myself trying to improve the performance of MRTG with very many
targets, and want to run some ideas/experiences by you before I do more
work on the code (using 2.10.5 right now):
0) In preface, I currently have about 33,000 targets in my ".cfg" file
and am running on an SMP Linux box with eight 700MHz PIII processors
with MRTG configured thusly:
LogFormat: rrdtool
RunAsDaemon: Yes
Interval: 5
Forks: 16
It works... almost. However, a cycle often takes more than five
minutes, so many of the data points in my RRD files are repeated
(because the heartbeat is at the default of 600 seconds).
My goal is to improve the situation so that I can double the number
of targets. (This is to collect the packets, bytes, and errors for
nearly every switch port on the campus.)
1) So far, the only change I have made to speed things up, when Forks
is used, is to use select(2) to process the results from each child
as soon as possible.
Previously, mrtg would process the children sequentially, i.e. it
would always wait for the first child, then the second, third, and
so on. I noticed that sometimes one child would take longer, and
therefore slow down the whole process unnecessarily.
This patch helped a little, and is only a bit of code.
I'll send a patch to Tobi.
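To illustrate the idea in (1), here is a minimal sketch in Python (not
the Perl patch itself). It assumes each child reports over its own pipe;
the name run_children and the per-child sleep delays are hypothetical
stand-ins for the real polling work. The parent uses select to handle
whichever child finishes first instead of waiting for them in order:

```python
import os
import select
import time


def run_children(delays):
    """Fork one child per delay; return child indexes in arrival order."""
    readers = {}  # read end of pipe -> child index
    pids = []
    for index, delay in enumerate(delays):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: pretend to poll devices, then report through the pipe.
            os.close(read_fd)
            time.sleep(delay)
            os.write(write_fd, f"child {index} done\n".encode())
            os._exit(0)
        # Parent: keep only the read end so EOF is seen when the child exits.
        os.close(write_fd)
        readers[read_fd] = index
        pids.append(pid)

    order = []
    while readers:
        # Block until at least one child has output (or has closed its pipe).
        ready, _, _ = select.select(list(readers), [], [])
        for fd in ready:
            data = os.read(fd, 4096)
            if data:
                order.append(readers[fd])  # handle this result right away
            else:
                os.close(fd)  # EOF: this child is finished
                del readers[fd]

    for pid in pids:
        os.waitpid(pid, 0)  # reap children to avoid zombies
    return order


if __name__ == "__main__":
    # The slow first child no longer holds up the other two.
    print(run_children([0.6, 0.0, 0.3]))
```

With sequential waiting, the results of children 1 and 2 would sit
unread until child 0 finished.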
2) I was disappointed to see that the Forks option only affects the
"readtargets" phase, not the "target loop" phase. I need to speed
up the "target loop" too.
I added some performance log messages using Benchmark.pm, and
currently get statistics like this:
--time: 2003/11/20 11:35:31 target loop took 235 wallclock secs (45.21 usr + 188.03 sys = 233.24 CPU)
--time: 2003/11/20 11:38:21 readtargets took 170 wallclock secs ( 1.68 usr 2.08 sys + 282.51 cusr 120.66 csys = 406.93 CPU)
--time: 2003/11/20 11:42:26 target loop took 245 wallclock secs (44.60 usr + 194.46 sys = 239.06 CPU)
--time: 2003/11/20 11:45:14 readtargets took 168 wallclock secs ( 1.76 usr 2.98 sys + 278.46 cusr 130.97 csys = 414.17 CPU)
As you can see, the target loop is taking too long. (One wants the
combination of "readtargets" and the "target loop" to take less than
300 wallclock seconds.)
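As a rough check of the figures above, this small Python sketch
averages the wallclock time of each phase and compares one full cycle
against the 300-second interval. The log strings are trimmed copies of
the lines quoted above; the names phase_averages and INTERVAL_SECS are
made up for this illustration:

```python
import re

# Wallclock figures copied from the log lines quoted above.
LOG_LINES = [
    "target loop took 235 wallclock secs",
    "readtargets took 170 wallclock secs",
    "target loop took 245 wallclock secs",
    "readtargets took 168 wallclock secs",
]
INTERVAL_SECS = 5 * 60  # "Interval: 5" means five minutes per cycle


def phase_averages(lines):
    """Return the mean wallclock seconds per phase name."""
    samples = {}
    pattern = re.compile(r"(target loop|readtargets) took (\d+) wallclock secs")
    for line in lines:
        match = pattern.search(line)
        if match:
            samples.setdefault(match.group(1), []).append(int(match.group(2)))
    return {phase: sum(v) / len(v) for phase, v in samples.items()}


averages = phase_averages(LOG_LINES)
cycle = sum(averages.values())   # one readtargets plus one target loop
overrun = cycle - INTERVAL_SECS  # positive means the cycle runs late

print(f"average cycle: {cycle:.0f} s, over budget by {overrun:.0f} s")
# prints: average cycle: 409 s, over budget by 109 s
```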
So, some of my options to improve MRTG's performance are:
A) use something like POE (http://poe.perl.org) to parallelize the
SNMP get requests (i.e. "getcurrent" in MRTG) so that multiple
requests can be in progress at a time, and RRD files can be updated
while we're waiting for SNMP get responses.
Simon Leinen suggested this idea to me and Todd Caine, author of
POE::Component::SNMP, has apparently done such a thing in a system
of his own.
This would require the use of Net::SNMP (because of its async API),
which can probably peacefully coexist with SNMP_Session, but is a
big code change.
B) parallelize the "getcurrent" calls by forking multiple children.
This is probably a fairly simple change to the MRTG code...
Ideally, the targets would be split up in such a way so that
each device would only be queried from one child.
C) parallelize both the "getcurrent" and "writegraphics" calls.
This would enable concurrent SNMP queries ("getcurrent") and RRD
file updates ("writegraphics") but may hit the machine too hard if
many child processes try to update files simultaneously.
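For option B, the "one device per child" split might look something
like the following Python sketch. This is only an illustration, not
MRTG code: it assumes each target is known as a (target name, device
host) pair, and the function name split_targets_by_device and the
greedy balancing rule are my own choices here:

```python
from collections import defaultdict


def split_targets_by_device(targets, n_children):
    """Split (target_name, device_host) pairs into n_children buckets.

    All targets of one device land in the same bucket, so each device
    is only ever queried by one child.
    """
    by_device = defaultdict(list)
    for name, host in targets:
        by_device[host].append(name)

    buckets = [[] for _ in range(n_children)]
    # Greedy balancing: place the largest devices first, each into the
    # bucket that currently holds the fewest targets.
    devices = sorted(by_device.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for host, names in devices:
        lightest = min(buckets, key=len)
        lightest.extend((name, host) for name in names)
    return buckets


if __name__ == "__main__":
    example = [
        ("sw1-port1", "sw1"), ("sw1-port2", "sw1"), ("sw1-port3", "sw1"),
        ("sw2-port1", "sw2"), ("sw2-port2", "sw2"),
        ("sw3-port1", "sw3"),
    ]
    for child, bucket in enumerate(split_targets_by_device(example, 2)):
        print(child, bucket)
```

Balancing by target count is a simplification; a slow or unreachable
device can still make one child lag behind the others.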
Anyway, I just wanted to see if the community knows of others who have
done this, or has substantive comments on these options.
Thanks,
Dave
--
plonka at doit.wisc.edu http://net.doit.wisc.edu/~plonka ARS:N9HZF Madison, WI