[mrtg-developers] proposed mrtg performance improvements?

Thu Nov 20 18:49:42 MET 2003

MRTG developers,

I find myself trying to improve the performance of MRTG with very many
targets, and want to run some ideas/experiences by you before I do more
work on the code (using 2.10.5 right now):

 0) In preface, I currently have about 33,000 targets in my ".cfg" file
    and am running on an SMP Linux box with eight 700MHz PIII processors
    with MRTG configured thusly:

       LogFormat: rrdtool
       RunAsDaemon: Yes
       Interval: 5
       Forks: 16

    It works... almost.  However, a cycle often takes more than five
    minutes, so many of the data points in my RRD files are duplicated
    (because heartbeat is at the default of 600 seconds).

    My goal is to improve the situation so that I can double the number
    of targets.  (This is to collect packets, bytes, and errors for
    nearly every switch port on campus.)

 1) So far, the only change I have made to speed things up, when Forks
    is used, is to use select(2) to process the results from each child
    as soon as possible.

    Previously, MRTG would process the children sequentially, i.e. it
    would always wait for the first child, then the second, third, and
    so on.  I noticed that sometimes one child would take longer, and
    therefore slow down the whole process unnecessarily.

    This patch helped a little, and is only a bit of code.
    I'll send a patch to Tobi.

 2) I was disappointed to see that the Forks option only affects the
    "readtargets" phase, not the "target loop" phase.  I need to speed
    up the "target loop" too.

    I added some performance log messages using Benchmark.pm, and
    currently get statistics like this:

       --time: 2003/11/20 11:35:31 target loop took 235 wallclock secs (45.21 usr + 188.03 sys = 233.24 CPU)
       --time: 2003/11/20 11:38:21 readtargets took 170 wallclock secs ( 1.68 usr  2.08 sys + 282.51 cusr 120.66 csys = 406.93 CPU)
       --time: 2003/11/20 11:42:26 target loop took 245 wallclock secs (44.60 usr + 194.46 sys = 239.06 CPU)
       --time: 2003/11/20 11:45:14 readtargets took 168 wallclock secs ( 1.76 usr  2.98 sys + 278.46 cusr 130.97 csys = 414.17 CPU)

    As you can see, the target loop is taking too long.  (One wants the
    combination of "readtargets" and the "target loop" to take less than
    300 wallclock seconds; above, each pair adds up to more than 400.)
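
For illustration, the select(2) change in item 1 can be sketched roughly
as follows.  This is a minimal Python sketch, not the actual MRTG patch
(MRTG is Perl); the function name run_children, the per-child delays,
and the "child N done" messages are all made up for the example.  Each
child stands in for one forked poller, and the parent collects output
from whichever child finishes first instead of waiting in fork order.

```python
import os
import select
import time


def run_children(delays):
    """Fork one child per delay; return their messages in completion order.

    Hypothetical stand-in for forked pollers: each child sleeps for
    `delay` seconds, writes one message to its pipe, and exits.
    """
    readers = {}  # read fd -> child pid
    for index, delay in enumerate(delays):
        read_fd, write_fd = os.pipe()
        pid = os.fork()
        if pid == 0:
            # Child: simulate polling work, report once, then exit
            # without running any of the parent's cleanup code.
            try:
                os.close(read_fd)
                time.sleep(delay)
                os.write(write_fd, f"child {index} done".encode())
            finally:
                os._exit(0)
        # Parent: keep only the read end of this child's pipe.
        os.close(write_fd)
        readers[read_fd] = pid

    buffers = {fd: b"" for fd in readers}
    results = []
    while readers:
        # Block until at least one pipe has data or has reached EOF.
        ready, _, _ = select.select(list(readers), [], [])
        for fd in ready:
            chunk = os.read(fd, 4096)
            if chunk:
                buffers[fd] += chunk
                continue
            # Empty read means EOF: this child has finished.
            os.close(fd)
            os.waitpid(readers.pop(fd), 0)
            results.append(buffers.pop(fd).decode())
    return results
```

Calling run_children([0.4, 0.0, 0.2]) returns all three messages, and a
slow first child no longer delays handling of the faster ones.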

So, some of my options to improve MRTG's performance are:

 A) use something like POE (http://poe.perl.org) to parallelize the
    SNMP get requests (i.e. "getcurrent" in MRTG) so that multiple
    requests can be in progress at a time, and RRD files can be updated
    while we're waiting for SNMP get responses.

    Simon Leinen suggested this idea to me, and Todd Caine, author of
    POE::Component::SNMP, has apparently done such a thing in a system
    of his own.

    This would require the use of Net::SNMP (because of its async API),
    which can probably peacefully coexist with SNMP_Session, but is a
    big code change.

 B) parallelize the "getcurrent" calls by forking multiple children.

    This is probably a fairly simple change to the MRTG code...
    Ideally, the targets would be split up in such a way so that
    each device would only be queried from one child.

 C) parallelize both the "getcurrent" and "writegraphics" calls.

    This would enable concurrent SNMP queries ("getcurrent") and RRD
    file updates ("writegraphics") but may hit the machine too hard if
    many child processes try to update files simultaneously.
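
As an illustration of the split mentioned under option B, here is one
possible way to divide targets among children so that each device is
only ever queried from one child.  It is a Python sketch under stated
assumptions, not MRTG code: it assumes each target can be reduced to a
(device, target name) pair, and the function name partition_by_device
and the greedy largest-device-first balancing are choices made for the
example.

```python
import heapq
from collections import defaultdict


def partition_by_device(targets, workers):
    """Split (device, name) pairs into `workers` buckets.

    All targets of one device land in the same bucket, and devices are
    assigned greedily (largest first) to the least-loaded bucket.
    """
    by_device = defaultdict(list)
    for device, name in targets:
        by_device[device].append(name)

    # Min-heap of (targets assigned so far, bucket index).
    heap = [(0, i) for i in range(workers)]
    heapq.heapify(heap)
    buckets = [[] for _ in range(workers)]

    # Largest devices first; ties broken by device name for determinism.
    ordered = sorted(by_device.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    for device, names in ordered:
        load, index = heapq.heappop(heap)
        buckets[index].extend((device, n) for n in names)
        heapq.heappush(heap, (load + len(names), index))
    return buckets
```

Each bucket would then be handed to one child.  Balancing by target
count is only a rough proxy for polling time, since a slow or
unreachable device can still dominate its bucket.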

Anyway, I just wanted to see whether the community knows of others who
have done this, or has substantive comments on these options.

Thanks,
Dave

-- 
plonka at doit.wisc.edu  http://net.doit.wisc.edu/~plonka  ARS:N9HZF  Madison, WI
