[mrtg-developers] Re: proposed mrtg performance improvements?

Fri Nov 21 00:37:19 MET 2003

Hi Tobi,

On Thu, Nov 20, 2003 at 09:39:33PM +0100, Tobias Oetiker wrote:
> Hi Dave,
> 
> your select patch sounds cool ...

OK, I'll prepare that patch and send it to you for your consideration.

> the other bits I am not quite sure about.
> Wouldn't it be enough to split your cfg file in 2 parts and
> run them in parallel (to cater for the needs of your smp system)

Ah, I should have mentioned I tried that - as an alternative to forks.

We wrote a utility to split the list of Targets from the one big ".cfg"
file into `n' parts.  Then I launched them (with RunAsDaemon) at equal
intervals throughout five minutes in an effort to have them stagger
their polling and disk i/o.  (A similar effect to what
gary.post at amd.com suggested in his follow-up.)
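
A minimal sketch of that split-and-stagger scheme, for illustration only.
The subroutine names (split_targets, stagger_offsets) and the target names
are hypothetical and are not taken from the utility described above; the
sketch only shows the arithmetic of dealing Targets into `n' parts and
spacing the launches evenly across the 300 second interval.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Deal targets round-robin into $n groups, one group per MRTG process.
sub split_targets {
    my ($n, @targets) = @_;
    my @parts = map { [] } 1 .. $n;
    my $i = 0;
    push @{ $parts[ $i++ % $n ] }, $_ for @targets;
    return @parts;
}

# Start offsets (in seconds) that spread $n launches evenly over $interval.
sub stagger_offsets {
    my ($n, $interval) = @_;
    return map { int($_ * $interval / $n) } 0 .. $n - 1;
}

my @targets = map "router$_", 1 .. 10;    # placeholder target names
my @parts   = split_targets(4, @targets);
my @offsets = stagger_offsets(4, 300);

for my $k (0 .. $#parts) {
    printf "part %d starts at +%3ds: %s\n",
        $k, $offsets[$k], join(' ', @{ $parts[$k] });
}
```

Each part would then be written to its own ".cfg" file and started with
RunAsDaemon after sleeping for its offset.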

What I noticed was that, for instance, 13 separate MRTG processes use
way more CPU than one MRTG with 16 forked children.  With 13 separate
mrtg processes, the load average would often go over 20 (when combined
with some other work on the 8 CPU machine), yet polling worked no better
than with one mrtg process, which raised the load average only to 3-5.
I think this was due, in part, to them all reading/writing RRD files at
the same time.

Perhaps a small number of separate MRTG processes (maybe 4; I don't
want them to take over all 8 processors) with many Forks per process
would do the trick.

> ... what we want to achieve is, that the cpu is working 100%, right ? ...

Not exactly.  What I want is to get each target polled within every 300
wallclock seconds, with the least CPU usage possible.

> The snmp acquisition phase has an enormous amount of
> latency, I added the forking there ... for the other bits I do not
> see where the latency comes in ...

But don't you consider "getcurrent" part of the "snmp acquisition phase"?
I agree with you that "readtargets" benefits greatly from parallel
handling with the forked children.

From my reading of the code, the mrtg pseudocode looks something like this:

  for (;;) { # ever
    readtargets();        # get the current state of the routers, uses Forks...
    foreach $target (@targets) {
      getcurrent();       # get the current values for the target
      writegraphics();    # update the RRD file for that target
    }
  }

And the "getcurrent" part is not parallelized currently.

I guess I just wanted to know if I'm missing some subtle detail about
why perhaps the getcurrent/writegraphics shouldn't be similarly
performed in parallel by the forked children.

Since there are these short waits for network i/o (between the getcurrent
SNMP gets and response) it seems that's the opportunity to interleave
some of the disk i/o.
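
A rough sketch of what that could look like, assuming each target can be
handled independently of the others.  The getcurrent and writegraphics
subroutines here are hypothetical stand-ins (a short sleep in place of the
SNMP round trip, a no-op in place of the RRD update), and
process_in_parallel is an invented name; none of this is MRTG's actual
code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the per-target steps; the real routines
# take different arguments and do real SNMP and RRD work.
sub getcurrent    { my ($target) = @_; select(undef, undef, undef, 0.05); return 42 }
sub writegraphics { my ($target, $value) = @_; return 1 }

# Process @targets with $forks children.  Child $k takes every
# $forks-th target starting at index $k, so each target (and its RRD
# file) is touched by exactly one child.  Returns the number of
# children that exited cleanly.
sub process_in_parallel {
    my ($forks, @targets) = @_;
    my @pids;
    for my $k (0 .. $forks - 1) {
        my $pid = fork();
        die "fork failed: $!" unless defined $pid;
        if ($pid == 0) {                                  # child
            for (my $i = $k; $i < @targets; $i += $forks) {
                my $value = getcurrent($targets[$i]);     # waits on the network
                writegraphics($targets[$i], $value);      # disk i/o
            }
            exit 0;
        }
        push @pids, $pid;                                 # parent
    }
    my $ok = 0;
    for my $pid (@pids) {
        waitpid($pid, 0);
        $ok++ if $? == 0;
    }
    return $ok;
}

my $ok = process_in_parallel(4, map "router$_", 1 .. 16);
print "$ok of 4 children finished cleanly\n";
```

While one child is blocked waiting for an SNMP response, another can be
writing its RRD file, which is the interleaving described above.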

> Have you used the perl profiler on the code, this might reveal
> where the hotspots are in the code ...

No, I might try that.

Another thing I'm wondering about is trying to avoid the often
unnecessary RRDs::tune and "threscheck" calls.  Perhaps the mrtg
process could remember the previous tune that it performed so as
not to do it repeatedly in RunAsDaemon mode.
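
One way to sketch that idea: keep a hash in the long-running process,
keyed by RRD file, of the arguments last passed to tune, and skip the call
when they have not changed.  The name tune_if_changed, the callback
interface, and the example arguments are all assumptions made for
illustration; a stand-in callback is used here so the sketch runs without
the RRDs module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my %last_tune;   # rrd file name => tune arguments applied last time

# Call $tune_cb->($rrd, @args) only when @args differ from the arguments
# remembered for $rrd.  Returns 1 if the callback ran, 0 if it was skipped.
sub tune_if_changed {
    my ($tune_cb, $rrd, @args) = @_;
    my $key = join "\0", @args;
    return 0 if defined $last_tune{$rrd} && $last_tune{$rrd} eq $key;
    $tune_cb->($rrd, @args);
    $last_tune{$rrd} = $key;
    return 1;
}

my $calls = 0;
my $fake_tune = sub { $calls++ };   # stand-in; a real caller might wrap RRDs::tune

for my $cycle (1 .. 3) {            # three polling cycles, same settings
    tune_if_changed($fake_tune, 'router1.rrd', '-a', 'ds0:1250000');
}
tune_if_changed($fake_tune, 'router1.rrd', '-a', 'ds0:12500000');   # setting changed
print "tune ran $calls times\n";
```

The cache lives only in memory, so it helps only in RunAsDaemon mode, and
it assumes nothing else tunes the RRD files behind the process's back.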

Thanks,
Dave

-- 
plonka at doit.wisc.edu  http://net.doit.wisc.edu/~plonka  ARS:N9HZF  Madison, WI
