[mrtg] Re: provisioning question for a new MRTG server

Greg.Volk at edwardjones.com Greg.Volk at edwardjones.com
Wed Jun 5 22:33:47 MEST 2002


> 
> I've looked for documentation on sizing MRTG w/RRDTOOL on 
> the web but haven't come up with anything so far.  I 
> believe that the largest limitation is disk IO because 
> you have to read in, then write out each rrd file.
> 
> Opinions based on real experience are appreciated.  Pointers 
> to URLs where this has been discussed before are welcome also.
> 

Recently I was told that I needed to migrate my MRTG data
collection and graph generation off of Linux on x86 and onto
Solaris on SPARC. I was given a dual 450 MHz Sun 220R with
2 GB of memory and a couple of 10k rpm drives. Just for kicks,
before it was officially collecting production data and while
the old system was still collecting, I modified my nightly
auto-config scripts to call cfgmaker with the --no-down
option. This caused all ports on all switches, regardless of
admin or oper status, to be added to the config files for
five-minute polling. With 295 mrtg daemons running, polling a
total of 18,084 targets, I began to experience slow response
in my telnet sessions to the Sun box. The box had exhausted
its 2 GB of RAM and had used half of its 3 GB of available
swap space. The 15-minute load average was between 15 and 20
for the 24 hours that the --no-down config files were used.
By far the biggest problem was graph generation. I use
14all.cgi, and even though the average config file had only
61 targets, waiting for that machine to push out 61 index
graphs was very tiresome - it's slow at graph generation even
when it isn't loaded, but with that many five-minute targets
the index graphs were painfully slow.
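For the curious, a minimal sketch of what a nightly auto-config
rebuild like that might look like (the hostnames, paths, and
community string here are assumptions, not from my setup; the
script prints the commands rather than running them, so drop the
echo to actually execute):

```shell
#!/bin/sh
# Hypothetical nightly config rebuild: one config file per device.
# --no-down tells cfgmaker to include every port regardless of
# admin/oper status -- this is what ballooned the target count.
CFGDIR=/var/mrtg/cfg
SWITCHES="switch-a switch-b switch-c"

for sw in $SWITCHES; do
  # Build the cfgmaker command line for this device.
  cmd="cfgmaker --no-down --output $CFGDIR/$sw.cfg public@$sw"
  echo "$cmd"
done
```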

All in all, it was a good test, and most importantly, I didn't
run into any issues like taking more than 300 seconds to
complete a one-config-file poll cycle. Things probably could
have been optimized further, such as reducing the number of
daemons and increasing the number of targets per daemon, but I
prefer to run one daemon per large device (Cisco 6509) for
reliability and manageability reasons.
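The per-device-daemon setup just means each generated config file
carries its own daemon globals, roughly like this (paths are
assumptions; WorkDir, RunAsDaemon, and Interval are standard MRTG
global keywords):

```
# Hypothetical header of one per-device config file.
# RunAsDaemon makes this mrtg instance fork and poll on its own
# every Interval minutes, independent of the other devices.
WorkDir: /var/mrtg/switch-a
RunAsDaemon: Yes
Interval: 5
```

If one device's daemon wedges or a device falls over, the other
294 keep polling, which is the reliability win.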

As I said above, the biggest drawback was (and still is) graph
generation. Those two 450 MHz processors just can't seem to pop
out graphs the way I've seen dual 1.5 GHz Athlon systems pump
them out. However, I don't know how a comparably equipped
Athlon system would do when given the task of polling 18,084
five-minute targets via 295 daemons.


--
Unsubscribe mailto:mrtg-request at list.ee.ethz.ch?subject=unsubscribe
Archive     http://www.ee.ethz.ch/~slist/mrtg
FAQ         http://faq.mrtg.org    Homepage     http://www.mrtg.org
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi


