[mrtg] Re: forking details

Shipley, Rob rshipley at state.mt.us
Thu Nov 15 00:36:22 MET 2001


I have no clue on forks.
You could separate the switches into different config files, but you could
still exceed the polling cycle if you poll a lot of ports on the same switch
in one config file. You should possibly lower the timeout and retries for
your targets, so that when they are unavailable MRTG moves on to the next
target faster. (Using the defaults: 50 ports x (2 sec timeout x 5 retries)
= 500 seconds -- ouch!!)
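For what it's worth, timeout and retries can be set per target via the SNMP
options appended to the Target line; the hostname and community string below
are just placeholders:

```
# Target syntax: port:community@router[:port[:timeout[:retries[:backoff[:version]]]]]
# Here: SNMP on port 161, 1 second timeout, 1 retry instead of the defaults.
Target[bigswitch-p1]: 1:public@bigswitch.example.com:161:1:1
```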

-----Original Message-----
From: Greg.Volk at edwardjones.com [mailto:Greg.Volk at edwardjones.com]
Sent: Wednesday, November 14, 2001 10:01 AM
To: mrtg at list.ee.ethz.ch
Subject: [mrtg] forking details



I have a config file with 1600 targets in it. I'm running with 
logformat rrdtool, and I have daemon mode turned on. 

Everything works great until a big device goes unreachable. When
this happens, the sum of all the timeouts on all the interfaces
of that one device ends up pushing the total mrtg query runtime
cycle beyond 300 seconds. At that point, all 1600 of the RRDs
stop getting regular updates. Normally this is not a problem, because
these huge switches are (hopefully) on-line and reachable. However,
when our Saturday outage window rolls around every week, I end up
missing two or three hours of data for all 1600 targets because
one large device may be down for maintenance. In an effort to
remedy the situation, I enabled 4 forks in the config file. This
helps a little, but not as much as I had hoped. When the daemon
kicks off every five minutes, I see the four forks, but they seem
to finish up very quickly, and then the main process hangs around
for quite some time doing the rest of the queries - I see lots of
snmp traffic via tcpdump, but only one mrtg process is still
displayed by top. Once all the queries are finished,
the main process goes to sleep, and five minutes later it kicks
off the four short-lived forks, and the cycle continues. So forking
only helps me marginally with this problem. Since there is still
one mrtg process doing the vast majority of the queries, if I have
a big device that's down, the rest of the queries that the main
process is responsible for usually end up exceeding the data
collection interval.

My questions are as follows:

How does mrtg decide how many targets each fork takes care of?
In the beginning, I assumed if 4 forks were specified, each fork
would take 1/4 of the targets in the config file, but this does
not seem to be the case.

Can the targets per fork be specified?

I'm tempted to increase the fork directive, but this is only a 
single proc PIII-450 with 64megs of memory and I'm worried that
I'll either send it into swap fits, or end up spending so much 
time context switching that the query cycle time will begin to 
suffer.
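For context, the global directives involved look roughly like this in the
.cfg (the values here are illustrative, not a recommendation):

```
# Global options section of the mrtg .cfg
RunAsDaemon: Yes
Interval: 5           # polling interval in minutes
LogFormat: rrdtool
Forks: 4              # number of child processes used for SNMP polling
```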

Has anyone else run into this issue? 



--
Unsubscribe mailto:mrtg-request at list.ee.ethz.ch?subject=unsubscribe
Archive     http://www.ee.ethz.ch/~slist/mrtg
FAQ         http://faq.mrtg.org    Homepage     http://www.mrtg.org
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi
