[mrtg] Re: Lockfiles and program flow
Daniel J McDonald
dan.mcdonald at austinenergy.com
Mon Oct 10 15:38:01 MEST 2005
On Mon, 2005-10-10 at 07:23 -0500, Volk,Gregory B wrote:
> > Mrtg 2.10.15 running on Sun V440 on Solaris 8 using rrdtool 1.0.50
> >
> > Two lockfiles are created, $lockfile and $templock. $templock disappears
> > early on. I haven't figured out exactly when, but by the time the SNMP
> > metrics are being collected it seems to be gone. Is this by design?
> > $lockfile disappears when "Remove lock files" shows up in the log (with
> > --debug=base), once the entire run has completed.
> >
> > How can I tell if my runs are completing within 15 minutes? What is going
> > on here?
> >
I've seen the same thing. With a 30-second default timeout, a single
large switch being down is enough to make the whole instance time out.
My usual workaround is to lower the timeout to 3 seconds, but that
doesn't always solve the issue either.
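
A back-of-the-envelope sketch of why one dead switch can exceed the polling interval, and why a 3-second timeout does not always rescue it. Everything here is an assumption made for illustration: the 48-port switch, the idea that every attempt against a dead target waits the full timeout, and the even split of targets across forks. It is not a description of how mrtg actually schedules its polls.

```python
def worst_case_seconds(dead_targets, timeout_s, attempts=1, forks=1):
    """Upper bound on time spent waiting on unreachable targets.

    Assumptions (hypothetical, for illustration only):
      - every attempt against a dead target waits the full timeout
      - dead targets are split evenly across the forks
    """
    per_fork = -(-dead_targets // forks)  # ceiling division
    return per_fork * attempts * timeout_s


INTERVAL_S = 300  # the 5-minute cron interval

# A hypothetical 48-port switch that has gone down.
for timeout_s, attempts in ((30, 1), (3, 1), (3, 5)):
    wait = worst_case_seconds(48, timeout_s, attempts=attempts)
    print(f"timeout={timeout_s}s attempts={attempts}: "
          f"{wait}s waiting, fits in interval: {wait < INTERVAL_S}")
```

Under these assumptions a 30-second timeout costs 1440 seconds for the one switch, a 3-second timeout costs 144 seconds, and a 3-second timeout with five attempts per target is back up to 720 seconds, which is again longer than the interval.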

I've considered ripping out the whole polling engine and rewriting it
to use SNMP GetBulk requests. Separately, I've long suspected that the
locking mechanism isn't working correctly, and your experience seems to
bear that out.
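
For reference, a common Unix locking pattern uses exactly two files: a temporary file is written, hard-linked to the real lock name (the link either succeeds atomically or fails because the lock exists), and then removed. That would be consistent with $templock vanishing early while $lockfile lasts for the whole run, but whether mrtg does precisely this is an assumption here, not something confirmed in this thread. The function name, arguments, and stale-lock threshold below are all hypothetical.

```python
import os
import time


def acquire_lock(lockfile, templock, stale_after_s):
    """Try to take the lock. Return True on success, False if a live lock exists.

    Sketch of the temp-file-plus-hard-link pattern. The temp file exists only
    long enough to be linked, so it disappears as soon as this returns.
    """
    with open(templock, "w") as fh:
        fh.write(str(os.getpid()))
    try:
        for _ in range(2):  # one retry after clearing a stale lock
            try:
                os.link(templock, lockfile)  # atomic: fails if lockfile exists
                return True
            except FileExistsError:
                pass
            try:
                age = time.time() - os.stat(lockfile).st_mtime
            except FileNotFoundError:
                continue  # holder released the lock meanwhile; try again
            if age <= stale_after_s:
                return False  # another run is presumably still active
            # Stale lock: remove it. Two processes doing this at the same
            # moment can still race; a sketch, not a hardened implementation.
            try:
                os.unlink(lockfile)
            except FileNotFoundError:
                pass
        return False
    finally:
        os.unlink(templock)  # the temp file is always removed


def release_lock(lockfile):
    """Remove the lock at the end of a run."""
    os.unlink(lockfile)
```

With a scheme like this, a run that hangs never reaches the release step, so the next cron invocation sees a lock that is either refused (still fresh) or cleared as stale, depending on the threshold.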

My environment is a single instance with about 5000 targets, run from
cron every 5 minutes. When everything is up, a run takes about 1.5
minutes. When a router somewhere goes down, it often just hangs. The
dead host detection doesn't seem to work. I think that's a fork problem:
one fork detects that a host is dead, but the other forks don't learn
about it and keep trying to poll it.
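
On the question of telling whether runs finish inside the interval: one option that does not depend on the lock files is to launch the run through a small wrapper that times it. A minimal sketch, assuming a hypothetical Python wrapper called from cron; the command and the interval are placeholders. Note the design choice that a run exceeding the interval is killed, which may or may not be what you want.

```python
import subprocess
import time


def timed_run(cmd, interval_s):
    """Run cmd and return (exit_code, elapsed_seconds, overran).

    If the command is still running after interval_s seconds it is killed,
    exit_code is None, and overran is True.
    """
    start = time.monotonic()
    try:
        code = subprocess.run(cmd, timeout=interval_s).returncode
    except subprocess.TimeoutExpired:
        code = None  # subprocess.run has already killed the child
    elapsed = time.monotonic() - start
    return code, elapsed, code is None


# Hypothetical usage from cron (paths are placeholders):
#   code, elapsed, overran = timed_run(["/path/to/mrtg", "/path/to/mrtg.cfg"], 300)
#   then log elapsed, and alert when overran is True
```

Logging the elapsed time on every run also shows the trend, so a run creeping from 1.5 minutes toward the interval is visible before it starts overlapping the next one.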
--
Daniel J McDonald, CCIE # 2495, CNX, CISSP # 78281
Austin Energy
dan.mcdonald at austinenergy.com