[mrtg] Re: Lockfiles and program flow

Daniel J McDonald dan.mcdonald at austinenergy.com
Mon Oct 10 15:38:01 MEST 2005


On Mon, 2005-10-10 at 07:23 -0500, Volk,Gregory B wrote:
> > MRTG 2.10.15 running on a Sun V440 on Solaris 8, using rrdtool 1.0.50.
> >
> > Two lockfiles are created, $lockfile and $templock. $templock disappears
> > early on. I haven't figured out exactly when, but by the time the SNMP
> > metrics are being collected it seems to be gone. Is this by design?
> > $lockfile disappears when "Remove lock files" shows up in the log (with
> > --debug=base), once the entire run has completed.
> >
> > How can I tell if my runs are completing within 15 minutes? What is
> > going on here?
> > 

I've experienced the same thing: with a 30-second default timeout, if
any large switch is down, the whole instance times out.  My usual fix
is to lower the timeout to 3 seconds, but that doesn't always solve the
problem either.
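
To see why one unreachable device can stall a whole run, it helps to add
up the time spent waiting on it.  The sketch below is back-of-the-envelope
arithmetic only.  The function name, the port count, the single attempt per
target, and the assumption that every query to the dead device waits out
the full timeout are illustrative and are not taken from the MRTG source.

```python
def worst_case_wait_seconds(dead_targets, timeout_s, attempts):
    """Upper bound on time spent waiting for targets that never answer.

    Assumes (hypothetically) that the targets are polled one after another
    and that every attempt waits for the full timeout with no backoff.
    """
    return dead_targets * timeout_s * attempts


# 48 ports on one dead switch, one attempt per port:
print(worst_case_wait_seconds(48, 30, 1))  # 1440
print(worst_case_wait_seconds(48, 3, 1))   # 144
```

Under those assumptions a 48-port switch costs 24 minutes of waiting at 30
seconds per query, well past a 5-minute cron interval, and still more than
two minutes at 3 seconds.  That fits the observation that a shorter
timeout helps but is not always enough.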

I've considered ripping the whole polling engine out and rewriting it
to use SNMP bulk gets (snmpbulkget).  But I've long thought that the
locking mechanism wasn't working correctly, and your experience seems to
bear that out.
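
For comparison, here is a minimal sketch of what a cron-driven lockfile
usually looks like: atomic creation, plus a staleness check based on the
file's age.  It is not MRTG's implementation.  The function names, the
lock path, and the `stale_after_s` parameter are hypothetical.

```python
import os
import time


def acquire_lock(path, stale_after_s):
    """Return True if this run now holds the lock, False if another run does."""
    for _ in range(2):  # a second pass happens only after a retry condition
        try:
            # O_EXCL makes creation atomic: only one process can succeed.
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        except FileExistsError:
            try:
                age = time.time() - os.stat(path).st_mtime
            except FileNotFoundError:
                continue  # the holder released it in the meantime; retry
            if age < stale_after_s:
                return False  # previous run is still within its time budget
            try:
                os.unlink(path)  # holder overran; treat the lock as stale
            except FileNotFoundError:
                pass
            continue
        with os.fdopen(fd, "w") as handle:
            handle.write(str(os.getpid()))  # record the owner for debugging
        return True
    return False


def release_lock(path):
    """Remove the lock at the end of a run."""
    os.unlink(path)
```

With `stale_after_s=900`, the lock's age also answers the 15-minute
question: a lock older than that means the previous run did not finish in
time.  Removing a stale lock is itself racy when several runs start at
once, so treat this as an illustration rather than a hardened design.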

My environment is a single instance with about 5000 targets, run from
cron every 5 minutes.  Assuming everything is up, it runs in about 1.5
minutes.  When a router somewhere goes down, it often just hangs.  The
dead host detection doesn't seem to work.  I think that's a fork
problem: one fork detects that a host is dead, but the other forks
don't know about it and keep trying to poll that host's targets.
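
One way around that suspected fork problem, sketched here as an
alternative design and not as a description of how MRTG distributes its
work, is to give all targets of one host to the same worker.  Each worker
can then keep its own dead-host list without sharing state.  All names
below (`partition_by_host`, `poll_bucket`, the `poll` callback, the
`(host, target)` pairs) are hypothetical.

```python
from collections import defaultdict


def partition_by_host(targets, workers):
    """Split (host, name) pairs into buckets; a host never spans two buckets."""
    by_host = defaultdict(list)
    for host, name in targets:
        by_host[host].append(name)

    buckets = [[] for _ in range(workers)]
    # Place the largest hosts first, always into the currently lightest bucket.
    for host in sorted(by_host, key=lambda h: (-len(by_host[h]), h)):
        lightest = min(buckets, key=len)
        lightest.extend((host, name) for name in by_host[host])
    return buckets


def poll_bucket(bucket, poll):
    """Poll one worker's targets, skipping a host after its first timeout."""
    dead = set()
    results = {}
    for host, name in bucket:
        if host in dead:
            continue  # already known dead in this worker; do not wait again
        try:
            results[(host, name)] = poll(host, name)
        except TimeoutError:
            dead.add(host)
    return results, dead
```

With this layout a dead switch costs one timeout in one worker instead of
one timeout per target spread across every fork.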


-- 
Daniel J McDonald, CCIE # 2495, CNX, CISSP # 78281
Austin Energy

dan.mcdonald at austinenergy.com
