[mrtg] Re: Lockfiles and program flow

Tue Oct 11 15:41:32 MEST 2005

Actually, we have Hitachi DASD readily available -- an advantage to working
in a large organization like the US federal government :-) -- and moved all
60,000 files over there.  We are running over 4 load-balancing fiber
channels and the I/O hurdle has now been overcome.  But I'm still wondering
why $templock disappears.

I've done all the things you suggest below (turn on debugging, run in daemon
mode, etc.) and the answer to the $templock issue still eludes me.  But
since the I/O problem has been resolved, I know the runs are completing in
time.

I suppose I don't have a problem "per se" anymore, but if Tobi or anyone
else can shed light on the $templock thing I'd be interested to hear it.

Tanya Ruttenberg - RSIS Contractor
OTSO/DNE/NMPEB
tanya.ruttenberg at ssa.gov
410-965-9605

-----Original Message-----
From: Volk,Gregory B [mailto:greg.volk at edwardjones.com] 
Sent: Monday, October 10, 2005 8:23 AM
To: Ruttenberg, Tanya; mrtg at list.ee.ethz.ch
Subject: RE: [mrtg] Lockfiles and program flow

> Mrtg 2.10.15 running on Sun V440 on Solaris 8 using rrdtool 1.0.50
>
> 15 instances of mrtg are run from cron, each at a 15 minute interval
> and staggered one minute apart. So there is an instance of mrtg
> starting every minute.
> 
> Total number of targets almost 60,000
> 
> **Serious** I/O problems, which we are working to resolve by adding 
> hardware, but in the meantime management wants us to try to "make it 
> work" on the 2 striped disks we have.
> 

Wow, I thought I was the only person running MRTG on a big, overpriced, slow
Sun box these days. ;) I poll 25,060 two-variable targets on a five minute
interval. I do this on top of Solaris 8 on V440 hardware (4 procs, 8 gigs)
but I store all the RRD files on EMC dasd. Prior to moving the WorkDir to
the EMC my system was falling over because it was always so I/O bound. I
know EMC is a bitter, expensive pill to swallow, but without it, the
processing horsepower of a V440 pretty much goes to waste (at least in
my case it does).

Your 15 config files must be gargantuan! I spread out my 25,000 targets
among 80 MRTG invocations running in daemon mode.

If you can't swing the dollars for an EMC frame, you might want to look into
a RAM filesystem. If the V440 can accommodate enough RAM, putting your
WorkDir on a RAM filesystem would get around the disk I/O issues you have,
but of course you would then have volatility issues in the event of a
crash. :(
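
One way to soften the volatility problem is to copy the RRD files from the
RAM filesystem back to real disk every so often. The following is only a
minimal sketch of that idea, not anything MRTG itself provides: the function
name, the "*.rrd" pattern and both directory arguments are assumptions made
for illustration. Note that with tens of thousands of files the copy is
itself disk I/O, so it would need to run far less often than the polling.

```python
import shutil
from pathlib import Path


def sync_rrd_files(ram_dir, disk_dir, pattern="*.rrd"):
    """Copy files matching pattern from ram_dir to disk_dir when the RAM
    copy is newer than the disk copy (or the disk copy is missing).
    Returns the names of the files that were copied."""
    ram_dir, disk_dir = Path(ram_dir), Path(disk_dir)
    disk_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(ram_dir.glob(pattern)):
        dst = disk_dir / src.name
        if not dst.exists() or src.stat().st_mtime_ns > dst.stat().st_mtime_ns:
            # Copy under a temporary name, then rename, so a crash in the
            # middle of a copy never leaves a truncated file under the
            # real name.
            tmp = dst.with_name(dst.name + ".tmp")
            shutil.copy2(src, tmp)  # copy2 keeps the source's mtime
            tmp.replace(dst)
            copied.append(src.name)
    return copied
```

Run from cron (say hourly), this bounds the data lost in a crash to whatever
was collected since the last copy.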

> 
> Two lockfiles are created, $lockfile and $templock. $templock 
> disappears early on--haven't figured out when exactly, but by the time 
> the SNMP metrics are being collected it seems to be gone.  Is this by 
> design? $lockfile disappears when "Remove lock files" shows up in the
> log (with --debug=base), once the entire run has completed.
>
> How can I tell if my runs are completing within 15 minutes? What is 
> going on here?
> 
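
Since $lockfile is described above as staying in place until the run has
completed, one rough check is to look at how old the lock file is. This is
only a sketch under an assumption that has not been checked against the MRTG
source: that the lock file's modification time is close to the start of the
run. The function names are made up for illustration, and the lock file path
has to be supplied by the caller.

```python
import os
import time


def lockfile_age(path, now=None):
    """Return the age of the lock file in seconds, or None if it is absent."""
    try:
        mtime = os.stat(path).st_mtime
    except FileNotFoundError:
        return None
    if now is None:
        now = time.time()
    return now - mtime


def run_overdue(path, interval_seconds, now=None):
    """True if a lock file exists and is older than the polling interval."""
    age = lockfile_age(path, now)
    return age is not None and age > interval_seconds
```

A lock file older than 900 seconds would suggest a run that has not finished
inside its 15 minute window.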

I'm not sure what $templock is, but you might want to turn on logging via
the...
--logging mrtg.log
...command line directive, but I don't know if that command option works
outside of daemon mode. I'm thinking that debugging the base set will not
tell you whether or not you're meeting the time limits.

If --logging doesn't tell you anything in cron mode, try running one of your
15 config files in Daemon mode by putting RunAsDaemon:Yes at the top, and
then turn on logging via the command line. I am certain that an 'interval
exceeded' message will appear in the log file if the poll cycle takes too
long - I've seen it happen in my own stuff in the past.
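
If the log turns out not to say anything useful in cron mode, another option
is simply to time each run from the outside. The wrapper below is a minimal
sketch of that: it runs whatever command it is given and compares the
wall-clock time with the interval. The function name is made up, and the
command you pass (for example the mrtg invocation from your crontab) is up
to you; nothing here depends on MRTG internals.

```python
import subprocess
import time


def timed_run(cmd, interval_seconds):
    """Run cmd (a list of arguments), wait for it to finish, and return
    (exit_code, elapsed_seconds, exceeded_interval)."""
    start = time.monotonic()
    result = subprocess.run(cmd)
    elapsed = time.monotonic() - start
    return result.returncode, elapsed, elapsed > interval_seconds
```

Called with your usual command line and 15 * 60 as the interval, the third
value returned tells you whether that run overran its slot.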

Good luck.
