[mrtg] Re: Lockfiles and program flow
greg.volk at edwardjones.com
Mon Oct 10 14:23:08 MEST 2005
> Mrtg 2.10.15 running on Sun V440 on Solaris 8 using rrdtool 1.0.50
> 15 instances of mrtg are running from cron once a minute at a
> 15 minute interval. So there is an instance of mrtg starting every minute.
> Total number of targets almost 60,000
> **Serious** I/O problems, which we are working on to resolve by
> adding hardware, but in the meantime management wants us to try to
> "make it work" on the 2 striped disks we have.
Wow, I thought I was the only person running MRTG on a big,
overpriced, slow Sun box these days. ;) I poll 25,060 two
variable targets on a five minute interval. I do this on top
of Solaris 8 on V440 hardware (4 procs, 8 gigs) but I store all
the RRD files on EMC dasd. Prior to moving the WorkDir to the
EMC, my system was falling over because it was always so heavily
I/O bound. I know EMC is a bitter, expensive pill to swallow,
but without it, the processing horsepower of a V440 pretty
much goes to waste (at least in my case it does).
Your 15 config files must be gargantuan! I spread out my 25,000
targets among 80 MRTG invocations running in daemon mode.
If you can't swing the dollars for an EMC frame, you might want
to look into a RAM filesystem. If the V440 can accommodate
enough RAM, you could put your WorkDir on a RAM filesystem,
which would get around the disk I/O issues you have. Of course,
then you have volatility issues in the event of
a crash. :(
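
One way to soften the volatility problem is to copy the RRD files from
the RAM filesystem back to ordinary disk every so often. The following
is only a minimal sketch of that idea in Python, not something MRTG
provides: the function name and both directory arguments are
hypothetical, and it assumes the copy runs between poll cycles so that
no RRD file is being written while it is copied.

```python
import shutil
from pathlib import Path


def snapshot_workdir(ram_dir, disk_dir):
    """Copy every *.rrd file under ram_dir into disk_dir.

    Both paths are placeholders for illustration: ram_dir stands for a
    WorkDir on a RAM filesystem, disk_dir for a directory on real disk.
    Returns the number of files copied.
    """
    ram_dir, disk_dir = Path(ram_dir), Path(disk_dir)
    copied = 0
    for src in ram_dir.rglob("*.rrd"):
        dst = disk_dir / src.relative_to(ram_dir)
        dst.parent.mkdir(parents=True, exist_ok=True)
        # Copy under a temporary name, then rename, so an interrupted
        # copy never leaves a half-written file under the real name.
        tmp = dst.with_name(dst.name + ".tmp")
        shutil.copy2(src, tmp)
        tmp.replace(dst)
        copied += 1
    return copied
```

Run from cron at whatever interval you can afford to lose, this bounds
the data lost in a crash to one snapshot period. After a reboot the
files would have to be copied back into the RAM filesystem before MRTG
starts.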
> Two lockfiles are created, $lockfile and $templock. $templock
> disappears early on--haven't figured out when exactly, but by the
> time the SNMP metrics are being collected it seems to be gone. Is
> this by design? $lockfile disappears when "Remove lock files" shows
> up in log (with --debug=base) -- once the entire run has completed.
> How can I tell if my runs are completing within 15 minutes? What
> is going on here?
I'm not sure what $templock is, but you might want to turn
on logging via the...
...command line directive, but I don't know if that command
option works outside of daemon mode. I'm thinking that
debugging the base set will not tell you whether or not
you're meeting the time limits.
If --logging doesn't tell you anything in cron mode, try
running one of your 15 config files in Daemon mode by putting
RunAsDaemon:Yes at the top, and then turn on logging via the
command line. I am certain that an 'interval exceeded' message
will appear in the log file if the poll cycle takes too
long - I've seen it happen in my own stuff in the past.
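
Independently of MRTG's own logging, a cron-mode run can be timed from
the outside by wrapping the command and comparing the elapsed time
against the interval. The following is a rough sketch in Python under
that assumption; the function name is made up for illustration, and the
command you pass in (the path to mrtg and to one of the config files)
is whatever your cron entry already uses.

```python
import subprocess
import time


def timed_run(cmd, interval_seconds):
    """Run cmd and report how long it took.

    cmd is an argument list such as the mrtg command line from a cron
    entry (hypothetical here). Returns (elapsed_seconds, exceeded),
    where exceeded is True if the run took longer than interval_seconds.
    """
    start = time.monotonic()
    # check=False: a non-zero exit status is not treated as an error,
    # since only the duration is of interest here.
    subprocess.run(cmd, check=False)
    elapsed = time.monotonic() - start
    return elapsed, elapsed > interval_seconds
```

For a 15 minute interval you would pass 15 * 60 as interval_seconds and
write the result to a log of your own. This only measures wall-clock
time for the whole run; it says nothing about which targets are slow.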