[mrtg] MRTG "flat-line" after a while of service outage
amos.shapira at gmail.com
Fri Apr 13 01:28:55 CEST 2007
On 13/04/07, Eric Brander <mailinglists at rednarb.com> wrote:
First - thanks for taking the time to answer my questions.
On 4/12/07, Amos Shapira <amos.shapira at gmail.com> wrote:
> > Hello,
> > I've set up MRTG to graph the throughput of some internal software using an
> > external program and it worked well for a few weeks.
> > Yesterday we took down all the monitored services for a major database
> > overhaul and since then MRTG records the service's throughput as zero even
> > though the external program continues to show increasing counters of work
> > done.
> The unknaszero option handles this.
It's not set in my configuration.
> > For instance, running the external program outputs:
> > 5990344
> > 5990344
> > 0
> > And a few seconds later it will show:
> > 5990414
> > 5990414
> > 0
Here are new samples, taken exactly 10 seconds apart:
The difference is 61, which suggests to me 6.1 per second.
> > Nothing has changed in the program or the MRTG configuration during that
> > time.
> > Here is the relevant part from that counter's config file:
> > Target[total]: `total-ti-counter /mnt/monitored/*/total.txt`
> > MaxBytes[total]: 50
> > Directory[total]: total
> > Title[total]: Total
> There should be a lot more relevant info in your config file. This doesn't
> show the whole picture.
> > MaxBytes is way more than the expected throughput (the throughput based on
> > the numbers above is 7.8 per second) but even setting it to 1000
> > instead of 50 doesn't change the situation.
> > What could be wrong?
> What does your .log file look like? Does MRTG error when it runs against
> this config and if so what is the error? What polling interval are you
> running? Any options such as perminute, perhour, etc. Bits or Bytes?
Here is the top of the log file:
1176419402 6585458 6585458
1176419402 0 0 0 0
After that the entries are zeros all the way back to just before the monitored
services were taken down, where the log shows normal-looking numbers.
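
For anyone reading along, the log layout can be checked mechanically. Below is
a minimal Python sketch, assuming the usual MRTG log layout: the first line
holds a timestamp plus the two raw counters, and each following line holds a
timestamp, average in, average out, max in and max out. The type and function
names are made up for illustration and are not part of MRTG.

```python
from typing import List, NamedTuple, Tuple


class LogHeader(NamedTuple):
    """First line of the log: last update time and the two raw counters."""
    timestamp: int
    in_counter: int
    out_counter: int


class LogRow(NamedTuple):
    """Every later line: a timestamp plus average and maximum rates."""
    timestamp: int
    avg_in: int
    avg_out: int
    max_in: int
    max_out: int


def parse_mrtg_log(text: str) -> Tuple[LogHeader, List[LogRow]]:
    """Split the log text into its header line and its data rows."""
    fields = [ln.split() for ln in text.splitlines() if ln.strip()]
    header = LogHeader(*map(int, fields[0]))
    rows = [LogRow(*map(int, f)) for f in fields[1:]]
    return header, rows


def count_leading_zero_rows(rows: List[LogRow]) -> int:
    """Count how many of the newest rows recorded a zero rate."""
    n = 0
    for row in rows:
        if row.avg_in == 0 and row.avg_out == 0:
            n += 1
        else:
            break
    return n


# The two lines quoted above from the top of the log file.
sample = """1176419402 6585458 6585458
1176419402 0 0 0 0
"""
header, rows = parse_mrtg_log(sample)
print(header.in_counter, count_leading_zero_rows(rows))
```

Run against the full log, the zero-row count would show how far back the flat
line reaches while the raw counter in the header keeps growing.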
I executed MRTG manually many times, also with --debug. No errors are given.
The polling interval is set to 5 minutes through a cron job.
No other options are set. What you see there is all there is (the only field
I dropped from the e-mail was the "PageTop" setting).
> From the sample output you gave it looks like it's more than 50, heck in the
> 2 samples over only a "few seconds" the difference is 70. And depending on
> that interval, you could easily exceed 1000 over a 5-minute polling cycle.
> Without knowing more I'd think your maxbytes setting is wrong by a handful
> of zeros. Your total.log file will show 2 entries and a bunch of zeros if
> the maxbytes is too low.
I thought that MaxBytes is about the maximum expected rate of change, not
the total absolute counter value. (For instance, when monitoring a LAN card,
MaxBytes would be about the maximum number of bytes the card can
transfer PER SECOND, not how many bytes the card can send or receive in a
5-minute interval.) For instance, at the above sample of 61 in 10 seconds,
MaxBytes would be compared to 6.1 for one second, not the absolute increase
of 1830 in 5 minutes, would it?
I'll try to increase MaxBytes to 5000 and see what happens.
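
To make the arithmetic concrete, here is a minimal Python sketch. It assumes
MaxBytes is compared against a per-second rate rather than the raw counter
difference, which is exactly the assumption being questioned here and not a
confirmed description of MRTG internals. The function names are hypothetical.

```python
def rate_per_second(prev: int, curr: int, interval_seconds: float) -> float:
    """Convert two counter readings into a per-second rate."""
    if interval_seconds <= 0:
        raise ValueError("interval must be positive")
    return (curr - prev) / interval_seconds


def within_max(rate: float, max_bytes: float) -> bool:
    """Assumed check: a rate is accepted only if it does not exceed MaxBytes."""
    return 0 <= rate <= max_bytes


# 61 counts in 10 seconds, as in the new samples mentioned above.
short_rate = rate_per_second(0, 61, 10)

# The same pace over a 5-minute (300-second) polling cycle is 1830 counts.
poll_rate = rate_per_second(0, 1830, 300)

print(short_rate, poll_rate)
print(within_max(poll_rate, 50))   # per-second rate against MaxBytes = 50
print(within_max(1830, 50))        # raw 5-minute difference against MaxBytes = 50
```

Under the per-second reading, MaxBytes of 50 is comfortably high enough. Only
if the raw 5-minute difference were compared would 50 (or even 1000) be too
low.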
> Eric Brander