[rrd-users] Re: DERIVE or COUNTER

BAARDA, Don don.baarda at baesystems.com
Mon Aug 20 03:05:22 MEST 2001


G'day,

> -----Original Message-----
> From: Michael Wells [mailto:michael at wells.org.uk]
> Sent: Sunday, August 19, 2001 12:27 AM
[...]
> > No, setting maximum values for the counters is not a complete solution. If
> > the counters get reset when the value of the counter is high, you will get a
> > wrap that produces a wrong result that is also less than the maximum you set
> > and thus will be recorded as a value. You still get wrong data in your
> > database.
> 
> I read the manual page and see your point. Would it
> be fair to say that *any* COUNTER that is zeroed on reboot would be
> better off being a DERIVE? I'm not sure that detecting the difference
> between an overflow and a zeroing is worth attempting, and that seems to
> mean this is the only 'correct' solution.

If you look back through the archives you will see that Alex and I had a big
discussion about this. The conclusion I came to (which might be different to
Alex's conclusion :-) was this:

There are two options: DERIVE with min=0 or COUNTER with max=<somevalue>.
Both will eliminate unreasonable spikes.

DERIVE will correctly give "Unknown" for all counter resets, but also
incorrectly give "Unknown" for all legitimate counter wraps.
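As a rough illustration of the DERIVE case, here is a minimal Python sketch. It is not rrdtool code: `derive_rate` is a hypothetical helper that only models "rate = difference / step, anything below min becomes Unknown", and the sample values and the 300 second step are made up.

```python
from typing import Optional


def derive_rate(prev: int, curr: int, step: float,
                min_rate: Optional[float] = 0.0) -> Optional[float]:
    """Model of a DERIVE data source with a minimum; None stands for Unknown."""
    rate = (curr - prev) / step  # DERIVE keeps the sign of the difference
    if min_rate is not None and rate < min_rate:
        return None              # below min: stored as Unknown
    return rate


STEP = 300  # seconds between samples (assumed)

# Ordinary interval: counter simply increased.
normal = derive_rate(1_000_000, 1_300_000, STEP)

# Counter reset (e.g. reboot): new value is lower, so the rate is negative.
after_reset = derive_rate(4_000_000_000, 150_000, STEP)

# Legitimate 32 bit wrap: also looks like a negative rate, so also Unknown.
after_wrap = derive_rate(2**32 - 100_000, 200_000, STEP)

print(normal, after_reset, after_wrap)
```

The reset and the wrap are indistinguishable here: both show up as a negative difference and both are discarded.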

COUNTER will give valid results for all legitimate counter wraps, mark some
counter resets correctly as "Unknown", but will give invalid values for some
other counter resets. The probability of incorrectly giving an invalid value
instead of "Unknown" for a counter reset depends on how much the counter
increments between samples relative to the counter size. A small "max", a
short "step", or a 64 bit counter makes the probability of an invalid value
extremely low (ask if you want formulas for the probabilities).
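The COUNTER case can be sketched the same way. This is a simplified model, not rrdtool's implementation: `counter_rate` assumes a single fixed counter width, and `p_invalid_on_reset` is my own back-of-envelope estimate (not necessarily the formula offered above). It assumes the counter value just before a reset is uniformly distributed over the counter range and that the count accumulated after the reset is negligible, which gives roughly max * step / 2^bits.

```python
from typing import Optional


def counter_rate(prev: int, curr: int, step: float, max_rate: float,
                 bits: int = 32) -> Optional[float]:
    """Simplified COUNTER model; None stands for Unknown."""
    delta = curr - prev
    if delta < 0:
        delta += 2**bits         # a decrease is assumed to be a wrap
    rate = delta / step
    if rate > max_rate:
        return None              # above max: stored as Unknown
    return rate


def p_invalid_on_reset(max_rate: float, step: float, bits: int = 32) -> float:
    """Rough chance that a reset is mistaken for a wrap and stored as a value.

    Assumes the pre-reset value is uniform over [0, 2**bits) and the
    post-reset count is negligible (both are assumptions of this sketch).
    """
    return min(1.0, max_rate * step / 2**bits)


STEP = 300              # seconds between samples (assumed)
MAX = 12_500_000        # bytes/s, i.e. a 100 Mbit/s link (assumed)

wrap = counter_rate(2**32 - 100_000, 200_000, STEP, MAX)     # valid rate
reset_low = counter_rate(100_000_000, 150_000, STEP, MAX)    # Unknown
reset_high = counter_rate(4_000_000_000, 150_000, STEP, MAX) # bogus value

p32 = p_invalid_on_reset(MAX, STEP, bits=32)
p64 = p_invalid_on_reset(MAX, STEP, bits=64)
print(wrap, reset_low, reset_high, p32, p64)
```

With these made-up numbers a reset at a low counter value is caught by max, a reset at a high counter value slips through as a plausible-looking rate, and the estimate is large for a 32 bit counter but negligible for a 64 bit one.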

So it depends on what you consider more serious: marking all legitimate wraps
as "Unknown", or getting an invalid value for some counter resets. The
probability of COUNTER "getting it wrong" might also sway you.
Unfortunately, when you most want to avoid "Unknowns" for counter wraps is
when they frequently occur, which is also when the probability of getting a
counter reset wrong is highest. Perhaps the frequency of counter wraps
relative to resets is worth considering too.

The best solution I have heard is using a front-end that checks router
uptime to detect resets. This, in combination with COUNTER should be the
best. However, this will still potentially make mistakes if the router can
reset its counters without affecting uptime. If in doubt, use a 64 bit
counter, as even for a gigabit interface it will be 4676 years before there
is a counter-wrap.
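A minimal sketch of such a front-end, plus the wrap-time arithmetic, might look like the following. The function names are hypothetical, how uptime and the counter are actually fetched (e.g. via SNMP) is left out, and the wrap-time figure assumes the counter counts octets on a link running flat out.

```python
from typing import Optional


def value_for_update(prev_uptime: Optional[int], curr_uptime: int,
                     counter: int) -> str:
    """Choose the value to hand to 'rrdtool update' for a COUNTER source.

    If uptime went backwards, the device has presumably restarted, so the
    sample is passed as 'U' (unknown) rather than letting the drop be
    treated as a wrap. Note that an uptime counter that itself wraps would
    trigger a false positive, which only costs an Unknown.
    """
    if prev_uptime is not None and curr_uptime < prev_uptime:
        return "U"
    return str(counter)


def seconds_to_wrap(bits: int, bytes_per_second: float) -> float:
    """Time for an octet counter of the given width to wrap at a steady rate."""
    return 2**bits / bytes_per_second


GIGABIT = 1_000_000_000 / 8           # bytes per second on a saturated link
SECONDS_PER_YEAR = 365.25 * 24 * 3600

print(value_for_update(500_000, 530_000, 42))   # uptime increased: keep value
print(value_for_update(500_000, 1_200, 42))     # uptime dropped: unknown
print(seconds_to_wrap(32, GIGABIT))             # 32 bit: about half a minute
print(seconds_to_wrap(64, GIGABIT) / SECONDS_PER_YEAR)  # 64 bit: millennia
```

The 32 bit figure shows why wraps can be frequent on fast links, which is exactly where the COUNTER misjudgement is most likely.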

ABO
