[rrd-users] rrdtool doesn't detect integer overflow

Alex van den Bogaerdt alex at ergens.op.het.net
Wed Aug 22 03:41:34 CEST 2007

On Wed, Aug 22, 2007 at 12:10:50AM +0200, Alexander Koeppe wrote:

> sometimes my rrdtool graph want to tell me, that the Server did transfer
> about 100 Peto Byte at one time slice.
> It seems that rrdtoo doesn't get the overflow of the bytes counter.

It seems that it does, except that the counter didn't overflow
but was reset or misread.

> As a data source I'm parsing the /proc/net/dev file.
> Does someone know what's wrong? I've read that rrdtool DS:COUNTER
> detects automatically when the counter overflows.

If the previous counter was 2^32+1, and if you reset the counter
(perhaps by resetting the computer?) then its value is 0.

0-(2^32+1) is less than 0, so the overflow mechanism kicks in.

Add 2^32. Still negative? Yes. Add (2^64-2^32).
(Net result: Add 2^64)

This new counter value is then used to compute a rate. Depending
on your time interval, you can get very high rates this way.

Assuming the standard 5-minute interval:

(0-(2^32+1)+2^64)/300 = 61,489,146,898,048,614 = 61.5 * 10^15
which is 61.5 Peta (not Peto).
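
To make the arithmetic above easy to check, here is a minimal Python
sketch of the correction as I described it. It is not rrdtool's actual
code; the names counter_delta and rate are made up for illustration, and
the 300-second step is the assumption stated above.

```python
def counter_delta(prev, cur):
    """Difference between two counter readings, with the wrap
    correction described above (illustrative, not rrdtool source)."""
    delta = cur - prev
    if delta < 0:
        delta += 2**32            # assume a 32-bit wrap first
    if delta < 0:
        delta += 2**64 - 2**32    # still negative: assume a 64-bit wrap
    return delta


def rate(prev, cur, step=300):
    """Per-second rate over one step (integer division keeps it exact)."""
    return counter_delta(prev, cur) // step


# A genuine 32-bit wrap is handled correctly:
print(counter_delta(2**32 - 10, 5))   # 15

# A reset from 2^32+1 to 0 looks like a 64-bit wrap:
print(rate(2**32 + 1, 0))             # 61489146898048614
```

The subtraction alone cannot tell a reset from a wrap; both just show up
as a negative difference.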

Something similar will happen if your counter can go over 2^32
but wraps before 2^64. For instance, a wrap could occur at
9,999,999,999 -> 0

In this case, the counter wrap mechanism gets it wrong. It
isn't perfect and to the best of my knowledge it isn't
configurable either.

The math:

0-9,999,999,999 < 0 -> add 2^32, still < 0, add 2^64-2^32.
Divide by 300 and get 61.5 Peta as well.
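
For comparison, a small Python sketch of what the delta would be if the
real wrap point were known. The helper delta_known_modulus and the wrap
point of 10^10 are assumptions taken from the example above; rrdtool
itself offers no such setting as far as I know.

```python
def delta_known_modulus(prev, cur, modulus):
    """Delta for a counter known to wrap at `modulus` (hypothetical helper)."""
    return (cur - prev) % modulus


prev, cur = 9_999_999_999, 0

# Adding 2^32 and then 2^64-2^32 amounts to adding 2^64 here.
assumed_64bit = (cur - prev) + 2**64
# With the true wrap point the counter advanced by just one.
actual = delta_known_modulus(prev, cur, 10**10)

print(assumed_64bit // 300)   # 61489146879031838
print(actual)                 # 1
```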

A third possibility, probably more likely than the second one
in your case, is a parsing error. When numbers grow really big,
the spaces between columns are sometimes lost, and you may be
relying on those spaces to find a certain column. Perhaps
multicast (a low number) and bytes transmitted (a high number)
were fused together, so that you were no longer looking at bytes
transmitted but at packets transmitted. A variant of this problem
occurs when numbers are written in a scientific kind of notation
and your parser stops too soon: 1E10 is not equal to 1.
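
A defensive parser can at least refuse to guess. The Python sketch below
assumes the usual /proc/net/dev layout of an interface name, a colon,
and then 16 numeric columns (8 receive, 8 transmit). The function names
and the column count are my assumptions, so check them against your own
file.

```python
from decimal import Decimal, InvalidOperation


def to_int(token):
    """Convert a whole token; accepts plain digits and forms like 1E10."""
    try:
        return int(token)
    except ValueError:
        pass
    try:
        value = Decimal(token)
    except InvalidOperation:
        raise ValueError("not a number: %r" % token)
    if not value.is_finite() or value != value.to_integral_value():
        raise ValueError("not a whole number: %r" % token)
    return int(value)


def parse_counters(line, expected_fields=16):
    """Split one interface line into (name, list of ints).

    Raises ValueError when the column count is off, which is what
    happens when two adjacent numbers have been fused together.
    """
    name, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no interface name in %r" % line)
    fields = rest.split()
    if len(fields) != expected_fields:
        raise ValueError("expected %d columns, got %d"
                         % (expected_fields, len(fields)))
    return name.strip(), [to_int(f) for f in fields]
```

With this, a line where multicast and bytes transmitted have run
together yields 15 columns and is rejected, so no bogus value reaches
rrdtool for that interval.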

Alex van den Bogaerdt
