[rrd-users] rrdtool doesn't detect integer overflow
Alex van den Bogaerdt
alex at ergens.op.het.net
Wed Aug 22 03:41:34 CEST 2007
On Wed, Aug 22, 2007 at 12:10:50AM +0200, Alexander Koeppe wrote:
> sometimes my rrdtool graph want to tell me, that the Server did transfer
> about 100 Peto Byte at one time slice.
> It seems that rrdtoo doesn't get the overflow of the bytes counter.
It seems that it does, except that the counter didn't overflow
but was reset or misread.
> As a data source I'm parsing the /proc/net/dev file.
> Does someone know what's wrong? I've read that rrdtool DS:COUNTER
> detects automatically when the counter overflows.
If the previous counter was 2^32+1, and if you reset the counter
(perhaps by resetting the computer?) then its value is 0.
0-(2^32+1) is less than 0, so the overflow mechanism kicks in.
Add 2^32. Still negative? Yes. Add (2^64-2^32).
(Net result: add 2^64.)
This new counter value is then used to compute a rate. Depending
on your time interval, you can get very high rates this way.
Assuming the standard 5-minute interval:
(0-(2^32+1)+2^64)/300 = 61,489,146,898,048,614 = 61.5 * 10^15
which is 61.5 Peta (not Peto).
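To make the arithmetic above easy to replay, here is a minimal
sketch in Python. It only mirrors the steps as I described them;
it is not taken from the rrdtool source, and the function name
corrected_delta and the 300-second step are assumptions chosen
for illustration.

```python
def corrected_delta(prev, curr):
    """Difference between two counter readings, with the wrap
    correction described above applied (illustrative sketch only)."""
    delta = curr - prev
    if delta < 0:
        # First guess: a 32-bit counter wrapped.
        delta += 2**32
    if delta < 0:
        # Still negative: assume a 64-bit counter wrapped instead.
        delta += 2**64 - 2**32
    return delta


STEP = 300  # seconds; the standard 5-minute interval assumed above

# A genuine 32-bit wrap is handled correctly.
small = corrected_delta(2**32 - 10, 5)

# A reset from 2^32+1 to 0 is mistaken for a 64-bit wrap.
bogus = corrected_delta(2**32 + 1, 0)

print(small)          # the true increase across the wrap
print(bogus // STEP)  # the bogus per-second rate
```

The first call returns the true increase across a real 32-bit
wrap. The second shows how a reset gets treated as a 64-bit wrap
and produces the huge rate.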
Something similar will happen if your counter can go over 2^32
but wraps before 2^64. For instance, a wrap could occur at
9,999,999,999 -> 0
In this case, the counter wrap mechanism gets it wrong. It
isn't perfect and to the best of my knowledge it isn't
configurable either.
The math:
0-9,999,999,999 < 0 -> add 2^32, still < 0, add 2^64-2^32.
Divide by 300 and get 61.5 Peta as well.
A third possibility, probably more likely than the second one
in your case, is a parsing error. When numbers grow really big,
the spaces between columns are sometimes lost. You may be relying
on those spaces to find a certain column. Perhaps you were no
longer looking at bytes transmitted but rather at packets
transmitted, simply because multicast (a low number) and bytes
transmitted (a high number) were fused together. A variant of
this problem occurs when numbers are written in a scientific
kind of notation and your parser stops too soon: 1E10 is not
equal to 1.
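One way to guard against this kind of misparse is to refuse any
line that does not have the expected shape, instead of trusting
a column position. The sketch below is in Python and makes some
assumptions: that each interface line in /proc/net/dev has the
interface name, a colon, and then 16 numeric columns (8 receive,
8 transmit, with bytes first in each group), and that skipping
an update is acceptable when a line looks wrong. The function
name parse_net_dev_line is made up for this example.

```python
def parse_net_dev_line(line):
    """Return (interface, rx_bytes, tx_bytes) from one interface line.

    Raises ValueError when the line does not have the assumed layout,
    so a fused or oddly formatted column is never silently misread.
    """
    name, sep, rest = line.partition(":")
    if not sep:
        raise ValueError("no ':' after interface name")
    fields = rest.split()
    if len(fields) != 16:
        # Two fused columns leave only 15 fields.
        raise ValueError("expected 16 columns, got %d" % len(fields))
    for field in fields:
        if not field.isdigit():
            # Rejects things like '1E10' instead of reading them as 1.
            raise ValueError("not a plain integer: %r" % field)
    values = [int(field) for field in fields]
    return name.strip(), values[0], values[8]


good = "  eth0: 100 2 0 0 0 0 0 5 200 3 0 0 0 0 0 0"
print(parse_net_dev_line(good))
```

When the function raises, the caller can skip that update (or
store an unknown value) rather than feed a wrong number to
rrdtool.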
HTH
--
Alex van den Bogaerdt
http://www.vandenbogaerdt.nl/rrdtool/