[rrd-users] Absolute values..

Alex van den Bogaerdt alex at vandenbogaerdt.nl
Mon Nov 17 23:01:18 CET 2008


----- Original Message ----- 
From: "Gavin Landon" <Gavin.Landon at ignitetech.com>
To: <rrd-users at lists.oetiker.ch>
Sent: Monday, November 17, 2008 9:16 PM
Subject: Re: [rrd-users] Absolute values..


> Ok, maybe that's why some of the numbers are off..
>
> Here is one that is the simplest I can make it and let me know, if this
> is the "Normalization" part that I'm just not understanding at the
> moment.
> http://www.chizl.com/rrd/test.png
> http://www.chizl.com/rrd/RRDNumbersOff.zip
>
> There is never a number larger than "1" put in and most of them are "0".
> However the graft goes up to 1, but seems to taper before and after
> (only in some parts) as if there are decimals less than 1 in the data.


You are updating at "non-border" timestamps.

Example numbers and times:
previously the rate was 0
hh:mm:30    You are telling RRDtool the rate was 0 between this update
            and the previous.
hh:mm+1:30  You are telling RRDtool the rate was 1 between this update
            and the previous.
hh:mm+2:30  You are telling RRDtool the rate was 1 between this update
            and the previous.
hh:mm+3:30  You are telling RRDtool the rate was 0 between this update
            and the previous.
hh:mm+4:30  You are telling RRDtool the rate was 0 between this update
            and the previous.

Several slots need to be filled:

a) up to hh:mm:00, which is the slot timestamped hh:mm:00
b) hh:mm:00 .. hh:mm+1:00, which is the slot timestamped hh:mm+1:00
c) hh:mm+1:00 .. hh:mm+2:00, the slot timestamped hh:mm+2:00
d) hh:mm+2:00 .. hh:mm+3:00, the slot timestamped hh:mm+3:00 (the end
   time is always used)
e) hh:mm+3:00 .. hh:mm+4:00, the slot timestamped hh:mm+4:00

Slots a, c and e are the simple cases: the rate is constant during the 
entire slot.  Slots b and d are more complex: the rate changes during the 
interval.

I've chosen the update timestamps to fall exactly halfway through a time 
slot, which makes the calculations a bit easier to follow.

In interval b, the rate starts at 0 but changes to 1 halfway through the 
interval. Remember: numbers are "x per second".  There are 30 seconds in 
this slot at rate 0 (30 times 0 per second) and 30 seconds at rate 1 (30 
times 1 per second). The total for this slot is thus (30 times 0 + 30 times 
1) = 30, spread over 60 seconds. This means that *on*average* you had 0.5 
per second.

In interval d the outcome is the same, only now the first half is at rate 1: 
(30*1+30*0)/60 is also 0.5 per second.


The resulting rates:

a) 0.0
b) 0.5
c) 1.0
d) 0.5
e) 0.0

This is what will be visible in the graph, and it is where the 0.5 you were 
wondering about comes from.
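
The slot calculation above can be sketched in a few lines of Python. This 
is only an illustration of the arithmetic, not RRDtool's actual code. The 
function name slot_averages, the use of seconds relative to hh:mm:00, and 
the assumption that the rate-0 period reaches back to the start of slot a 
(initial_time=-60) are all mine.

```python
STEP = 60  # seconds per slot, as in the example above

# (update time in seconds relative to hh:mm:00, rate reported for the
# interval that ends at this update)
updates = [(30, 0.0), (90, 1.0), (150, 1.0), (210, 0.0), (270, 0.0)]


def slot_averages(updates, step, first_slot_end, n_slots, initial_time):
    """Return the time-weighted average rate for each slot."""
    averages = []
    for i in range(n_slots):
        slot_end = first_slot_end + i * step
        slot_start = slot_end - step
        total = 0.0
        prev_time = initial_time
        for time, rate in updates:
            # Seconds of this update interval that fall inside the slot.
            overlap = min(time, slot_end) - max(prev_time, slot_start)
            if overlap > 0:
                total += rate * overlap
            prev_time = time
        averages.append(total / step)
    return averages


# Slots a..e end at 0, 60, 120, 180 and 240 seconds.
averages = slot_averages(updates, STEP, first_slot_end=0, n_slots=5,
                         initial_time=-60)
print(averages)
```

With the update times and rates from the example, this yields the same 
five values as the table: 0.0, 0.5, 1.0, 0.5, 0.0.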

Now compute a total amount per time slot:

a) 0.0 times 60 = 0
b) 0.5 times 60 = 30
c) 1.0 times 60 = 60
d) 0.5 times 60 = 30
e) 0.0 times 60 = 0

Let's see if that's correct...

if T = hh:mm:00, then:


you update at T+30 with rate 0.  Before this, the rate was also 0 (a given).
you update at T+90 with rate 1.  Between T+30 and T+90 the rate was 1.
you update at T+150 with rate 1.  Between T+90 and T+150 the rate was 1.
you update at T+210 with rate 0.  Between T+150 and T+210 the rate was 0. 
And it stays 0.

Thus: the rate was 1 per second during (150-30=)120 seconds.  That's a total 
of 120.  And indeed: 30+60+30 equals 120.
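
The same cross-check can be written out as a small Python sketch. The 
variable names are my own; the numbers are the ones from the tables and 
the update list above.

```python
STEP = 60  # seconds per slot

# Average rate per slot (a..e), from the table of resulting rates.
slot_rates = [0.0, 0.5, 1.0, 0.5, 0.0]
slot_totals = [rate * STEP for rate in slot_rates]

# Intervals between updates as (start, end, rate), in seconds after T.
intervals = [(30, 90, 1.0), (90, 150, 1.0), (150, 210, 0.0)]
direct_total = sum((end - start) * rate for start, end, rate in intervals)

print(slot_totals)                      # per-slot amounts
print(sum(slot_totals), direct_total)   # both totals should agree
```

Both ways of counting give 120, so normalizing moves the amount between 
slots but does not lose any of it.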


This is the moment where you are going to say that you are dealing with 
integer numbers and cannot have half a connection, half a person, and so on.

Simple example: say that a website has only 24 visitors a day.  That is, on 
average, 1 visitor per hour.  But what if the website has only 12 visitors a 
day?  Yes, that is on average 0.5 visitor per hour, even though visitors 
are either present or not and you will never encounter half a visit.

If you do not want to see fractions, you need to forget about averages.  Use 
max, or min.  The rates from the table of resulting rates above would become:

The resulting rates if using min:

a) 0.0  ->  0.0
b) 0.5  ->  0.0
c) 1.0  ->  1.0
d) 0.5  ->  0.0
e) 0.0  ->  0.0

The resulting rates if using max:

a) 0.0  ->  0.0
b) 0.5  ->  1.0
c) 1.0  ->  1.0
d) 0.5  ->  1.0
e) 0.0  ->  0.0
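
As a rough sketch, the two tables above correspond to taking the smallest 
or largest rate that applied at any moment within a slot, instead of the 
time-weighted average. The dictionary below is my own restatement of the 
worked example; it illustrates the tables and is not a description of how 
RRDtool computes its RRAs internally.

```python
# Rates that applied at some point during each slot of the example.
rates_seen = {
    "a": [0.0],
    "b": [0.0, 1.0],  # 0 for the first half, 1 for the second
    "c": [1.0],
    "d": [1.0, 0.0],  # 1 for the first half, 0 for the second
    "e": [0.0],
}

slot_min = {slot: min(rates) for slot, rates in rates_seen.items()}
slot_max = {slot: max(rates) for slot, rates in rates_seen.items()}

for slot in rates_seen:
    print(slot, slot_min[slot], slot_max[slot])
```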

When doing this you may surprise yourself again once you look at the RRAs 
that cover more time per entry.  Effectively you are monitoring "did I see 
an update during this interval" (if using max) or "has there been a minute 
without an update during this interval" (if using min).  The resulting rate 
will be biased towards 1 (for max) or 0 (for min).
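
A small sketch of that bias, using made-up per-minute values (not taken 
from the RRD in question) and a hypothetical helper named consolidate that 
groups five one-minute values into one longer entry:

```python
def consolidate(values, per_entry, func):
    """Combine each run of per_entry values into one entry using func."""
    return [func(values[i:i + per_entry])
            for i in range(0, len(values), per_entry)]


def average(values):
    return sum(values) / len(values)


# Fifteen one-minute values, invented for illustration only.
per_minute = [0.0, 0.0, 1.0, 0.0, 0.0,
              0.0, 0.0, 0.0, 0.0, 0.0,
              1.0, 1.0, 1.0, 0.0, 1.0]

five_min_max = consolidate(per_minute, 5, max)
five_min_min = consolidate(per_minute, 5, min)
five_min_avg = consolidate(per_minute, 5, average)

print(five_min_max)  # 1 as soon as a single minute in the group was 1
print(five_min_min)  # 0 as soon as a single minute in the group was 0
print(five_min_avg)  # the fractions that max and min hide
```

One active minute out of five is enough to make the max entry 1, and one 
idle minute out of five is enough to make the min entry 0.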

Once you really understand what you are doing, you will modify your setup so 
that you store "x per second" instead of "x" (which you seem to be trying 
right now).


