[rrd-users] Discrepancy between RRD Average and hand-rolled Average

Ruttenberg, Tanya Tanya.Ruttenberg at ssa.gov
Thu Jun 19 13:42:22 CEST 2008

1) Re: Turning NaN's to zero. 

Nope.  I was deleting them altogether and then using a simple mean
function on the remaining array.

2) Inconsistent start and end times.

Turns out this was a problem.  The differing "now" times got past me.
Moreover it turns out in one of the scripts my end time was 18:00.  So
start/end time in one script was "now-5 days"/18:00 and in the other
script start/end times were "now-5 days"/now.  That is even more out of
whack than if I had used "now" as an end time in both scripts.
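
One way to avoid this kind of mismatch is to compute the window once, as
plain epoch seconds, and hand the same two numbers to both scripts.  The
sketch below is only an illustration: the 300-second step and the name
fixed_window are assumptions, not something taken from the scripts in
this thread.

```python
import time

STEP = 300            # assumed RRD step in seconds (hypothetical value)
WINDOW = 5 * 86400    # five days, as used in this thread


def fixed_window(now=None, step=STEP, window=WINDOW):
    """Return (start, end) in epoch seconds, with end on a step boundary."""
    if now is None:
        now = int(time.time())
    end = now - (now % step)   # round down to the last complete step
    start = end - window
    return start, end


start, end = fixed_window()
print(start, end)
```

Two calls made within the same step return the same pair, so both the
fetch script and the graph script would look at the same interval.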

Once I corrected this the values were still not exactly the same, but
they were pretty darn close:

RRD VDEF: device1/device2: 133650.557913/46929.565915
hand-rolled average: device1/device2: 135472.37/47663.41

More importantly to me at this point, the ratio of one to the other in
both results is almost exactly the same. 
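
For what it is worth, the ratio claim can be checked directly from the
four numbers above.  This is just the arithmetic, with variable names
made up for the illustration:

```python
# Averages reported above, in the same units as the RRD data source.
rrd = {"device1": 133650.557913, "device2": 46929.565915}
hand = {"device1": 135472.37, "device2": 47663.41}

ratio_rrd = rrd["device1"] / rrd["device2"]
ratio_hand = hand["device1"] / hand["device2"]

# Relative difference between the two device1/device2 ratios.
rel_diff = abs(ratio_rrd - ratio_hand) / ratio_rrd

print(f"RRD ratio:         {ratio_rrd:.4f}")
print(f"hand-rolled ratio: {ratio_hand:.4f}")
print(f"relative diff:     {rel_diff:.2%}")
```

Both ratios come out a little above 2.84, and they differ by roughly
0.2%, while the individual averages differ by about 1.5%.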

Thank you for the extensive attention you gave to my question Alex.

Tanya Ruttenberg


-----Original Message-----
From: rrd-users-bounces at lists.oetiker.ch
[mailto:rrd-users-bounces at lists.oetiker.ch] On Behalf Of Alex van den
Sent: Thursday, June 19, 2008 4:25 AM
To: rrd-users at lists.oetiker.ch
Subject: Re: [rrd-users] Discrepancy between RRD Average and

On Wed, Jun 18, 2008 at 08:10:51AM -0400, Ruttenberg, Tanya wrote:
> I have a script that fetches RRD data for the last 5 days.  I take the
> average of that data (it happens to be inbound utilization) in the
> usual way: sum the data and divide by the number of pieces of data.
> The result I get is 95097.03.
> I have another script that uses rrdtool graph and VDEF and PRINT to
> get the average of the data from **the same rrdfile**:
> my ($result, $b, $c) = RRDs::graph("/dev/null", "--start",
>    "--end", "$endtime", "DEF:ds0=${file}:$inDS:AVERAGE", 
> "VDEF:ds1=ds0,AVERAGE", "PRINT:ds1:%20lf"
> );
> my $inAve1 = $$result[0];
> $inAve1 =~ s/\s//g;
> The result I get from this is quite different than the other one:
> 119465.02
> Any idea why there might be a discrepancy? I use the same start (now -
> 5 days) and end time (now) for each

The difference will be in NaN entries.  But first a comment on that last
line of yours...

If you use 'now', then chances are *very* slim that you run two commands
with the same time.  You would have to start both programs
*exactly* at the same time, which is hard if not impossible to achieve.

Back to the problem:  NaN entries.  At least that is what I suspect.

The source:
            for (step=0;step<steps;step++) {
                if (finite(data[step*src->ds_cnt])) {
                    sum += data[step*src->ds_cnt];
                    cnt ++;

In English: only add finite entries, and keep count of them.
Later it divides sum by cnt:

            if (dst->vf.op == VDEF_TOTAL) {
                dst->vf.val = sum * src->step;
                dst->vf.when = 0;        /* no time component */
            } else {
                dst->vf.val = sum/cnt;
                dst->vf.when = 0;        /* no time component */

According to what you wrote, you are adding up all the data (presumably
you turn unknowns into zero) and dividing by the total amount of time.
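
The two ways of averaging being contrasted here can be sketched as
follows.  This is a small Python illustration, not the rrdtool code; the
function names and the sample values are made up.

```python
import math


def average_skip_nan(values):
    """Average over finite entries only, as the VDEF excerpt above does."""
    finite = [v for v in values if math.isfinite(v)]
    return sum(finite) / len(finite) if finite else math.nan


def average_nan_as_zero(values):
    """Treat unknowns as zero and divide by the full number of entries."""
    return sum(v if math.isfinite(v) else 0.0 for v in values) / len(values)


# Made-up samples: one unknown entry out of five.
samples = [100.0, 120.0, math.nan, 110.0, 130.0]

print(average_skip_nan(samples))     # 115.0
print(average_nan_as_zero(samples))  # 92.0
```

With one entry in five unknown, the second average is 4/5 of the first.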

That gives these two computations:

total/time_including_nan =  95097.03
total/time_excluding_nan = 119465.02

We know: time_including_nan = 5 days = 432000 seconds.

We know: total/432000 = 95097.03, therefore total has to be
95,097.03 * 432,000 = 41,081,916,960.00

The real total will be about 50GB in 5 days thus 10GB a day.

Now if 41,081,916,960.00 / time_excluding_nan = 119,465.02, then
time_excluding_nan will have to be 343,882.

We now know:
time_including_nan    = 432000
time_excluding_nan    = 343882
and compute: time_nan =  88118

This means about 100%*88118/432000= 20.4% of your entries are NaN.
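
The same back-of-the-envelope derivation, written out so the numbers can
be re-run.  It assumes, as above, that both averages describe the same
total and differ only in the time they are divided by; the variable
names are chosen for the illustration.

```python
avg_including_nan = 95097.03    # hand-rolled result
avg_excluding_nan = 119465.02   # VDEF result
window = 5 * 86400              # 5 days in seconds

# total = average * time, using the average that covers the whole window
total = avg_including_nan * window

# time that must have carried data for the VDEF average to come out right
time_excluding_nan = total / avg_excluding_nan
time_nan = window - time_excluding_nan
nan_fraction = time_nan / window

print(f"total              = {total:,.2f}")
print(f"time_excluding_nan = {time_excluding_nan:,.0f}")
print(f"time_nan           = {time_nan:,.0f}")
print(f"NaN fraction       = {nan_fraction:.1%}")
```

The NaN fraction is simply 1 - 95097.03/119465.02, so it does not depend
on the length of the window.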

Quick check:

VDEF uses 4 out of 5 entries. Your computations use 5 out of 5.
Your calculations will divide by a number 5/4 too high, meaning your
result will be 4/5 of reality. 4/5 * 120,000 is 480,000/5 =
48,000*2 = 96,000.  Close enough to 95097.03; check.

Please let me know if this hunch was correct.

Alex van den Bogaerdt
