[rrd-users] Discrepancy between RRD Average and hand-rolled Average

Alex van den Bogaerdt alex at ergens.op.het.net
Thu Jun 19 10:25:18 CEST 2008


On Wed, Jun 18, 2008 at 08:10:51AM -0400, Ruttenberg, Tanya wrote:
> I have a script that fetches RRD data for the last 5 days.  I take the
> average of that data (it happens to be inbound utilization) in the usual
> way: sum the data and divide by the number of pieces of data.  The
> result I get is 95097.03.
> 
> I have another script that uses rrdtool graph and VDEF and PRINT to get
> the average of the data from **the same rrdfile**:
> 
> my ($result, $b, $c) = RRDs::graph("/dev/null", "--start", "$starttime",
>    "--end", "$endtime", "DEF:ds0=${file}:$inDS:AVERAGE",
> "VDEF:ds1=ds0,AVERAGE", "PRINT:ds1:%20lf"
> );
> my $inAve1 = $$result[0];
> $inAve1 =~ s/\s//g;
> 
> The result I get from this is quite different than the other one:
> 119465.02
> 
> Any idea why there might be a discrepancy? I use the same start (now - 5
> days) and end time (now) for each

The difference will be in NaN entries.  But first a comment on that
last line of yours...

If you use 'now', then chances are *very* slim that you run two
commands with the same time.  You would have to start both programs
*exactly* at the same time, which is hard if not impossible to achieve.


Back to the problem:  NaN entries.  At least that is what I suspect.

The source:
            for (step=0;step<steps;step++) {
                if (finite(data[step*src->ds_cnt])) {
                    sum += data[step*src->ds_cnt];
                    cnt ++;
                };
            }

In English: only add finite entries, and keep count of them.
Later it divides sum by cnt:

            if (dst->vf.op == VDEF_TOTAL) {
                dst->vf.val = sum * src->step;
                dst->vf.when = 0;        /* no time component */
            } else {
                dst->vf.val = sum/cnt;
                dst->vf.when = 0;        /* no time component */
            };



According to what you wrote, you are adding all data (presumably you
alter unknown into zero) and dividing by the total amount of time.
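
To make the difference concrete, here is a minimal sketch in C of the two
ways of averaging.  It is not the rrdtool source: the function names are
made up for this example, and the guard against an empty count is my own
assumption.  The first function mirrors the loop quoted above, the second
is what I suspect your script does.

```c
#include <math.h>
#include <stddef.h>

/* Average over known samples only: NaN (and infinite) entries are
 * skipped and do not count towards the divisor, like the loop above. */
double average_skipping_unknown(const double *data, size_t n)
{
    double sum = 0.0;
    size_t cnt = 0;

    for (size_t i = 0; i < n; i++) {
        if (isfinite(data[i])) {
            sum += data[i];
            cnt++;
        }
    }
    /* Assumption for this sketch: no known samples means "unknown". */
    return cnt > 0 ? sum / (double)cnt : NAN;
}

/* Average with unknown entries treated as zero: they add nothing to
 * the sum, but they still count towards the divisor. */
double average_unknown_as_zero(const double *data, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        if (isfinite(data[i]))
            sum += data[i];
    }
    return n > 0 ? sum / (double)n : NAN;
}
```

With four entries of 100 and one NaN, the first function divides 400 by 4
and the second divides 400 by 5, so the second result is 4/5 of the first.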

This gives two computations:

total/time_including_nan =  95097.03
total/time_excluding_nan = 119465.02

We know: time_including_nan = 5 days = 432000 seconds.

We know: total/432000 = 95097.03, therefore total has to be
95,097.03 * 432,000 = 41,081,916,960.00

If the unknown periods carried traffic at the same average rate, the real
total will be about 119,465.02 * 432,000 = 51.6GB in 5 days, thus about
10GB a day.


Now if 41,081,916,960.00 / time_excluding_nan = 119,465.02, then
time_excluding_nan will have to be 343,882.

We now know:
time_including_nan    = 432000
time_excluding_nan    = 343882
------------------------------
and compute: time_nan =  88118

This means about 100%*88118/432000= 20.4% of your entries are NaN.
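
The same arithmetic can be written as a small C sketch.  The function
names are hypothetical, and it rests on the assumption made above: both
averages come from the same total, only the divisor differs.

```c
/* Seconds of known (non-NaN) data in the window:
 * total = avg_including_nan * window_seconds, and
 * known = total / avg_excluding_nan. */
double known_seconds(double avg_including_nan, double avg_excluding_nan,
                     double window_seconds)
{
    return avg_including_nan * window_seconds / avg_excluding_nan;
}

/* Fraction of the window that must be unknown.  The window length
 * cancels out, leaving 1 - avg_including_nan / avg_excluding_nan. */
double unknown_fraction(double avg_including_nan, double avg_excluding_nan)
{
    return 1.0 - avg_including_nan / avg_excluding_nan;
}
```

Calling known_seconds(95097.03, 119465.02, 432000.0) gives roughly 343882,
and unknown_fraction(95097.03, 119465.02) gives roughly 0.204, matching
the numbers above.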

Quick check:

VDEF uses roughly 4 out of 5 entries. Your computation uses 5 out of 5.
You therefore divide by a number that is 5/4 of what it should be, meaning
your result will be 4/5 of reality. 4/5 * 120,000 = 480,000/5 = 96,000.
Close enough to 95097.03; check.


Please let me know if this hunch was correct.

-- 
Alex van den Bogaerdt
http://www.vandenbogaerdt.nl/rrdtool/


