[rrd-users] Re: Compromised value after NaN

Alex van den Bogaerdt alex at ergens.op.het.net
Fri Dec 1 15:26:00 MET 2006


On Fri, Dec 01, 2006 at 01:11:38PM +0100, Markus Wiget wrote:

> > Use timestamps, not "N".
> 
> What's wrong when using "N"? I just read in the man page that it is
> legal. Anyway, if I use explicit update time by replacing

So what?  Did I say "It is illegal to use 'N'" ?  No, I didn't.

Here's why I asked:

* Create rrdnan.rrd
* Feed data
(1 / 14:56:37) Updated N:100
(2 / 14:56:42) Updated N:100
(3 / 14:56:47) Updated N:100
(4 / 14:56:52) Updated N:100
(5 / 14:56:57) Updated N:100
(6 / 14:57:02) NOP
(7 / 14:57:07) Updated N:100
(8 / 14:57:12) Updated N:100
(9 / 14:57:17) Updated N:100
(10 / 14:57:22) Updated N:100
* Fetch data
                         testli

1164981385: nan
1164981390: nan
1164981395: nan
1164981400: 8.6187566667e+01
1164981405: 1.0000000000e+02
1164981410: 1.0000000000e+02
1164981415: 1.0000000000e+02
1164981420: nan
1164981425: nan
1164981430: 1.1806045000e+02
1164981435: 1.0000000000e+02
1164981440: 1.0000000000e+02
1164981445: nan
* Finished

As you can see, my values differ from yours, because our updates
happened at different times.  That happens because you use 'N' !

(side note: 1.1806045000e+02 seems a weird value. I am going to look into this)


> Anyway, if I use explicit update time by replacing
>     s_date="N"
> with
>     s_date="$(date '+%s')"
> then I get this output:
> 
>     [...]
>     1164967125: nan
>     1164967130: nan
>     1164967135: 1.0000000000e+02

Different result again.  Part of the problem seems to be that
"N" and "$(date +%s)" aren't the same.  The first one uses subsecond
precision, the second one does not.  So "N" at 12:34:56 is not really
12:34:56, it is something like 12:34:56.121125479
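
To make the difference concrete, here is a minimal sketch. The fractional timestamp is a made-up value standing in for what "N" might resolve to; it is not taken from a real run.

```shell
# Hypothetical instant with subsecond precision, standing in for "N".
n_like="1164981600.121125479"

# Whole-second view of the same instant, as $(date +%s) would print it:
# strip everything from the decimal point onwards.
whole="${n_like%.*}"

echo "N-like timestamp: ${n_like}"
echo "date +%s style  : ${whole}"
```

The fraction looks negligible, but it is enough to move an update off an exact step boundary, which is presumably why the fetched values differ between runs.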

Anyway, this modified script should produce equal results for all of
us, no matter when run:

    sdate=1164981600

    echo '* Create rrdnan.rrd'
    rrdtool create rrdnan.rrd --step=5 \
        --start 1164981600 \
        DS:testli:GAUGE:7:0:100 \
        RRA:AVERAGE:0.5:1:16

    echo '* Feed data'
    declare -i  i
    declare -i  sdate
    for i in $(seq 1 10); do
        # no need to sleep 5
        sdate=$((sdate+5))
        if [ ${i} -ne 6 ]; then
            rrdtool update rrdnan.rrd --template=testli ${sdate}:100
            echo "(${i}) Updated ${sdate}:100"
        else
            echo "(${i}) NOP"
        fi
    done

    echo '* Fetch data'
    rrdtool fetch rrdnan.rrd AVERAGE --start=1164981600 --end=start+60

    echo '* Finished'

It still produces two NaNs:
1164981625: 1.0000000000e+02
1164981630: nan
1164981635: nan
1164981640: 1.0000000000e+02

 
> #2 Why are there two NaN's?

> I thought it is because RRD always requires two values to build a row:
> the current and the previous value. This would mean for the first NaN,
> the current is missing and for the second NaN the previous is missing.
> But probably I'm totally wrong.

Two values are needed when RRDtool needs the start and end of an
interval to compute a rate from them.  This is the case for a COUNTER.
For a GAUGE, however, this is not needed.
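
As a rough illustration of why a COUNTER needs two samples, the sketch below derives a rate from two readings. The helper name `counter_rate` and the readings are hypothetical, and the arithmetic is a simplification (integers only, no counter wrap handling), not RRDtool's actual code.

```shell
# counter_rate PREV_VALUE PREV_TIME CUR_VALUE CUR_TIME
# Hypothetical helper: rate over the interval between two samples.
# Integer arithmetic only; counter wraps are ignored.
counter_rate() {
    echo $(( ($3 - $1) / ($4 - $2) ))
}

# Made-up counter readings five seconds apart.
rate=$(counter_rate 1000 1164981605 1500 1164981610)
echo "rate: ${rate} per second"
```

A GAUGE value such as 100 is already the quantity of interest, so no previous value is needed to interpret it.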

That said: an unknown period can be longer than just one interval.
In our current script, the update at timestamp 1164981635 arrives 10
seconds after the previous one, which is more than the heartbeat of 7,
so the amount of time its value covers is unknown.  Therefore the
interval 1164981630..1164981635 is also unknown.  Makes sense after all.

If an explicit NaN update had occurred at time 1164981630, the rate
for 1164981625..1164981630 would have been unknown, but the last update
time, 1164981630, would have been known.  The update at 1164981635
would then have succeeded, producing rate 100 instead of NaN.
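
A minimal sketch of that variant follows: instead of skipping the sixth update, it submits an explicit unknown, written as "U" in rrdtool update syntax. The block only builds and prints the update arguments (a dry run); the variable names are illustrative.

```shell
sdate=1164981600
updates=""
i=1
while [ "${i}" -le 10 ]; do
    sdate=$((sdate+5))
    if [ "${i}" -ne 6 ]; then
        value=100
    else
        value=U    # explicit unknown instead of skipping the update
    fi
    updates="${updates} ${sdate}:${value}"
    i=$((i+1))
done

# Dry run: print the updates, one per line.
for u in ${updates}; do
    echo "${u}"
done

# To apply them for real (requires rrdtool and an existing rrdnan.rrd):
# rrdtool update rrdnan.rrd --template=testli ${updates}
```

With the explicit "U" at 1164981630, the update at 1164981635 is only 5 seconds after the previous one, which is within the heartbeat of 7.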


You have "simplified the discussion" by altering heartbeat.  What you
have really done is hide the truth: for timestamps _*exactly*_ equal
to a whole number times step, and with a heartbeat twice as large as
the step size (or more), there is _*no*_ gap:

1164981615: 1.0000000000e+02
1164981620: 1.0000000000e+02
1164981625: 1.0000000000e+02
1164981630: 1.0000000000e+02
1164981635: 1.0000000000e+02

All that's needed is to change the heartbeat from 7 back to 10 (read:
twice the step) and to stop using subsecond precision.

The update at 1164981635 is not more than ${heartbeat} seconds after
1164981625, thus it is valid, thus its rate is used instead of NaN.
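
The rule can be sketched as a simple comparison. This is a simplified model of the heartbeat check as described here, not RRDtool's implementation, and the helper name `within_heartbeat` is hypothetical.

```shell
# within_heartbeat PREV_TIME CUR_TIME HEARTBEAT
# Hypothetical helper: succeeds when the gap between two updates
# does not exceed the heartbeat.
within_heartbeat() {
    [ $(( $2 - $1 )) -le "$3" ]
}

# Same pair of updates, checked against both heartbeat settings.
for heartbeat in 7 10; do
    if within_heartbeat 1164981625 1164981635 "${heartbeat}"; then
        echo "heartbeat ${heartbeat}: update accepted"
    else
        echo "heartbeat ${heartbeat}: interval unknown"
    fi
done
```

The 10-second gap fails the check with heartbeat 7 and passes with heartbeat 10, matching the two fetch results shown.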

HTH
-- 
Alex van den Bogaerdt
http://www.vandenbogaerdt.nl/rrdtool/
