[smokeping-users] Re: RRDTool Filling In Unknown Values For Missing Pings

Niko Tyni ntyni+smokeping-users at mappi.helsinki.fi
Sat Feb 5 13:53:51 MET 2005


Hi Kennedy,

I'm quoting the longish mail; see the comments below.  (BTW, your
mailer is broken: it's using 8-bit quote characters but claiming the
message is 7-bit ASCII.)

In article <20050106160149.54654.qmail at web54607.mail.yahoo.com>, 
hkclark at yahoo.com wrote:

> # Create a simplified Smokeping-like RRD
> rrdtool create test_01.rrd --start 1000000000 --step 300 \
>   DS:loss:GAUGE:600:0:20 DS:ping1:GAUGE:600:0:180  \
>   RRA:AVERAGE:0.5:1:1008 RRA:AVERAGE:0.5:12:4320  \
>   RRA:MIN:0.5:12:4320 RRA:MAX:0.5:12:4320  \
>   RRA:AVERAGE:0.5:144:720 RRA:MAX:0.5:144:720 \
>   RRA:MIN:0.5:144:720 
> # 
> # Load in some dummy data
> rrdtool update test_01.rrd 1000000200:4:5
> rrdtool update test_01.rrd 1000000521:4:5
> rrdtool update test_01.rrd 1000000821:8:9
> rrdtool update test_01.rrd 1000001121:U:5
> rrdtool update test_01.rrd 1000001421:U:5
> # Dump
> rrdtool dump test_01.rrd > test_01.xml
> 
> Note that it's putting in 3 values followed by 2 "U's"
> for the DS "loss".
> 
> The dump shows:
>     Time        loss               ping1
>     1000000200  4.0000000000e+000  5.0000000000e+000
>     1000000500  4.0000000000e+000  5.0000000000e+000
>     1000000800  7.7200000000e+000  8.7200000000e+000
>     1000001100  8.0000000000e+000  5.2800000000e+000
>     1000001400  NaN                5.0000000000e+000
> 
> So, the loss value in the "1000000800" (3rd) row is
> the "equi-spaced points on an interpolated curve"
> issue, right?  I'm OK with that; yeah, as you point
> out, it's a little "weird" for ping data, but it
> doesn't fundamentally change the results.  However,
> the "1000001100" (4th) row is a little different: now,
> rather than storing a "NaN", it's taking the
> "un-interpolated value" from the previous timeslot.

The RRD logic goes like this: in the 300-second interval ending at
1000000800 (500-800 for short), there's 21 seconds of value 4 and the
rest (279s) of value 8. This leads to (21*4+279*8)/300 = 7.72. This is
the 'equi-spaced points on an interpolated curve', yes.
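
A quick way to check that arithmetic for both data sources in the row
stamped 1000000800. This is only a Python sketch of the time-weighted
average; the function name is made up for illustration and none of it
is RRDtool code.

```python
# Time-weighted average over one 300 s step.
# For the step ending at 1000000800, the update at ...521 closes the
# stretch holding the old value (21 s) and the update at ...821 supplies
# the new value for the remaining 279 s.

def weighted_average(segments):
    """segments: list of (seconds, value) pairs that cover one step."""
    total_seconds = sum(seconds for seconds, _ in segments)
    return sum(seconds * value for seconds, value in segments) / total_seconds

loss = weighted_average([(21, 4), (279, 8)])    # loss went from 4 to 8
ping1 = weighted_average([(21, 5), (279, 9)])   # ping1 went from 5 to 9
print(round(loss, 2), round(ping1, 2))
```

Both results agree with the third row of the dump quoted above.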

The next interval has 21 seconds of known value 8 and the rest
unknown.  RRDtool discards the unknown part and assumes the rest of
the interval equals the known part, so it stores 8 instead of NaN.
This is hard-coded into RRDtool; you can't change it with the database
parameters.
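
To make the rule concrete, here is a simplified Python model of it. It
is an assumption-laden sketch, not RRDtool's actual code: it ignores
the heartbeat and the consolidation into the coarser RRAs, and the
function name is invented. Each update value is taken to hold from the
previous update up to its own timestamp, and each 300 s step is
averaged over its known seconds only.

```python
import math

STEP = 300

def consolidate(updates, start, step=STEP):
    """Average each step over the seconds that have a known value.

    updates: list of (timestamp, value) pairs, value None meaning 'U'.
    Returns {step_end: average, or NaN if the step has no known seconds}.
    """
    known_sum = {}  # step_end -> sum of value * seconds
    known_sec = {}  # step_end -> number of seconds with a known value
    prev = start
    for ts, value in updates:
        t = prev
        while t < ts:
            step_end = (t // step + 1) * step
            seg_end = min(ts, step_end)
            known_sum.setdefault(step_end, 0.0)
            known_sec.setdefault(step_end, 0)
            if value is not None:
                known_sum[step_end] += value * (seg_end - t)
                known_sec[step_end] += seg_end - t
            t = seg_end
        prev = ts
    last_complete = (prev // step) * step  # drop the unfinished step
    return {
        end: (known_sum[end] / known_sec[end] if known_sec[end] else math.nan)
        for end in sorted(known_sum)
        if end <= last_complete
    }

# The "loss" updates from the quoted test case.
loss_updates = [
    (1000000200, 4), (1000000521, 4), (1000000821, 8),
    (1000001121, None), (1000001421, None),
]
rows = consolidate(loss_updates, start=1000000000)
for end, value in rows.items():
    print(end, value)
```

Under these assumptions the model gives 8 for the step ending at
1000001100, because only its first 21 seconds are known, and NaN for
the step ending at 1000001400, which has no known seconds at all.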

The core of the problem is thus that Smokeping is using NaN as the
round-trip time for the missing pings, but RRDtool treats this as
missing data and prefers the known data over it, so we get the false
smoke.

One way to fix this might be to store the RTT of the missing pings as
the average of the others instead of NaN. That should clean up the
smoke. I haven't tried this, so I could be missing something obvious.
I'm not sure what should be done if all the pings were lost.
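
Roughly what I have in mind, as an untested Python sketch rather than
Smokeping code (Smokeping itself is Perl, and the helper name here is
hypothetical). The all-lost case is deliberately left as is, since
that question is still open.

```python
def fill_missing_rtts(rtts):
    """Replace lost pings (None) with the mean of the received ones.

    Hypothetical helper for illustration only. If every ping was lost
    there is nothing to average, so the list is returned unchanged.
    """
    received = [rtt for rtt in rtts if rtt is not None]
    if not received:
        return list(rtts)
    mean = sum(received) / len(received)
    return [mean if rtt is None else rtt for rtt in rtts]

filled = fill_missing_rtts([5.0, None, 7.0, None])
print(filled)
```

The loss count would still be stored separately, so the number of
missing pings is not hidden by the filled-in values.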

Tobi, any ideas?

Cheers,
-- 
niko
