[rrd-users] rrd graph inter/extrapolation of large U / NaN blocks

Steve Shipway s.shipway at auckland.ac.nz
Wed Feb 27 21:09:45 CET 2013


A couple of ideas come to mind.  I suspect (2) is the one you'll go for.

1. Use previous values.
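   If you do a CDEF in your graphing, like CDEF:newx=x,UN,PREV,x,IF then you can graph newx, and the previous value is used whenever x is unknown.  Use bare PREV here rather than PREV(x): PREV refers to the previous value of newx itself, so the last known value carries across a whole block of unknowns, whereas PREV(x) would only fill the first unknown step.  The result is a horizontal line from the last known value.  This is the default action of native-mode MRTG.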
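
   As a rough sketch of option 1 against the stocklol.rrd from the original question (the output name latest.png, the three-day window and the colour are assumptions of mine, not anything from the thread), the graph call could look something like this.  It uses bare PREV, which refers back to newx itself, so the held value carries through a long run of unknowns:

```shell
# Sketch only: output file, time window and colour are assumptions.
rrd=stocklol.rrd

# If x is unknown, reuse the previous value of this CDEF; otherwise use x.
cdef='CDEF:newx=x,UN,PREV,x,IF'

# Only draw the graph when rrdtool and the RRD file are actually present.
if command -v rrdtool >/dev/null 2>&1 && [ -f "$rrd" ]; then
    rrdtool graph latest.png --start -3d \
        "DEF:x=$rrd:latest:AVERAGE" \
        "$cdef" \
        'LINE1:newx#0000FF:latest (held over gaps)'
fi
```

   Nothing in the RRD file changes; the fill-in exists only in the graph.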

2. Use very big heartbeat values in your RRD.
   If you set an extremely large heartbeat, then when you get the first valid data after a long period of nothing, it will interpolate ALL the intervening values (provided the time since the last update is less than the heartbeat).  Note that updating with 'unknown' is not the same as not updating at all, so for this to work you must skip the updates entirely while there is no data.  I've used something like this to give a continuous graph when data come in very irregularly with potentially big gaps.  The downside is that this modifies the data as stored in the RRD, so you have no record of which data are interpolated and which are 'real'.  The benefit is that you just have to tweak the RRD file and that's it; no special coding.
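
   A minimal sketch of option 2, assuming the stocklol.rrd and DS names from the original question.  The three-day figure is my own guess, chosen to cover a weekend closure (Friday 18:00 to Monday 08:00 is 62 hours); pick whatever exceeds your longest expected gap:

```shell
# Sketch only: the 3-day heartbeat is an assumed value, not from the thread.
rrd=stocklol.rrd

# Longest expected gap is about 62 hours over a weekend; round up to 3 days.
heartbeat=$((3 * 24 * 3600))

# Raise the heartbeat on both data sources of the existing file.
if command -v rrdtool >/dev/null 2>&1 && [ -f "$rrd" ]; then
    rrdtool tune "$rrd" \
        --heartbeat "latest:$heartbeat" \
        --heartbeat "volume:$heartbeat"
fi
```

   The update script then has to stay silent outside trading hours rather than feed in U, otherwise the unknowns are stored as-is.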

3. Use Holt-Winters RRAs to fill in.
  This is the cool solution that requires lots of work.  Set up a Holt-Winters RRA set, with a daily period to predict the value of the data.  Then, when X is unknown, use the HWPREDICT RRA.  As time goes by, you'll get more accurate predictions, as long as you sometimes have data for that time window.
  CDEF:newx=x,UN,hwx,x,IF
This has the benefit of not altering your real data, and giving a convincing pattern for the infill.  The downside is a lot more work, and that the infill accuracy improves with the amount of historical data you collect.  You might want to make a smaller period for the HWPREDICT if you never have data for that time window; this all depends on your data pattern and periodicity.  You'd need to tune the HW params over time.
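
As a sketch of option 3, assuming a fresh file (the name stockhw.rrd is hypothetical) with the same 300-second step as the original: a daily season is 86400/300 = 288 steps.  The alpha and beta values below are only placeholder starting points that would need tuning, and the row count is an assumption.  Creating the HWPREDICT RRA also creates the companion Holt-Winters RRAs implicitly.

```shell
# Sketch only: file name, row count, alpha and beta are assumptions.
rrd=stockhw.rrd
step=300
period=$((86400 / step))   # steps per daily season
rows=8640                  # must be larger than the seasonal period
alpha=0.1
beta=0.0035

if command -v rrdtool >/dev/null 2>&1 && [ ! -e "$rrd" ]; then
    # RRA:HWPREDICT:rows:alpha:beta:seasonal-period
    rrdtool create "$rrd" -s "$step" \
        DS:latest:GAUGE:600:0:U \
        RRA:AVERAGE:0.5:1:8640 \
        "RRA:HWPREDICT:$rows:$alpha:$beta:$period"

    # Graph the real value where known, the prediction where it is not.
    rrdtool graph latest-hw.png --start -3d \
        "DEF:x=$rrd:latest:AVERAGE" \
        "DEF:hwx=$rrd:latest:HWPREDICT" \
        'CDEF:newx=x,UN,hwx,x,IF' \
        'LINE1:newx#0000FF:latest (HW infill)'
fi
```

Here hwx is simply a DEF that reads the HWPREDICT consolidation function from the same file, which is what the CDEF above expects.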

Steve

Steve Shipway
University of Auckland ITS
UNIX Systems Design Lead
s.shipway at auckland.ac.nz
Ph: +64 9 373 7599 ext 86487

________________________________
From: rrd-users-bounces+s.shipway=auckland.ac.nz at lists.oetiker.ch [rrd-users-bounces+s.shipway=auckland.ac.nz at lists.oetiker.ch] on behalf of Joachim Larsson [joachim.larsson at ericsson.com]
Sent: Thursday, 28 February 2013 2:49 a.m.
To: rrd-users at lists.oetiker.ch
Subject: [rrd-users] rrd graph inter/extrapolation of large U / NaN blocks

Hello,

I'm web-scraping a stock-market page which is only open during the daytime, so my RRD database will have huge gaps in the data. I've been trying to make a CDEF that uses old values and tries to inter/extrapolate the gaps, but to no avail.

Can anyone shed some light regarding converting U to a straight line between the previous and last successful rrd data?

I create the rrd with the following;
rrdtool create stocklol.rrd --start $starttime -s 300 \
DS:latest:GAUGE:600:0:U \
DS:volume:COUNTER:600:0:U \
RRA:AVERAGE:0.5:1:8640 \
RRA:AVERAGE:0.5:4:52560

The data is being updated 0800 -> 1800 CET, all other values are NAN / U

Google and other sources don't hint at this, as far as I could tell. There are a lot of references to turning unknown into 0, but that's absolutely not what I want. I checked the CDEF and RPN docs referenced in the rrdgraph pages.

Sincerely,
Joachim


