[rrd-users] Using a windowed counter
Donovan Baarda
abo at minkirri.apana.org.au
Wed Jun 14 00:59:40 CEST 2017
A far simpler solution is to treat it as just a counter that resets every hour.
Provided you set sensible min and max rates, rrd will just see the hourly
reset as a counter reset and record an UNKNOWN rate (NaN) for that 5min
(since rrd cannot know how much rain fell between the reset and the sample
taken before it).
If you want to reduce or eliminate the NaNs you can sample at a faster rate
(at least 2x faster) and set the xff on your RRAs to a reasonable 0.5.
This will "average out" the small unknown period using the known periods to
give you a reasonably accurate estimated rate for the 5min period.
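As a rough illustration of that effect, here is a minimal Python sketch. It is a simplified model and not rrdtool's actual algorithm: rates_from_counter and consolidate are hypothetical helper names, a drop in the counter is simply treated as unknown (None), and the sample values are made up.

```python
from typing import List, Optional


def rates_from_counter(samples: List[int]) -> List[Optional[int]]:
    """Per-interval increase of a counter; a decrease (reset) gives None.

    Simplified model: rrdtool arrives at "unknown" through the min/max
    check on the data source, not by looking for a decrease.
    """
    rates: List[Optional[int]] = []
    for prev, cur in zip(samples, samples[1:]):
        rates.append(cur - prev if cur >= prev else None)
    return rates


def consolidate(pdps: List[Optional[int]], per_row: int,
                xff: float) -> List[Optional[float]]:
    """Average per_row points into one row, tolerating unknowns up to xff."""
    rows: List[Optional[float]] = []
    for i in range(0, len(pdps) - per_row + 1, per_row):
        chunk = pdps[i:i + per_row]
        known = [v for v in chunk if v is not None]
        unknown_fraction = (len(chunk) - len(known)) / len(chunk)
        if unknown_fraction > xff or not known:
            rows.append(None)  # too much of the row is unknown
        else:
            rows.append(sum(known) / len(known))
    return rows


# Counter sampled twice per 5-minute slot, with one reset in the middle.
samples = [0, 1, 2, 4, 0, 1, 3, 3, 4]
pdps = rates_from_counter(samples)
print(pdps)                       # [1, 1, 2, None, 1, 2, 0, 1]
print(consolidate(pdps, 2, 0.5))  # [1.0, 2.0, 1.5, 0.5]
print(consolidate(pdps, 2, 0.0))  # [1.0, None, 1.5, 0.5]
```

With xff at 0.5 the row containing the reset is estimated from its one known half; with xff at 0 the same row stays unknown.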
On 13 Jun. 2017 8:18 pm, "Alex van den Bogaerdt" <alex at vandenbogaerdt.nl>
wrote:
> > So, I want to use rrdtool to manage my weather station data. I read data
> > in 5 minute intervals, but my station only tells me the rain counter for
> > the last hour. So the number I get is a summation of my rain counter
> > deltas for the last 12 intervals.
> >
> > For instance, suppose in the last 5 hours I get the values the first line
> > below (one char for each 5-min interval), it means that, for every 5
> > minutes, actual rainfall has been as in the second line.
> >
> > 011111244445678888889998767777765555553444335888889899987677
> > 010000120001121000121000012000110000010100012300011010001210
> >
> > Is there a way to specify that the values to store in the RRD would be
> > calculated as:
> >
> > stored[now] = input[now] - input[now - 1] + stored[now - 12]
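The recurrence above is easy to check outside rrdtool. A minimal Python sketch, assuming every value before the first reading is zero; recover_deltas is a hypothetical name, and the two strings are the example lines quoted above.

```python
from typing import List


def recover_deltas(hourly: List[int], window: int = 12) -> List[int]:
    """Apply stored[n] = input[n] - input[n-1] + stored[n-window].

    Assumption: input and stored are zero before the first reading.
    """
    stored: List[int] = []
    prev = 0
    for i, cur in enumerate(hourly):
        # Rain that has just dropped out of the one-hour window.
        dropped = stored[i - window] if i >= window else 0
        stored.append(cur - prev + dropped)
        prev = cur
    return stored


hourly = [int(c) for c in
          "011111244445678888889998767777765555553444335888889899987677"]
expected = [int(c) for c in
            "010000120001121000121000012000110000010100012300011010001210"]

print(recover_deltas(hourly) == expected)  # True
```

Because each result depends on the result twelve steps earlier, one missed or wrong reading would carry forward into every twelfth value after it.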
>
> This is not a complete answer but hopefully it helps you to tackle this
> problem.
>
> Short answer: no, unless something can be done with the COMPUTE DS type,
> which I do not know enough about.
>
> RRDtool does not work with values. It works with rates. After processing
> its input, the resulting rate may be further processed. The original input
> is not kept.
>
> This said: your 'values' are actually rates: rainfall in the past hour. It
> probably means you will have to use the GAUGE data source type. And then
> your 'values' are in the database, as rates.
> Make sure you understand rates are <something> per second. Just multiply
> by 3600 if your rates are per hour.
>
> Before anything else:
> You will probably end up doing some trial and error. It would be a very
> big help both to you and to the members of this list to have the actual
> values given to rrdtool, and the times at which they were given, so that
> you can recreate the same conditions.
> This also means you will have to use real time stamps, not 'now'.
>
> To make things easier, it would be a very good idea to query your weather
> station not just every 5 minutes, but more precisely at time stamps which
> are whole multiples of 300 seconds. Thus: 12:05, 12:10, 12:15 and not
> 12:07, 12:12, 12:17. Again this means using real time stamps, not 'now'.
> Read about normalization and consolidation:
> http://rrdtool.vandenbogaerdt.nl/process.php to understand why this helps.
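Snapping a time stamp to a whole multiple of 300 seconds is plain integer arithmetic. A minimal Python sketch, where align_down and next_slot are hypothetical helper names and the step of 300 seconds is taken from the 5-minute interval discussed here:

```python
import time

STEP = 300  # seconds, one 5-minute slot


def align_down(ts: int, step: int = STEP) -> int:
    """Largest multiple of step that is not later than ts."""
    return ts - ts % step


def next_slot(ts: int, step: int = STEP) -> int:
    """First multiple of step that is strictly later than ts."""
    return align_down(ts, step) + step


now = int(time.time())
print(align_down(now) % STEP)  # 0
print(next_slot(now) - now)    # seconds to wait before the next reading
```

The aligned number, rather than 'now', is then what would be handed to rrdtool as the time stamp of the update.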
>
> Your graphs should also start and end on nice numbers. That means you will
> have a known number of intervals in your graph. Beware: there have been,
> are, and probably will be off-by-one errors. Sometimes they are fixed,
> sometimes they pop up again. While debugging your solution always keep
> this in mind and modify your times accordingly.
> One example: start of graph is 12:00, end of graph is 13:00, number of
> 5-minute intervals should be 12, but actually was 13 because the interval
> 13:00 to 13:05 was also included. Another time with the same start and end
> times the last interval, 12:55 to 13:00, was not included and I ended up
> with only 11 intervals.
>
> https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error
>
> You may need to change your start, or end time for the graph to compensate
> for this.
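The off-by-one in that example comes from counting slot boundaries instead of slots. A small Python sketch of the arithmetic, using seconds since midnight for 12:00 and 13:00; interval_count and boundary_count are hypothetical names:

```python
STEP = 300  # seconds, one 5-minute slot


def interval_count(start: int, end: int, step: int = STEP) -> int:
    """Number of whole step-sized slots between start and end."""
    return (end - start) // step


def boundary_count(start: int, end: int, step: int = STEP) -> int:
    """Number of slot boundaries (fence posts), both ends included."""
    return (end - start) // step + 1


start = 12 * 3600  # 12:00
end = 13 * 3600    # 13:00
print(interval_count(start, end))  # 12
print(boundary_count(start, end))  # 13
```

Comparing the number of rows you actually get back against interval_count is a quick way to spot an extra or missing slot.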
>
>
> Some ideas to investigate:
>
> * just write a program (C, bash, perl, whatever suits you) that does the
> processing as you described above. Feed the result to RRDtool.
> * use CDEFs with some PREVs and see where that leads you
> * what happens if you just record the data as is, and look at long term
> stats, e.g. one hour per pixel column, after RRDtool has averaged 12
> 5-minute rates into 1 1-hour rate. You can have more than one RRA in your
> database. Define an RRA which collects 12 5-minute intervals per bucket,
> consolidation function AVERAGE.
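For the third idea, a database with both a 5-minute and a 1-hour RRA could be created along the following lines. This Python sketch only builds the argument list for "rrdtool create"; the file name rain.rrd, the data source name rain, the 600 second heartbeat, the xff of 0.5 and the row counts are all assumptions picked for illustration, and create_args is a hypothetical helper.

```python
from typing import List, Optional


def create_args(path: str = "rain.rrd",
                start: Optional[int] = None) -> List[str]:
    """Build an 'rrdtool create' command line (assumed names and sizes)."""
    args = ["rrdtool", "create", path, "--step", "300"]
    if start is not None:
        args += ["--start", str(start)]
    args += [
        "DS:rain:GAUGE:600:0:U",   # assumed DS name, heartbeat, min 0
        "RRA:AVERAGE:0.5:1:2016",  # one row per 5 minutes, 7 days
        "RRA:AVERAGE:0.5:12:720",  # 12 x 5 minutes per row, 30 days
    ]
    return args


print(" ".join(create_args()))
```

If rrdtool is installed, the list could be passed to subprocess.run(create_args(), check=True) to actually create the file.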
>
> Some random tips I can think of right now:
>
> * start with an empty database and fill the first 12 time slots with zero.
> This helps when using PREV.
> You can do so by specifying a start time at least one hour before your
> first entry. Then feed rate 0 to RRDtool. Either set heartbeat high enough
> to allow you to do this with a single update rate 0, or actually do 12
> updates 5 minutes apart.
>
> * keep it simple. Your task is hard enough without all those extra
> features. Add those later when so desired. Focus now on getting the
> numbers right.
>
> * make your graphs big. E.g. 400 pixels, showing just 40 slots of 5
> minutes worth of data (--width 400 --end <some timestamp> --start
> end-12000). Are the rates the same as you put in? If not, investigate.
> Logic error, or fencepost problem?
>
> * In your first few tries, send a rate, dump the database, make sure that
> the resulting rate is what you expect. Unless you find a bug (which I
> doubt, at this point for this part in the process) there is an error in
> your reasoning.
>
> * make sure to use http://oss.oetiker.ch/rrdtool/doc/index.en.html and so
> on.
>
> * keep discussions/questions on-list.
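The first tip above (filling twelve slots with zero before the first real entry) could be scripted roughly like this. A Python sketch under assumptions: zero_fill_plan is a hypothetical helper, the first real time stamp is already a multiple of 300, and the returned strings use the "timestamp:value" form that rrdtool update accepts.

```python
from typing import List, Tuple

STEP = 300  # seconds, one 5-minute slot


def zero_fill_plan(first_real_ts: int, slots: int = 12,
                   step: int = STEP) -> Tuple[int, List[str]]:
    """Return (database start time, zero updates) for the slots before
    the first real reading.

    The database start is one step before the first zero update, because
    an update must be later than the database start.
    """
    first_zero = first_real_ts - slots * step
    db_start = first_zero - step
    updates = [f"{first_zero + i * step}:0" for i in range(slots)]
    return db_start, updates


db_start, updates = zero_fill_plan(1497394500)
print(db_start)     # 1497390600
print(updates[0])   # 1497390900:0
print(updates[-1])  # 1497394200:0
```

The first real reading then follows at 1497394500, one step after the last zero update.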
>
> HTH
> Alex