[rrd-users] Using a windowed counter

Alex van den Bogaerdt alex at vandenbogaerdt.nl
Tue Jun 13 12:17:50 CEST 2017


> So, I want to use rrdtool to manage my weather station data. I read data
> in 5 minute intervals, but my station only tells me the rain counter for
> the last hour. So the number I get is a summation of my rain counter
> deltas for the last 12 intervals.
>
> For instance, suppose in the last 5 hours I get the values in the first
> line below (one char for each 5-min interval). It means that, for every 5
> minutes, actual rainfall has been as in the second line.
>
> 011111244445678888889998767777765555553444335888889899987677
> 010000120001121000121000012000110000010100012300011010001210
>
> Is there a way to specify that the values to store in the RRD would be
> calculated as:
>
> stored[now] = input[now] - input[now - 1] + stored[now - 12]

This is not a complete answer but hopefully it helps you to tackle this
problem.

Short answer: no, unless something can be done with the COMPUTE DS type,
about which I do not know enough.

RRDtool does not work with values. It works with rates. After processing
its input, the resulting rate may be further processed. The original input
is not kept.

That said, your 'values' are actually rates: rainfall in the past hour. That
probably means you will have to use the GAUGE data source type, so that
your 'values' end up in the database as rates.
Make sure you understand that rates are <something> per second. Just
multiply by 3600 if your rates are per hour.

Before anything else:
You will probably end up doing some trial and error. It would be a very
big help, both to you and to the members of this list, to record the actual
values given to rrdtool and the times at which they were given, so that you
can recreate the same conditions.
This also means you will have to use real time stamps, not 'now'.

To make things easier, it would be a very good idea to query your weather
station not just every 5 minutes but, more precisely, at time stamps which
are whole multiples of 300 seconds.  Thus: 12:05, 12:10, 12:15 and not
12:07, 12:12, 12:17. Again this means using real time stamps, not 'now'.
Read about normalization and consolidation at
http://rrdtool.vandenbogaerdt.nl/process.php to understand why this helps.
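
A minimal sketch of the alignment idea, assuming time stamps are Unix epoch
seconds and the step is 300 seconds. The function names are made up for
illustration and are not part of RRDtool.

```python
import time

STEP = 300  # seconds; the 5-minute interval discussed above


def align_down(ts: int, step: int = STEP) -> int:
    """Return the latest whole multiple of `step` that is <= ts."""
    return ts - ts % step


def next_boundary(ts: int, step: int = STEP) -> int:
    """Return the first whole multiple of `step` strictly after ts."""
    return align_down(ts, step) + step


if __name__ == "__main__":
    now = int(time.time())
    # A polling script could sleep until next_boundary(now), read the
    # station, and pass that boundary as the explicit update time stamp.
    print("now:", now, "next boundary:", next_boundary(now))
```

The point is that the time stamp handed to rrdtool is the computed boundary,
not whatever the clock says when the script happens to run.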

Your graphs should also start and end on nice numbers. That means you will
have a known number of intervals in your graph. Beware: there have been,
are, and probably will be off-by-one errors. Sometimes they are fixed,
sometimes they pop up again. While debugging your solution always keep
this in mind and modify your times accordingly.
One example: with the start of the graph at 12:00 and the end at 13:00, the
number of 5-minute intervals should be 12, but it actually was 13 because
the interval 13:00 to 13:05 was also included. Another time, with the same
start and end times, the last interval, 12:55 to 13:00, was not included
and I ended up with only 11 intervals.

https://en.wikipedia.org/wiki/Off-by-one_error#Fencepost_error

You may need to change the start or end time of the graph to compensate
for this.
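
To make the fencepost arithmetic concrete, here is a small sketch that counts
boundaries and intervals for the 12:00 to 13:00 example. It only does the
counting; it does not model what any particular RRDtool version includes in
a graph. The helper names are hypothetical.

```python
STEP = 300  # seconds per 5-minute interval


def boundaries(start, end, step=STEP):
    """All step-aligned time stamps in [start, end], both ends included."""
    return list(range(start, end + 1, step))


def intervals(start, end, step=STEP):
    """Intervals (t, t + step) that lie entirely within [start, end]."""
    return [(t, t + step) for t in range(start, end, step)]


if __name__ == "__main__":
    start = 12 * 3600  # 12:00 as seconds since midnight
    end = 13 * 3600    # 13:00
    print(len(boundaries(start, end)), "boundaries (the fence posts)")
    print(len(intervals(start, end)), "intervals (the fence sections)")
```

One hour holds 13 boundaries but only 12 intervals, which is exactly the kind
of difference to look for when a graph shows one slot too many or too few.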


Some ideas to investigate:

* just write a program (C, bash, perl, whatever suits you) that does the
processing as you described above. Feed the result to RRDtool.
* use CDEFs with some PREVs and see where that leads you
* what happens if you just record the data as is and look at long-term
stats, e.g. one hour per pixel column, after RRDtool has averaged twelve
5-minute rates into one 1-hour rate? You can have more than one RRA in your
database. Define an RRA which collects 12 5-minute intervals per bucket,
consolidation function AVERAGE.
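
For the first idea, a minimal sketch of such a preprocessing program. It
implements the formula from the question, stored[now] = input[now] -
input[now - 1] + stored[now - 12], and assumes that everything before the
first reading was zero and that no reading is ever missed. The function name
is made up for illustration.

```python
WINDOW = 12  # number of 5-minute slots in the station's one-hour window


def per_interval(hourly_totals, window=WINDOW):
    """Recover per-interval amounts from rolling one-hour totals.

    Applies stored[n] = input[n] - input[n-1] + stored[n-window],
    treating every value before the first reading as zero.
    """
    stored = []
    prev = 0
    for n, total in enumerate(hourly_totals):
        # Amount that fell out of the window when this slot entered it.
        dropped = stored[n - window] if n >= window else 0
        stored.append(total - prev + dropped)
        prev = total
    return stored


if __name__ == "__main__":
    # First 24 readings from the example in the question, one digit per slot.
    readings = [int(c) for c in "011111244445678888889998"]
    print("".join(str(v) for v in per_interval(readings)))
```

Run on those 24 readings, it prints 010000120001121000121000, the start of
the second line in the question. Each result would then be fed to RRDtool
with its real time stamp. Because every output depends on the one from 12
slots earlier, a single missed or wrong reading carries forward, so the
program would need its own handling for gaps.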

Some random tips I can think of right now:

* start with an empty database and fill the first 12 time slots with zero.
This helps when using PREV.
You can do so by specifying a start time at least one hour before your
first entry. Then feed rate 0 to RRDtool. Either set the heartbeat high
enough to allow you to do this with a single update of rate 0, or actually
do 12 updates 5 minutes apart.

* keep it simple. Your task is hard enough without all those extra
features. Add those later when so desired. Focus now on getting the
numbers right.

* make your graphs big. E.g. 400 pixels, showing just 40 slots of 5
minutes worth of data (--width 400 --end <some timestamp> --start
end-12000). Are the rates the same as you put in? If not, investigate.
Logic error, or fencepost problem?

* In your first few tries, send a rate, dump the database, and make sure
that the resulting rate is what you expect. If it is not, then unless you
have found a bug (which I doubt, at this point, for this part of the
process) there is an error in your reasoning.

* make sure to use the documentation at
http://oss.oetiker.ch/rrdtool/doc/index.en.html and so on.

* keep discussions/questions on-list.
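
As a sketch of the zero-fill tip above (the "12 updates 5 minutes apart"
variant), the following computes a database start time and the twelve
zero updates that precede the first real entry. It assumes the first entry
falls on a 300-second boundary. The function name and the file name rain.rrd
are hypothetical, and the script only prints command lines, it does not run
rrdtool.

```python
STEP = 300   # seconds per slot
SLOTS = 12   # slots to pre-fill with zero (one hour)


def zero_seed_updates(first_entry, step=STEP, slots=SLOTS):
    """Return (create_start, updates) for zero-filling before first_entry.

    create_start lies one step before the first zero update, so every
    update time stamp is later than the database start time.
    updates is a list of 'timestamp:0' strings, oldest first.
    """
    create_start = first_entry - (slots + 1) * step
    updates = [f"{first_entry - k * step}:0" for k in range(slots, 0, -1)]
    return create_start, updates


if __name__ == "__main__":
    first_entry = 1497348000  # example time stamp, a multiple of 300
    start, updates = zero_seed_updates(first_entry)
    print("use --start", start, "when creating the database")
    for u in updates:
        print("rrdtool update rain.rrd", u)
```

After applying the updates, dump the database and check that the twelve
slots before the first entry really contain 0 before going any further.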

HTH
Alex



