[rrd-users] Simple questions about RRD
Simon Hobson
linux at thehobsons.co.uk
Mon Dec 7 20:46:57 CET 2009
Jean-Yves Avenard wrote:
>It shows the maximum as being 2972W ; which is indeed the maximum with
>the actual data at the highest resolution.
>But it's definitely not the maximum of 5 minutes average as the graph
>shows: nothing is over 2200W
>
>That graph is created with :
>
>$ret = exec("$RRDTOOL graph $name -l 0 \
>-t '$title' \
>-x $legend \
>--step $res --start e-$start --end $timestamp \
>-w $width -h $height \
>DEF:total=currentcost.rrd:total:AVERAGE DEF:ch2=currentcost.rrd:ch2:AVERAGE \
>DEF:solar=solarprod.rrd:total:AVERAGE DEF:ch1=currentcost.rrd:ch1:AVERAGE \
>DEF:totalmin=currentcost.rrd:total:MIN:reduce=AVERAGE \
>DEF:ch2min=currentcost.rrd:ch2:MIN:reduce=AVERAGE \
>DEF:solarmin=solarprod.rrd:total:MIN:reduce=AVERAGE \
>DEF:ch1min=currentcost.rrd:ch1:MIN:reduce=AVERAGE \
>DEF:totalmax=currentcost.rrd:total:MAX:reduce=AVERAGE \
>DEF:ch2max=currentcost.rrd:ch2:MAX:reduce=AVERAGE \
>DEF:solarmax=solarprod.rrd:total:MAX:reduce=AVERAGE \
>DEF:ch1max=currentcost.rrd:ch1:MAX:reduce=AVERAGE \
Why are you using reduce=AVERAGE on a maximum ?
Looking back at your first post, it looks like your DS is at 60s
intervals, therefore a graphing interval of 300s will require further
consolidation. The data in the graphs is quite peaky, so this is
likely to have a significant effect - eg at around time 21.5 there is
quite a narrow peak and it's quite likely that the average of 5 max
values is considerably less than the max of 5 values. For example, it
could be 4 samples of 0.6k and one of 5.6k, or it could be 5 samples
of 1.6k - quite a difference.
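To put numbers on that, here is a small Python sketch. It is not rrdtool itself, just the arithmetic: it compares averaging five 1-minute maxima (which is my understanding of what reduce=AVERAGE does with them) against taking their maximum. The sample values are the hypothetical ones from the paragraph above, and the function names are made up for illustration.

```python
# Two hypothetical sets of five 1-minute MAX values, in kW.
peaky = [0.6, 0.6, 5.6, 0.6, 0.6]
steady = [1.6, 1.6, 1.6, 1.6, 1.6]


def reduce_average(values):
    """Average of the values: the effect of averaging stored maxima."""
    return sum(values) / len(values)


def reduce_max(values):
    """Largest of the values: the true peak within the interval."""
    return max(values)


for name, values in (("peaky", peaky), ("steady", steady)):
    print(f"{name}: average of maxima = {reduce_average(values):.1f}k, "
          f"max of maxima = {reduce_max(values):.1f}k")
```

Both sets average to 1.6k, but only taking the max recovers the 5.6k peak in the first one.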
Having said that, I can't see where the difference is coming from,
unless the VDEF is using the raw data before it's been consolidated -
someone would have to look at the source to see if that's possible.
I will say that it can make things a lot easier to see if you create
a file of test data - carefully chosen to exercise the functionality
in doubt. So perhaps something like this :
1260214800:1
1260214860:1
1260214920:6
1260214980:1
1260215040:1
1260215100:1
Now that should produce the following values when consolidated over
the 5 minutes from 1260214800 to 1260215100 :
min = 1
ave = 2
max = 6
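As a cross-check on those expected values, here is a rough Python sketch that consolidates the test data by hand. It assumes each value describes the 60 seconds ending at its timestamp, so the sample stamped 1260214800 belongs to the previous 5 minute period; it does not reproduce whatever interpolation rrdtool does for updates that are not aligned to the step. The function name is hypothetical.

```python
# Test data from above: "timestamp:value", one sample every 60 s.
samples = """\
1260214800:1
1260214860:1
1260214920:6
1260214980:1
1260215040:1
1260215100:1
"""


def consolidate(text, start, end):
    """Return (min, average, max) of samples with start < timestamp <= end."""
    values = []
    for line in text.splitlines():
        ts, value = line.split(":")
        if start < int(ts) <= end:
            values.append(float(value))
    return min(values), sum(values) / len(values), max(values)


lo, avg, hi = consolidate(samples, 1260214800, 1260215100)
print(f"min = {lo:g}, ave = {avg:g}, max = {hi:g}")
# prints: min = 1, ave = 2, max = 6
```

If the graph or a VDEF reports something different for this period, you know which step to look at.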
>Now I have an extra question on how the RRA/consolidation works...
>
>Say I store 5 years of 5 minutes average ; is there any points of also
>storing 5 years of 30 minutes average, 5 years of 2 hours average etc?
>
>I would have assumed that it wouldn't matter ; except when retrieving
>the 30 minutes/2 hours average extra processing is required as it
>needs to retrieve more values.
>
>Now; if I use reduce=AVERAGE for all my graph like
>DEF:total=currentcost.rrd:total:AVERAGE DEF:ch2=currentcost.rrd:ch2:AVERAGE \
>DEF:solar=solarprod.rrd:total:AVERAGE DEF:ch1=currentcost.rrd:ch1:AVERAGE \
>DEF:totalmin=currentcost.rrd:total:MIN:reduce=AVERAGE \
>DEF:ch2min=currentcost.rrd:ch2:MIN:reduce=AVERAGE \
>DEF:solarmin=solarprod.rrd:total:MIN:reduce=AVERAGE \
>DEF:ch1min=currentcost.rrd:ch1:MIN:reduce=AVERAGE \
>DEF:totalmax=currentcost.rrd:total:MAX:reduce=AVERAGE \
>DEF:ch2max=currentcost.rrd:ch2:MAX:reduce=AVERAGE \
>DEF:solarmax=solarprod.rrd:total:MAX:reduce=AVERAGE \
>DEF:ch1max=currentcost.rrd:ch1:MAX:reduce=AVERAGE \
>
>Do I even need to create the RRD with MIN and MAX ? or it can be all
>done from the AVERAGE RRD database ?
>In my case, I create the RRD with:
>100 day of 1 minute average
>5 years of 5 minutes average
>
>rrdtool create currentcost.rrd -s 60 \
>DS:total:GAUGE:300:0:U \
>DS:ch1:GAUGE:300:0:U \
>DS:ch2:GAUGE:300:0:U \
>DS:ch3:GAUGE:300:0:U \
>RRA:AVERAGE:0.5:1:144000 \
>RRA:AVERAGE:0.5:5:525600 \
>RRA:MIN:0.5:1:144000 \
>RRA:MIN:0.5:5:525600 \
>RRA:MAX:0.5:1:144000 \
>RRA:MAX:0.5:5:525600
>
>Not creating a MIN and MAX one would reduce considerably the size rrd file.
You seem to have some very woolly thinking there. If you only store
average, then you cannot possibly derive min and max from
consolidated data. Eg, suppose someone boils the kettle and it takes
3kW for one minute (and it happens to coincide nicely with one sample
period). You could have a 5 minute period with values of 3,0,0,0,0
which average to 0.6. There is no way to know the minimum other than
it's <= 0.6, and no way to know the max other than it's >= 0.6. In
this case they are 0 and 3, but they could just as well be 0.6 if it
was a steady load.
Extend that to 30 minute and 2 hour consolidations and the difference
widens - eg you boil the kettle but otherwise use no power and the
max is still 3, the min is still 0, but the average is now only 0.1
and 0.025 respectively.
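The kettle example is easy to check with a few lines of Python. This is only a sketch of the arithmetic, assuming one sample per minute and a single 3kW minute with nothing else drawing power; the function name is made up for illustration.

```python
def summarise(samples):
    """Return (min, average, max) for one consolidation period."""
    return min(samples), sum(samples) / len(samples), max(samples)


# One minute at 3 kW, every other minute of the period at 0 kW.
for minutes in (5, 30, 120):
    samples = [3.0] + [0.0] * (minutes - 1)
    lo, avg, hi = summarise(samples)
    print(f"{minutes:3d} min period: min = {lo:g}, ave = {avg:g}, max = {hi:g}")
```

The min and max stay at 0 and 3 however long the period, while the average shrinks, so the average alone tells you less and less about either of them.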
Now, if you are only interested in 5 minute smoothed data and NOT 1
minute min and max, then you are correct that you don't need to keep
separate min and max for a ds - provided you do not keep, or try to
graph, any further consolidation of that data.
There are two main reasons people use consolidated storage :
1) To reduce storage requirements. Most people aren't bothered by the
fine detail once it's moderately aged. So for example, at work I keep
5 minute samples for only 2 days, I keep 1/2 hour consolidations for
longer, 2 hour consolidations for longer still, and 1 day
consolidations for 2 years. So I can graph in detail, or over a long
time, but not both - I can't plot a detailed graph for data a year old
and that's fine by us.
2) Reduce processing to generate graphs. Naturally, if you plot a
year long graph with 5 minute samples then the graphing program has
to read in and consolidate a lot of data - and that takes time and
memory. A while ago I realised I'd made a mistake and was storing 12
hour data rather than 24 hour data - and graphing at 24 hour
resolution. I found that re-working things significantly decreased
runtime and memory requirements - particularly on the complex graphs
with hundreds of data sources in them.
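Both points come down to simple row counts. As a rough Python sketch (plain arithmetic, nothing rrdtool-specific; the function names are hypothetical): the first function works out how much history an RRA holds, using the step and row counts from the create command quoted above, and the second how many rows have to be read to cover a year at various resolutions.

```python
DAY = 86400  # seconds in a day


def retention_days(step, steps_per_row, rows):
    """Days of history in an RRA: each row spans step * steps_per_row seconds."""
    return step * steps_per_row * rows / DAY


def rows_for_span(span_days, resolution):
    """Rows needed to cover span_days at the given resolution (in seconds)."""
    return span_days * DAY // resolution


# The two RRA sizes from the quoted create command (step = 60 s).
print(retention_days(60, 1, 144000))  # 100.0 days of 1 minute data
print(retention_days(60, 5, 525600))  # 1825.0 days (5 years) of 5 minute data

# Rows read per data source for a one year graph at each stored resolution.
for resolution in (300, 1800, 7200, 86400):
    print(f"{resolution:6d} s rows: {rows_for_span(365, resolution)}")
```

A year at 5 minute resolution is 105120 rows per data source against 365 at 1 day resolution, which is where the runtime and memory go on graphs with a lot of data sources.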
--
Simon Hobson
Visit http://www.magpiesnestpublishing.co.uk/ for books by acclaimed
author Gladys Hobson. Novels - poetry - short stories - ideal as
Christmas stocking fillers. Some available as e-books.