[rrd-users] RRDTool Aggregation Inaccuracies.
s.shipway at auckland.ac.nz
Thu Jul 9 07:41:10 CEST 2009
I suspect this may be the visible result of the statistical fact that although
Avg(a) + Avg(b) == Avg(a+b)
you should note that
Max(a) + Max(b) != Max(a+b)
where a and b are series sampled at the same time steps. The maxima of a and b need not occur at the same instant, so Max(a) + Max(b) can only be greater than or equal to Max(a+b). I get this problem when graphing CPU usage split into user/system/wait and then trying to take a maximum by summing the maxima of the separate data sources, whereupon I find a value greater than 100% as soon as I start to use an RRA with more than 1 DP per CDP.
Maybe what I just said was all Greek to you :)
As RRDTool rolls up the data to form the weekly etc. RRAs, it summarises the data points. From that point on, you can no longer simply sum the maxima, because of the second equation above. When you look at the Daily graph, it works, since each CDP (consolidated data point in the RRA) is formed from just a single DP (data point), so Max(a) == Avg(a), which means the sum works.
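To make the roll-up effect concrete, here is a small sketch in plain Python. It does not call RRDTool; the sample values and the consolidate() helper are made up for illustration, and it only mimics how MAX and AVERAGE consolidation would combine several DPs into one CDP.

```python
# Hypothetical CPU samples in percent; every instant sums to 90.
user   = [70, 10, 20, 60]
system = [10, 60, 20, 10]
wait   = [10, 20, 50, 20]


def mean(values):
    return sum(values) / len(values)


def consolidate(series, steps, cf):
    """Group `series` into chunks of `steps` DPs and apply `cf` to each chunk.

    This imitates an RRA with `steps` DPs per CDP (illustrative helper only).
    """
    return [cf(series[i:i + steps]) for i in range(0, len(series), steps)]


def sum_series(*series):
    """Add several series together point by point."""
    return [sum(points) for points in zip(*series)]


total = sum_series(user, system, wait)

# 1 DP per CDP: consolidation changes nothing, so summing the maxima is safe.
sum_of_max_1 = sum_series(*(consolidate(s, 1, max) for s in (user, system, wait)))
max_of_sum_1 = consolidate(total, 1, max)

# 2 DPs per CDP: each source peaks at a different instant within the chunk.
sum_of_max_2 = sum_series(*(consolidate(s, 2, max) for s in (user, system, wait)))
max_of_sum_2 = consolidate(total, 2, max)

# Averages still add up correctly after consolidation.
sum_of_avg_2 = sum_series(*(consolidate(s, 2, mean) for s in (user, system, wait)))
avg_of_sum_2 = consolidate(total, 2, mean)

print("1 DP/CDP  sum of maxima:", sum_of_max_1, " max of sum:", max_of_sum_1)
print("2 DP/CDP  sum of maxima:", sum_of_max_2, " max of sum:", max_of_sum_2)
print("2 DP/CDP  sum of averages:", sum_of_avg_2, " avg of sum:", avg_of_sum_2)
```

With these made-up numbers the real total never exceeds 90%, yet the summed maxima at 2 DPs per CDP come out well above 100%, while the summed averages stay equal to the average of the total.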
In summary, it's inaccurate because you can't do this sort of calculation with consolidated data. Hope this made sense; I'm full of 'flu germs today and not too clear.