[rrd-users] Re: Median calculations ?
ralf-buero at kruedewagen.de
Wed Mar 15 10:25:04 MET 2006
Thanks for your answers! You pointed me in the right direction.
I have tested with OpenOffice Calc (I have no Excel ;-), and it seems
that the 50th percentile is indeed equal to the median.
The OpenOffice help says:
Returns the alpha-percentile of data values in an array. A percentile
returns the scale value for a data series which goes from the
smallest (Alpha=0) to the largest value (alpha=1) of a data series.
For Alpha = 25%, the percentile means the first quartile; Alpha = 50%
is the MEDIAN.
Data represents the array of data.
Alpha represents the percentage of the scale between 0 and 1.
=PERCENTILE(A1:A50; 0.1) represents the value in the data set, which
equals 10% of the total data scale in A1:A50.
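
To make the help text concrete, here is a minimal Python sketch of such a
percentile function. It assumes linear interpolation between neighbouring
sorted values at position alpha*(n-1); that convention is my assumption
about how the spreadsheet function behaves, and the function and variable
names are only illustrative.

```python
import statistics


def percentile(data, alpha):
    """Alpha-percentile (alpha in 0..1) using linear interpolation."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    values = sorted(data)
    pos = alpha * (len(values) - 1)          # fractional index into sorted data
    lower = int(pos)                         # rank just below the position
    upper = min(lower + 1, len(values) - 1)  # rank just above, clamped at the end
    frac = pos - lower
    return values[lower] + frac * (values[upper] - values[lower])


samples = [5, 1, 4, 2, 3, 6]
# With alpha = 0.5 the result agrees with the median of the same data.
print(percentile(samples, 0.5), statistics.median(samples))
```

With this convention alpha = 0 gives the smallest value and alpha = 1 the
largest, as the help text describes.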
Since I am using Cacti as a frontend to RRDTool, I have found out that
Cacti has a built-in percentile function. So, I can calculate the
median with e.g. |50:bytes:0:current:2|. The extra function in Cacti
seems to be necessary because Cacti currently does not support VDEF.
With RRDTool I also checked the PERCENT feature. I verified (and it's
documented) that RRDTool counts all NaN values when calculating
the percentile. That's bad and gives "wrong" results in cases where I
have a significant number of NaN values in my database (which happens
very often). I don't want to consider NaN values; maybe that can be
fixed in RRDTool.
Cacti ignores NaN values for the percentile calculations. That's
better (at least for me).
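
A small Python sketch of the difference, under two assumptions: unknown
(NaN) values sort below every finite value, which is how I read the
RRDTool documentation for PERCENT, and the element is picked at index
floor((n-1)*pct/100). The index rule and all function names are
hypothetical, chosen only to illustrate the effect; this is neither
RRDTool's nor Cacti's actual code.

```python
import math


def pick(sorted_values, pct):
    """Pick the element at index floor((n-1) * pct / 100) (assumed rule)."""
    if not sorted_values:
        return math.nan
    return sorted_values[int((len(sorted_values) - 1) * pct / 100)]


def percentile_counting_nan(data, pct):
    """NaN values stay in the data set and sort below all finite values."""
    unknown = [v for v in data if math.isnan(v)]
    known = sorted(v for v in data if not math.isnan(v))
    return pick(unknown + known, pct)


def percentile_ignoring_nan(data, pct):
    """NaN values are dropped before the percentile is taken."""
    known = sorted(v for v in data if not math.isnan(v))
    return pick(known, pct)


nan = math.nan
traffic = [nan, nan, 10.0, 20.0, 30.0, 40.0, 50.0]
print(percentile_counting_nan(traffic, 50))   # pulled down by the two NaN values
print(percentile_ignoring_nan(traffic, 50))   # median of the known values only
```

The more NaN values there are, the further the first result drifts down,
and once they make up more than half of the data the 50th percentile
itself becomes NaN.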
On Wednesday 15 March 2006 09:41, Alex Prinsier wrote:
> I see I made a mistake in my previous mail, the median should be 3
> of course not 2 (forgot to sort ;)).
> How exactly do you define the nth percentile? When you have an
> odd number of elements and you take the 50th percentile, do you
> leave off (n-1)/2 elements or (n+1)/2 elements? I believe that
> you leave off (n-1)/2 elements in rrdtool's implementation. I'm not
> sure which one is more correct...
> So the relation with the median: if there is an odd number of
> elements you give the array[(n-1)/2] element, which is the median.
> If there is an even number of elements you also give
> array[(n-1)/2] (integer division), but the median would be the
> average of array[(n-1)/2] and array[n/2].
> I'd say that for a median you can use the 50th percentile, it's a
> good approximation :)
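
The index argument in the quoted mail can be checked with a few lines of
Python. This is only a sketch of the array[(n-1)/2] rule as described
there (0-based index, integer division); the function name is made up and
it is not taken from rrdtool's source.

```python
import statistics


def middle_element(values):
    """Return sorted(values)[(n-1)//2], the rule described in the quoted mail."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


odd = [3, 1, 2, 5, 4]    # n = 5: index 2 is exactly the median
even = [3, 1, 2, 4]      # n = 4: index 1 is the lower of the two middle values

print(middle_element(odd), statistics.median(odd))
print(middle_element(even), statistics.median(even))
```

For an odd count both numbers agree; for an even count the rule returns
the lower middle value, while the median is the average of the two middle
values.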
Unsubscribe mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe
Help mailto:rrd-users-request at list.ee.ethz.ch?subject=help