[rrd-users] Re: Median calculations ?

Ralf Kruedewagen ralf-buero at kruedewagen.de
Wed Mar 15 10:25:04 MET 2006


Thanks for your answers! You pointed me in the right direction.

I have tested with OpenOffice Calc (I have no Excel ;-), and it seems 
that the 50th percentile is indeed equal to the median.

The OpenOffice help says:
-------------------
PERCENTILE
Returns the alpha-percentile of data values in an array. A percentile 
returns the scale value for a data series which goes from the 
smallest (Alpha=0) to the largest value (alpha=1) of a data series. 
For Alpha = 25%, the percentile means the first quartile; Alpha = 50% 
is the MEDIAN.

Syntax
PERCENTILE(Data;Alpha)
Data represents the array of data.
Alpha represents the percentage of the scale between 0 and 1.

Example
=PERCENTILE(A1:A50; 0.1) represents the value in the data set, which 
equals 10% of the total data scale in A1:A50.
-------------------
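
A minimal sketch in Python of why Alpha = 0.5 gives the median. It assumes
the percentile is found by linear interpolation between neighbouring
sorted values; that interpolation rule and the function name percentile()
are assumptions for illustration, not taken from the OpenOffice help.

```python
import statistics


def percentile(data, alpha):
    """Alpha-percentile (alpha in 0..1), interpolating linearly
    between neighbouring values of the sorted data (assumed rule)."""
    if not data:
        raise ValueError("data must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    values = sorted(data)
    pos = alpha * (len(values) - 1)          # fractional index into sorted list
    lower = int(pos)                         # neighbour at or below pos
    upper = min(lower + 1, len(values) - 1)  # neighbour above, clamped at the end
    frac = pos - lower
    return values[lower] + frac * (values[upper] - values[lower])


samples = [5, 1, 4, 2, 3, 6]
print(percentile(samples, 0.5), statistics.median(samples))
```

Under this rule Alpha = 0 returns the smallest value and Alpha = 1 the
largest, which matches the description in the help text.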

Since I am using Cacti as a frontend to RRDTool, I have found out that 
Cacti has a built-in percentile function. So I can calculate the 
median with e.g. |50:bytes:0:current:2|. The extra function in Cacti 
seems to be necessary because Cacti currently does not support VDEF.

With RRDTool I also checked the PERCENT feature. I verified (and it's 
documented) that RRDTool includes all NaN values when calculating 
the percentile. That's bad and gives "wrong" results in cases where I 
have a significant number of NaN values in my database (which happens 
very often). I don't want to consider NaN values; maybe that can be 
fixed in RRDTool.

Cacti ignores NaN values for the percentile calculations. That's 
better (at least for me).
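
The difference between the two behaviours can be sketched in Python. This
is only a simplified model, not the actual RRDTool or Cacti code: it
assumes NaN values sort below every finite number, and it assumes the
result is the element at index (n-1)*percent/100 (rounded down) of the
sorted data. All function names are hypothetical.

```python
import math


def pick(ordered, percent):
    """Element at index floor((n-1) * percent / 100) of an already sorted
    list (assumed selection rule)."""
    return ordered[int((len(ordered) - 1) * percent / 100)]


def percentile_counting_nan(samples, percent):
    """NaN values stay in the data set and sort below any finite number."""
    # The key puts every NaN first, then the finite values in ascending order.
    ordered = sorted(samples, key=lambda v: (0, 0.0) if math.isnan(v) else (1, v))
    return pick(ordered, percent)


def percentile_ignoring_nan(samples, percent):
    """NaN values are dropped before the percentile is taken."""
    finite = sorted(v for v in samples if not math.isnan(v))
    return pick(finite, percent) if finite else math.nan


nan = math.nan
samples = [nan, 30.0, nan, 10.0, 40.0, nan, 20.0]
print(percentile_counting_nan(samples, 50))  # pulled down by the NaN values
print(percentile_ignoring_nan(samples, 50))  # based on the 4 real samples only
```

With three NaN values out of seven samples, the first variant returns the
smallest real sample, while the second returns a value near the middle of
the real samples.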

Thanks
Ralf


On Wednesday 15 March 2006 09:41, Alex Prinsier wrote:
> I see I made a mistake in my previous mail, the median should be 3
> of course not 2 (forgot to sort ;)).
>
> How exactly do you define the nth percentile? When you have an
> odd number of elements and you take the 50th percentile, do you
> leave off (n-1)/2 elements or (n+1)/2 elements? I believe that
> you leave off (n-1)/2 elements in rrdtool's implementation. I'm not
> sure which one is more correct...
>
> So the relation with the median: if there is an odd number of
> elements you give the array[(n-1)/2] element, which is the median.
>
> If there is an even number of elements you also give
> array[(n-1)/2] (with integer division), but the median would be
> the average of array[(n-1)/2] and array[(n-1)/2 + 1].
>
> I'd say that for a median you can use the 50th percentile, it's a
> good approximation :)
>
> Alex

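The index rule Alex describes in the quoted mail can be checked with a
short Python sketch. It assumes zero-based indexing and integer division
for (n-1)/2; the function name lower_middle() is hypothetical.

```python
import statistics


def lower_middle(values):
    """Element at index (n-1)//2 of the sorted data: the median for an odd
    count, the lower of the two middle elements for an even count."""
    ordered = sorted(values)
    return ordered[(len(ordered) - 1) // 2]


odd = [4, 1, 3, 2, 5]
even = [4, 1, 3, 2]
print(lower_middle(odd), statistics.median(odd))    # identical for odd n
print(lower_middle(even), statistics.median(even))  # differ for even n
```

For an even count the result is off by half the gap between the two
middle elements, so it is an approximation of the median, not the exact
value.
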
--
Unsubscribe mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe
Help        mailto:rrd-users-request at list.ee.ethz.ch?subject=help
Archive     http://lists.ee.ethz.ch/rrd-users
WebAdmin    http://lists.ee.ethz.ch/lsg2.cgi


