[rrd-users] Re: Median calculations ?
Ralf Kruedewagen
ralf-buero at kruedewagen.de
Wed Mar 15 10:25:04 MET 2006
Thanks for your answers! You pointed me in the right direction.
I have tested with OpenOffice Calc (I have no Excel ;-), and it seems
that the 50th percentile is indeed equal to the median.
The OpenOffice help says:
-------------------
PERCENTILE
Returns the alpha-percentile of data values in an array. A percentile
returns the scale value for a data series which goes from the
smallest (Alpha=0) to the largest value (alpha=1) of a data series.
For Alpha = 25%, the percentile means the first quartile; Alpha = 50%
is the MEDIAN.
Syntax
PERCENTILE(Data;Alpha)
Data represents the array of data.
Alpha represents the percentage of the scale between 0 and 1.
Example
=PERCENTILE(A1:A50; 0.1) represents the value in the data set, which
equals 10% of the total data scale in A1:A50.
-------------------
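To double-check this outside a spreadsheet, here is a small Python
sketch. It assumes that PERCENTILE interpolates linearly between the
sorted values at position alpha*(n-1), which is how I understand the
spreadsheet function to work. The function name and the sample data
are made up for illustration.

```python
import math
import statistics


def percentile(data, alpha):
    """Alpha-percentile (0 <= alpha <= 1) with linear interpolation.

    Assumption: rank = alpha * (n - 1) on the sorted data, which is
    my understanding of the spreadsheet PERCENTILE function.
    """
    if not data:
        raise ValueError("data must not be empty")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    values = sorted(data)
    rank = alpha * (len(values) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    # Interpolate between the two neighbouring sorted values.
    return values[lower] + fraction * (values[upper] - values[lower])


odd = [5, 1, 3, 2, 4]        # odd number of elements
even = [5, 1, 3, 2, 4, 6]    # even number of elements

# Compare the 50th percentile with the median for both cases.
print(percentile(odd, 0.5), statistics.median(odd))
print(percentile(even, 0.5), statistics.median(even))
```

Under that interpolation assumption both lines print matching values,
for an odd and for an even number of elements.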
Since I am using Cacti as a frontend to RRDTool, I found out that
Cacti has a built-in percentile function. So I can calculate the
median with e.g. |50:bytes:0:current:2|. The extra function in Cacti
seems to be necessary because Cacti currently does not support VDEF.
With RRDTool I also checked the PERCENT feature. I verified (and it's
documented) that RRDTool includes all NaN values when calculating
the percentile. That's bad and gives "wrong" results in cases where I
have a significant number of NaN values in my database (which happens
very often). I don't want NaN values to be considered; maybe that can
be fixed in RRDTool.
Cacti ignores NaN values for the percentile calculations. That's
better (even for me).
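
The difference between the two behaviours can be sketched in Python.
This is only an illustration, not the actual RRDTool or Cacti code. It
assumes that NaN values are sorted below every finite value (my
reading of the RRDTool documentation) and that the percentile is the
element at index (n-1)*p/100 of the sorted list, rounded down, as Alex
describes it in the quoted mail below. The function names and the
sample data are invented.

```python
import math


def percentile_with_nan(samples, percent):
    """Pick the percentile element while keeping NaN in the list.

    Assumption: NaN sorts below every finite value.
    """
    if not samples:
        return math.nan
    # The key (is_known, value) places all NaN entries first.
    ordered = sorted(
        samples,
        key=lambda v: (not math.isnan(v), 0.0 if math.isnan(v) else v),
    )
    index = int((len(ordered) - 1) * percent / 100)
    return ordered[index]


def percentile_ignoring_nan(samples, percent):
    """Pick the percentile element after dropping all NaN values."""
    known = sorted(v for v in samples if not math.isnan(v))
    if not known:
        return math.nan
    index = int((len(known) - 1) * percent / 100)
    return known[index]


nan = math.nan
# Hypothetical data: 5 of 9 samples are unknown.
samples = [10.0, nan, 30.0, nan, 20.0, nan, 40.0, nan, nan]

print(percentile_with_nan(samples, 50))      # NaN counted as lowest
print(percentile_ignoring_nan(samples, 50))  # NaN dropped first
```

With more than half of the samples unknown, the first variant lands
on a NaN entry, while the second one still returns a value from the
known data.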
Thanks
Ralf
On Wednesday 15 March 2006 09:41, Alex Prinsier wrote:
> I see I made a mistake in my previous mail, the median should be 3
> of course not 2 (forgot to sort ;)).
>
> How exactly do you define the nth percentile? When you have an
> odd number of elements and you take the 50th percentile, do you
> leave off (n-1)/2 elements or (n+1)/2 elements? I believe that
> you leave off (n-1)/2 elements in rrdtool's implementation. I'm not
> sure which one is more correct...
>
> So the relation with the median: if there is an odd number of
> elements you give the array[(n-1)/2] element, which is the median.
>
> If there is an even number of elements you also give
> array[(n-1)/2] (integer division), but the median would be the
> average between array[(n-1)/2] and array[(n-1)/2 + 1].
>
> I'd say that for a median you can use the 50th percentile, it's a
> good approximation :)
>
> Alex
>
> --
> Unsubscribe
> mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe Help
> mailto:rrd-users-request at list.ee.ethz.ch?subject=help Archive
> http://lists.ee.ethz.ch/rrd-users
> WebAdmin http://lists.ee.ethz.ch/lsg2.cgi