# [rrd-users] Re: Question on 95th percentile

Clifton Royston cliftonr at lava.net
Mon Feb 26 20:24:00 MET 2001

On Sun, Feb 25, 2001 at 10:54:16PM +0100, Alex van den Bogaerdt wrote:
>
> Hi folks,
>
> Suppose I can create an array with 1234 elements, sort the values
> in this array, I should be able to fetch the 95th percentile from
> the array by selecting the correct element.
>
> However, I need to know two things:
>
> - 95% * 1234 == 1172.30
>   Do I take element 1171 or 1172 in this case? (1172-1, or 1173-1,
>   as the array is zero based)

I believe there are statistical formulae for this.  As I recall, for
maximum accuracy you interpolate between the two samples, weighted by
the fractional part of the rank, so with a zero-based array you would
take v(1171) + 0.30*(v(1172)-v(1171)).  (This is similar to calculating
the median of an even number of samples, where you take the value
halfway between the middle two.)
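A minimal Python sketch of that interpolation, assuming the indexing used above (rank = fraction * N, counted from 1, mapped onto a zero-based sorted array). The function name and the clamping at the ends are my own assumptions for illustration; other percentile conventions exist and give slightly different results.

```python
import math

def percentile_interpolated(samples, fraction=0.95):
    """Interpolated percentile using rank = fraction * N (hypothetical helper)."""
    if not samples:
        raise ValueError("need at least one sample")
    v = sorted(samples)
    rank = fraction * len(v)          # 1-based fractional rank, e.g. 1172.30
    whole = math.floor(rank)          # integer part of the rank, e.g. 1172
    weight = rank - whole             # fractional part, e.g. 0.30
    lower = whole - 1                 # zero-based index of the lower sample
    if lower < 0:                     # rank falls before the first sample
        return v[0]
    upper = min(lower + 1, len(v) - 1)  # clamp at the top of the array
    return v[lower] + weight * (v[upper] - v[lower])
```

With 1234 samples and fraction 0.95 this reads elements 1171 and 1172 of the sorted array and moves 0.30 of the way from the first to the second.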

> - If there are unknown entries in the database, it seems to me that
>   these unknowns should be processed in favor of the one paying the
>   bill (correct?)  If so, should I keep them in the array?

This question lies on some hazy boundary between statistics and ethics,
and I am not sure I know the best answer.  The two alternatives I see
are: keep the unknown samples in and treat them as 0 (so they sort to
the bottom of the array), or count only the samples for which you have
data stored and take the 95%ile of those, ignoring the ones for which
there is no data.
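The two alternatives can be compared with a short Python sketch. Everything here is an assumption for illustration: unknowns are represented as None or NaN, the function and parameter names are made up, and a simple nearest-rank pick is used instead of interpolation to keep the example short.

```python
import math

def percentile_with_unknowns(samples, fraction=0.95, unknown_as_zero=False):
    """Nearest-rank percentile under either policy for unknown samples."""
    # Keep only the samples that actually carry data.
    known = [s for s in samples if s is not None and not math.isnan(s)]
    if unknown_as_zero:
        # Alternative 1: unknowns stay in and count as 0 (bottom of the array).
        values = known + [0.0] * (len(samples) - len(known))
    else:
        # Alternative 2: unknowns are dropped before taking the percentile.
        values = known
    if not values:
        raise ValueError("no samples to take a percentile of")
    values.sort()
    # Nearest-rank pick: 1-based rank ceil(fraction * N), zero-based index.
    index = max(math.ceil(fraction * len(values)) - 1, 0)
    return values[index]
```

For non-negative traffic data, treating unknowns as 0 can only lower the result or leave it unchanged compared with dropping them, so it is the choice that favors the one paying the bill.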
-- Clifton

--
Clifton Royston  --  LavaNet Systems Architect --  cliftonr at lava.net
WWJD?   "JWRTFM!" - Scott Dorsey (kludge)   "JWG" - Eddie Aikau
