<br><font size=2 face="sans-serif">See this link for a closer look at the problem:</font>

<br>

<br><a href="https://lists.oetiker.ch/pipermail/rrd-users/2008-January/013582.html"><font size=2 face="sans-serif">https://lists.oetiker.ch/pipermail/rrd-users/2008-January/013582.html</font></a>

<br><font size=2 face="sans-serif"><br>

Joe Loiacono<br>

</font>

<br>

<br>

<br>

<table width=100%>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">From:</font>

<td><font size=1 face="sans-serif">&quot;Alex van den Bogaerdt&quot; &lt;alex@vandenbogaerdt.nl&gt;</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">To:</font>

<td><font size=1 face="sans-serif">&lt;rrd-users@lists.oetiker.ch&gt;</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">Date:</font>

<td><font size=1 face="sans-serif">05/12/2010 03:09 AM</font>

<tr valign=top>

<td><font size=1 color=#5f5f5f face="sans-serif">Subject:</font>

<td><font size=1 face="sans-serif">Re: [rrd-users] RRD PERCENT question

(95 percentile)</font></table>

<br>

<hr noshade>

<br>

<br>

<br><tt><font size=2>&gt; Hello List,<br>

&gt;<br>

&gt; I'm a happy RRD user, but there's something I need help with.<br>

&gt; We're currently moving to RRDtool for accounting &amp; billing.<br>

&gt; For this I've created RRD files that keep 105120 5-minute samples (1 year).<br>

&gt;<br>

&gt; Now I'm comparing the 95% numbers generated by RRDtool with the 95%<br>

&gt; number generated by our old script. The problem is that these numbers<br>

&gt; are significantly different.<br>

<br>

Note that &quot;the&quot; 95th percentile does not exist. There are many

different <br>

methods of computing this value, and although they are similar and will

more <br>

or less provide the same result, they do differ.<br>

<br>

I don't recall exactly how I started, but I think originally I used <br>

data[n*steps/100]. Then, after some discussion on the mailing list, round()

<br>

was introduced.<br>

<br>

&gt; I hope someone can help me understand why these numbers are so different.<br>

<br>

Because if the array index changes by only one, the returned value may

be <br>

quite different.<br>

<br>

&gt; This is how I determine the 95 percentile number using RRDtool<br>

&gt;<br>

[snipped some]<br>

&gt; &nbsp; &nbsp; &nbsp; &nbsp; VDEF:95thin=inbits,95,PERCENT \<br>

&gt; &nbsp; &nbsp; &nbsp; &nbsp; VDEF:95thout=outbits,95,PERCENT \<br>

<br>

That looks fine.<br>

<br>

&gt; My manual test was done like this:<br>

&gt; 1) fetch rawdata:<br>

&gt; /usr/local/rrdtool-1.2.19/bin/rrdtool fetch \<br>

&gt; --start '1271894400' --end '1272240000' \<br>

&gt; &quot;deviceid11_XXX_Transit.rrd&quot; AVERAGE &nbsp;&gt; OUT_RAW;<br>

&gt;<br>

&gt; 2) read this data with a Perl script, then sort the values and show the 95% number.<br>

&gt;<br>

&gt; In this case the data set contains 1153 samples (no NaN in sample).<br>

&gt; so after sorting the 95% percentile should be the value (times 8 for<br>

&gt; bits) on position 1096.<br>

<br>

&gt; The problem is that this number is quite different (lower) from what

is<br>

&gt; returned using PERCENT above.<br>

<br>

Please check whether the number at position 1095 or 1097 equals what <br>

rrdtool finds.<br>

<br>

&gt; Note that this sample does not contain any NaN values.<br>

&gt; I also tried this with the latest version of RRDtool, same result.<br>

&gt;<br>

&gt; Can anyone explain why this is different? Is this expected?<br>

&gt; How exactly does RRD do this internally?<br>

<br>

Create an array, fill it with the data, use qsort, and then find the correct spot:<br>

<br>

qsort(array, step, sizeof(double), vdef_percent_compar);<br>

field = round((dst-&gt;vf.param * (double)(steps - 1)) / 100.0);<br>

dst-&gt;vf.val = array[field];<br>

<br>

Here, vdef_percent_compar is a comparison function that orders the values as <br>

NAN &lt; -INF &lt; numbers &lt; +INF<br>

<br>

Your calculation: 1153 samples, 95% = 1095.35, so you take 1096.<br>

RRDtool: round(95*1152/100) = 1094, based on an array whose first index is 0, <br>

so index 1094 is the 1095th position.<br>

If I recall correctly, the original version did use truncation instead

of <br>

rounding, which makes no difference in this case.<br>

<br>

Anyway, unless I made a mistake here, rrdtool takes data[1094] (the 1095th position) and you take <br>

position 1096. Your value should therefore be equal to or higher than what RRDtool reports.<br>

<br>

&gt; I would like to use RRDtool for this, but need to be sure that the<br>

&gt; numbers are correct, i.e understand why the numbers are different

than<br>

&gt; when calculated manually.<br>

<br>

I would also want to find out why the result is the opposite of what I reasoned above.<br>

<br>

</font></tt>

<br>

<br>