[rrd-users] Re: percentile calculations
Dan Cech
dcech at phpwerx.net
Tue Sep 5 21:05:24 MEST 2006
Alex,
Thanks for the feedback, I've clarified each point below.
Alex van den Bogaerdt wrote:
> On Tue, Sep 05, 2006 at 02:01:03PM -0400, Dan Cech wrote:
>
>> However, when I'm displaying the graph for the current month, the
>> PERCENT function is using all the unknown future values in the
>> calculation
>
> sure
>
>> causing it to be incorrect.
>
> Why?
In this case I don't mean incorrect from a mathematical standpoint, but
in terms of the purpose of the report.
> You seem to know about your "unknown" data. That means it isn't
> as unknown as the name suggests...
Yes, I can easily disregard it for reporting purposes, that's not the
problem.
My problem is that my employer requires a graph showing the data for the
current calendar month, with the 95th percentile line overlaid.
I'm trying to find a way to produce this without a separate call to
calculate the 95th percentile and then specifying it manually for the
graph, though it appears that this may be my only workable solution.
>> As a very simplified example, say I'm 10 days into a month (with 20 days
>> remaining) and the values so far look like this:
>>
>> 1,2,3,4,5,6,7,8,9,10
>>
>> The 90th percentile should be 9
>
> according to what/who ?
The 90th percentile of the values above is 9.
I understand that the PERCENT function will include the 20 UNKN values,
which sort below every known value, and so produce the answer 7. That is
also mathematically correct, but 'incorrect' from the point of view of
this particular application.
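To make the arithmetic concrete, here is a minimal Python sketch of the two results. It is not RRDtool's code: the function name `percent_with_unknowns` is hypothetical, and the index rule (truncate `(n - 1) * pct / 100`) is an assumption chosen because it reproduces the figures quoted in this message. The only behaviour taken from the documentation is that unknown values sort below every finite number.

```python
import math

def percent_with_unknowns(values, pct):
    """Pick a percentile from values, treating NaN as lower than any number.

    Assumed index rule: truncate (n - 1) * pct / 100 into the sorted list.
    """
    # Sort NaN first, then finite values in ascending order.
    ordered = sorted(values, key=lambda v: (0, 0.0) if math.isnan(v) else (1, v))
    index = int((len(ordered) - 1) * pct / 100)
    return ordered[index]

known = [float(v) for v in range(1, 11)]   # 10 days of data: 1..10
month = known + [math.nan] * 20            # plus 20 unknown future days

with_future = percent_with_unknowns(month, 90)   # unknowns included
known_only = percent_with_unknowns(known, 90)    # unknowns excluded
print(with_future, known_only)
```

With the 20 unknowns included, the selected index falls at the seventh known value; with only the 10 known values it falls at the ninth.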
>> I have looked through the documentation and can't find any mechanism
>> which would allow me to restrict the PERCENT function to a specific date
>> range (to exclude values in the future), or exclude NaN values.
>
> Why graph values in the future, you know this won't include useful data.
See above, my employer requires that the graph show the calendar month.
> try changing unknown into some known value, like zero or a very large
> negative number
This will have the same effect as UNKN, if I'm reading the relevant
documentation correctly:
    Unknown values are considered lower than any finite number for this
    purpose so if this operator returns an unknown you have quite a lot of
    them in your data. Infinite numbers are lesser, or more, than the finite
    numbers and are always more than the Unknown numbers. (NaN < -INF <
    finite values < INF)
It seems that the easiest method may be to pre-calculate the 95th
percentile on just the known data and go from there, though I would like
to avoid the added overhead of opening each RRD twice (these graphs can
span up to 20 rrds) if possible.
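A rough Python sketch of that two-pass approach follows. It assumes the samples have already been fetched into a list (the fetch step is not shown), and the helper names, the file name `example.rrd`, the data source name `traffic`, the colours, and the timestamps are all hypothetical. It builds only the option list for `rrdtool graph`, using the `DEF`, `LINE1`, and `HRULE` elements, and uses the same assumed index rule as the figures quoted earlier in this message.

```python
import math

def known_percentile(values, pct):
    """Percentile over known samples only; NaN if nothing is known yet."""
    known = sorted(v for v in values if not math.isnan(v))
    if not known:
        return math.nan
    # Assumed index rule: truncate (n - 1) * pct / 100.
    return known[int((len(known) - 1) * pct / 100)]

def graph_args(rrd_path, ds_name, start, end, level, pct):
    """Build rrdtool graph options with the precomputed level as an HRULE."""
    return [
        "--start", str(start),
        "--end", str(end),
        f"DEF:traffic={rrd_path}:{ds_name}:AVERAGE",
        "LINE1:traffic#0000FF:traffic",
        f"HRULE:{level:g}#FF0000:{pct}th percentile",
    ]

# Illustrative data: 10 known days followed by 20 unknown future days.
samples = [float(v) for v in range(1, 11)] + [math.nan] * 20
level = known_percentile(samples, 90)

# Illustrative window: 1 Sep 2006 to 1 Oct 2006 (UTC epoch seconds).
args = graph_args("example.rrd", "traffic", 1157068800, 1159660800, level, 90)
print("\n".join(args))
```

The cost is the one noted above: every RRD is read once to compute the level and again to draw the graph.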
Regards,
Dan