[mrtg-developers] Calculating 95th percentile

Westlake, Simon simon.westlake at twcable.com
Thu Oct 11 17:59:48 CEST 2007


So, I've been playing with CDEFs and VDEFs after reading this email,
trying to find a way to do exactly what I need.

As a short introduction, the code you provided works great for drawing a
line at 95% of the max value. If I can find a way to do what I need this
simply, it will make my life a lot easier.

However, what I need to calculate is the 95th percentile in billing terms
(see http://en.wikipedia.org/wiki/Burstable_billing). The problem with
the VDEF 95 percent method is that if a user bursts to 100 megs at any
time, it will calculate the 95th percentile as 95 megs. The universally
accepted method (from a billing perspective) of calculating the 95th
percentile is to take all your samples (typically one every 5 minutes),
sort them from smallest to largest, discard the top 5% and then bill at
the largest remaining value. By this method, if a user had burst to 100
megs for 4.9% of the month and had maxed out at 12 megs for the rest of
the month, the 95th percentile would be calculated as 12 megs. Quite a
big difference.
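
The sort-and-discard method described above can be sketched in a few
lines of Python. This is a minimal illustration, not a billing
implementation: the function name billing_percentile and the sample
counts are assumptions made up for the example (8640 five-minute samples
for a 30-day month, 423 of them, about 4.9%, at 100 megs), and providers
may round the 5% cut differently.

```python
def billing_percentile(samples, discard_pct=5):
    """Sort the samples, drop the top discard_pct percent, return the largest left."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Integer arithmetic avoids float rounding; the cut is rounded down,
    # so fewer than 20 samples discard nothing at the default 5%.
    discard = len(ordered) * discard_pct // 100
    return ordered[len(ordered) - discard - 1]


# Hypothetical month: 8640 five-minute samples, 423 of them at 100 megs.
month = [100] * 423 + [12] * (8640 - 423)

print(billing_percentile(month))  # sort-and-discard result
print(0.95 * max(month))          # "95% of the max", for comparison
```

With these made-up numbers the function returns 12, because all 423
burst samples fall inside the 432 samples that are discarded.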

This is why I began by trying to pull a 5 minute sample for every 5
minutes in the billing period, sort the samples in an array and discard
the top 5%.
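
Building that array might look roughly like the sketch below. It assumes
the data arrives as rows of (in, out) rates in bytes per second, with
unknown values represented as None; the function name rows_to_samples
and the example rows are hypothetical. Taking the larger of the two
directions mirrors the MAX in the CDEF quoted further down, and skipping
unknown rows is a design choice of this sketch rather than a rule.

```python
def rows_to_samples(rows):
    """Turn (in, out) byte-rate rows into one bits-per-second sample per interval."""
    samples = []
    for inbound, outbound in rows:
        # Skip intervals where either direction is unknown (assumed to be None).
        if inbound is None or outbound is None:
            continue
        # Take the busier direction and convert bytes/s to bits/s.
        samples.append(max(inbound, outbound) * 8)
    return samples


# Made-up rows, for illustration only.
rows = [(1000.0, 250.0), (None, None), (400.0, 900.0)]
print(sorted(rows_to_samples(rows)))
```

The sorted list is then ready for the discard-the-top-5% step.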

Before I go too far in figuring out the problem with my fetch statement,
is there some reasonable way of doing this simply with RRDTool that I
should investigate, or for a calculation of this complexity, will I need
to do the calculation via some other means? Looking at the RRDTool
documentation, I don't really see an easy way to do this.
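
On the fetch problem itself, one possible explanation (an assumption,
not something confirmed in this thread) is that older data in an RRD
file is often kept only in consolidated form, so a 5 minute window far
enough in the past falls inside a single coarser row. The sketch below
does not call RRDTool. It uses a simplified model with a hypothetical
2 hour (7200 second) consolidation step and counts how many consecutive
5 minute windows land in the same row.

```python
from collections import Counter


def bucket_for(timestamp, step):
    """Return the start of the step-sized bucket containing timestamp (simplified model)."""
    return timestamp - (timestamp % step)


STEP = 7200    # hypothetical consolidated step: 2 hours
WINDOW = 300   # 5 minute query window

# Start times of 48 consecutive 5 minute windows (4 hours),
# beginning at an arbitrary timestamp aligned to STEP.
origin = 1_192_000_000 - (1_192_000_000 % STEP)
starts = [origin + i * WINDOW for i in range(48)]

per_bucket = Counter(bucket_for(t, STEP) for t in starts)
print(sorted(per_bucket.values()))
```

If that assumption holds, many consecutive queries would return the same
value simply because they read the same consolidated row.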


--
Simon Westlake
Time Warner Cable Business Class
Network Engineer
Ph: 414.908.4791 | Cell: 414.688.7956
-----Original Message-----
From: Jo Rhett [mailto:jrhett at svcolo.com] 
Sent: Wednesday, October 10, 2007 2:17 PM
To: Westlake, Simon
Cc: mrtg-developers at lists.oetiker.ch
Subject: Re: [mrtg-developers] Calculating 95th percentile

Which is why I recommend you stop playing with your script and just  
let it do the calculation for you.

If you insist on fixing your script for no perceptible gain, then  
start by doing command-line fetch commands with the same input and  
you'll begin to understand where you've gone wrong.

On Oct 10, 2007, at 10:30 AM, Westlake, Simon wrote:
> I do appreciate the input, but the error is not the 95th percentile  
> function. The issue I am having, before the calculation, is the  
> duplicate data return. I am just trying to understand this issue -  
> my belief was that if I pulled info from a.rrd with --start -20m  
> and --end -15m it would be different than -10m and -5m and it is not.
>
> --
> Simon Westlake
> Network Engineer
> Desk: (414) 908-4791
> Cell: (414) 688-7956
>
>  -----Original Message-----
> From: 	Jo Rhett [mailto:jrhett at svcolo.com]
> Sent:	Wednesday, October 10, 2007 01:25 PM Eastern Standard Time
> To:	Westlake, Simon
> Cc:	mrtg-developers at lists.oetiker.ch
> Subject:	Re: [mrtg-developers] Calculating 95th percentile
>
> You have a problem in your script.  The 95th percentile function
> works perfectly fine.  I don't have time to debug your script but I
> did have time to give you a better method ;-)
>
> On Oct 10, 2007, at 9:54 AM, Westlake, Simon wrote:
>
>> Would that actually perform the correct function though? Wouldn't
>> it also return the same problematic data?
>>
>> --
>> Simon Westlake
>> Network Engineer
>> Desk: (414) 908-4791
>> Cell: (414) 688-7956
>>
>>  -----Original Message-----
>> From: 	Jo Rhett [mailto:jrhett at svcolo.com]
>> Sent:	Wednesday, October 10, 2007 12:35 PM Eastern Standard Time
>> To:	Westlake, Simon
>> Cc:	mrtg-developers at lists.oetiker.ch
>> Subject:	Re: [mrtg-developers] Calculating 95th percentile
>>
>> Much simpler to just do this:
>>
>> CDEF:pctbits=ds1bits,ds0bits,MAX
>> VDEF:pct=pctbits,95,PERCENT
>>
>> On Oct 10, 2007, at 9:06 AM, Westlake, Simon wrote:
>>> Hello all,
>>>
>>> I'm attempting to write a script to calculate 95th percentile usage
>>> using RRD files generated using MRTG. I'm using PHP and PHP4RRDTool
>>> in order to pull the information and calculate it and I'm running
>>> into an issue which has maybe been hidden from me in the past as I
>>> use MRTG as a collector for my RRD files.
>>>
>>> A quick rundown of my method:
>>>
>>> I have a loop that goes through and pulls 5 minute intervals of data
>>> from the RRD file (by doing --start -10m --end -5m for example, and
>>> moving backwards in steps of 5) for the last calendar month. It adds
>>> this data to an array, sorts it from smallest to largest, strips the
>>> top 5% and calculates the average.
>>>
>>> The script itself seems to work fine, but the data I get returned
>>> seems weird.
>>>
>>> After reading the RRD documentation, it appears that I should be able
>>> to pull the information I want by using the following command:
>>>
>>> $opts = array ( "AVERAGE", "--end", "-$timeframe", "--start", "-$start");
>>>
>>> Where the $opts array is an array of the options to send. This works
>>> and gives me a value, however, the values seem to repeat identically.
>>> Here is a (somewhat brief) example of the data returned over a series
>>> of 5 minute queries:
>>>
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 33748465.7228
>>> 34784619.7841
>>> <34784619.7841 repeats many times>
>>> 34784619.7841
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>> 34993760.8949
>>>
>>> This seems wrong to me - even as an average, I would not be seeing
>>> identical values over separate 5 minute intervals.
>>>
>>> I have tried various changes to see if I receive different
>>> information, and I do not.
>>>
>>> Am I missing something blindly obvious here? Shouldn't different 5
>>> minute intervals return different information?
>>>
>>> --
>>> Simon Westlake
>>> Time Warner Cable Business Class
>>> Network Engineer
>>> Ph: 414.908.4791 | Cell: 414.688.7956
>>> This E-mail and any of its attachments may contain Time Warner
>>> Cable proprietary information, which is privileged, confidential,
>>> or subject to copyright belonging to Time Warner Cable. This E-mail
>>> is intended solely for the use of the individual or entity to which
>>> it is addressed. If you are not the intended recipient of this
>>> E-mail, you are hereby notified that any dissemination,
>>> distribution, copying, or action taken in relation to the contents
>>> of and attachments to this E-mail is strictly prohibited and may be
>>> unlawful. If you have received this E-mail in error, please notify
>>> the sender immediately and permanently delete the original and any
>>> copy of this E-mail and any printout.
>>>
>>> _______________________________________________
>>> mrtg-developers mailing list
>>> mrtg-developers at lists.oetiker.ch
>>> https://lists.oetiker.ch/cgi-bin/listinfo/mrtg-developers
>>
>> -- 
>> Jo Rhett
>> senior geek
>>
>> Silicon Valley Colocation
>> Support Phone: 408-400-0550
>
> -- 
> Jo Rhett
> senior geek
>
> Silicon Valley Colocation
> Support Phone: 408-400-0550

-- 
Jo Rhett
senior geek

Silicon Valley Colocation
Support Phone: 408-400-0550




