[rrd-developers] Introduction and patch for partial 95th percentile support

Fri Oct 27 03:19:13 MEST 2000

Hello Developers,

First, thanks.  I've been using MRTG and now RRDTool (and Cricket, BTW, if
they are on here...) for quite a while and love them. This software made the
quick rollout of a customer site a reality and keeps my NOC people quite
happy.

My name is Mark Mills.  I'm primarily a Perl coder at Xodiax.com.  I am
Director of Systems Development here at Xodiax.  Xodiax is a datacenter
company specializing in smaller
mid-western cities. We are strong proponents of Free Software and run more
than half our business on Linux.  Part of the deal the programmers requested
when we founded this company was to be allowed to release software back out.
Veins popped out of foreheads when we asked at the last place. =)

My friend Michael Mattingly helped a great deal on this code. He basically
taught me C, then went on vacation and forced me to finish this monster
patch alone.  He deserves a great deal of the credit here.

To give you an idea how new I am to C, it took me more than a day to figure
out why the following caused coredumps. =)
  double *data = NULL; /* ouch ouch ouch */

Attached to this message is patch1.txt (in Unix text format).  This patch
modifies the version number to 1.0.28_p1, tweaks the usual
TODO/CHANGES/CONTRIB, adds a few lines to rrd_format.[ch], and adds a
monster chunk of code to rrd_graph.c.

The code added to rrd_format.[ch] supports two new Consolidation Functions,
PERNF and PERZF, which transform to CF_95PER and CF_05PER. Using them as the
CF for writing an RRD isn't supported.  In the consolidation/dataread
portion, they are currently treated as AVERAGE.  In the G?PRINT section,
they are hooked up to call a sort procedure and return the 95% or 5% number.
I'd appreciate it if people could look this over and let me know how it
looks so far.

The sort procedures are pretty fancy.  The first version of this stuffed all
the data in the stream into a large array and qsorted it.  It was memory
intensive and slow. After some discussion we realized that we only needed to
hang on to the top 5% of the data.  The current implementation is an
insertion sort that never works with more than about 5% of the data.  It
uses 1/20 of the memory of our first version and is 13 times faster on
datasets of about 9000 numbers. =)
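
For anyone who wants the idea without opening the patch, here is a minimal
sketch of a bounded insertion sort that keeps only the largest values.  It
is not the code from patch1.txt: the names topk_t, topk_init, topk_add and
pct95_truncated are made up for this example, it assumes the number of data
points is known before the scan starts, and it does not handle unknown (NaN)
values.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not taken from the patch: holds the `cap` largest
 * values seen so far, sorted in descending order (top[0] is the largest). */
typedef struct {
    double *top;   /* buffer of at most cap values */
    size_t  cap;   /* maximum number of values kept */
    size_t  len;   /* number of values currently held */
} topk_t;

static int topk_init(topk_t *t, size_t cap)
{
    if (cap == 0)
        return -1;
    t->top = malloc(cap * sizeof *t->top);
    t->cap = cap;
    t->len = 0;
    return t->top != NULL ? 0 : -1;
}

static void topk_add(topk_t *t, double v)
{
    size_t i;

    if (t->len == t->cap) {
        if (v <= t->top[t->len - 1])
            return;            /* not larger than the smallest kept: drop */
        i = t->len - 1;        /* the current smallest will be overwritten */
    } else {
        i = t->len++;          /* still room: grow by one slot */
    }
    /* Shift smaller values down one slot to open a gap for v. */
    while (i > 0 && t->top[i - 1] < v) {
        t->top[i] = t->top[i - 1];
        i--;
    }
    t->top[i] = v;
}

/* Stores in *out the value at index floor(n * 0.05) of the data as if it
 * were sorted in descending order.  Returns 0 on success, -1 on error.
 * NaN values are NOT handled in this sketch. */
int pct95_truncated(const double *data, size_t n, double *out)
{
    topk_t t;
    size_t i;

    if (n == 0 || topk_init(&t, n / 20 + 1) != 0)
        return -1;
    for (i = 0; i < n; i++)
        topk_add(&t, data[i]);
    *out = t.top[t.len - 1];   /* smallest of the values kept */
    free(t.top);
    return 0;
}
```

The buffer holds n/20 + 1 doubles instead of n, which is where the memory
saving comes from, and most incoming values are rejected by the single
comparison at the top of topk_add.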

Upcoming changes I'd like to make are:
* A flag that prevents consolidation when width=0 so that pre-averaging of
  data points doesn't cause the result to lie to me.
* Some commandline switch that sets which 95th-percentile number you want.
  If there are 37 data points, the 95th is data[1.85], which I currently
  treat as data[1].  Some support should be added for rounding up to data[2]
  or even interpolating to (data[1]*0.15+data[2]*0.85).
* Trending 95th so I can graph the running calc.
* Allowing PRINT/GPRINT numbers to become named, constant CDEFs so that they
  could be graphed.

Anyway, as we get a chance, you should see more from me and Mike.  Thanks so
much for all you've done to make my life easier and I hope this helps
someone else. Oh, and thanks for reading all this garbage =)

--mark
<mmills at xodiax.com>
<extremely at hostile.org>
look for me as "extremely" on perlmonks.com and slashdot.org!

-- Attached file removed by Listar and put at URL below --
-- Type: text/plain
-- Size: 10k (10305 bytes)
-- URL : http://www.ee.ethz.ch/~slist/pantomime/02-patch1.txt
