[rrd-developers] interest in computed data sources?

Jake Brutlag jakeb at microsoft.com
Fri Feb 16 18:43:52 MET 2001


RRD Developers and Tobi,

The CDEF feature of RRDtool provides a mechanism, for graphing purposes, to
create a virtual data source derived from the other data sources available
(in fact even data sources in other rrd files). I think it would be valuable
to introduce a computed data source. This would be a physical data source
stored in the rrd that is computed using an RPN calculator from the other
data sources (for that rrd only). This feature is common in many database
systems (known as computed columns) and is a complement to true virtual
columns (like the CDEF) which are supported via database views. The
trade-off here is disk space and a slightly increased processing cost on update
versus complete computation upon read (graph generation).
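
To make the idea concrete, here is a minimal sketch (in Python, not RRDtool code) of how a computed data source could be filled in at update time. It assumes the formula is written in the same comma-separated RPN style that CDEF uses; the names eval_rpn and update_row, and the idea of passing the formulas as a dict, are hypothetical and only for illustration.

```python
def eval_rpn(expr, values):
    """Evaluate a comma-separated RPN expression such as 'ds1,ds2,+'.

    `values` maps data source names to their current numeric values.
    Only the four basic arithmetic operators are handled in this sketch.
    """
    ops = {
        "+": lambda a, b: a + b,
        "-": lambda a, b: a - b,
        "*": lambda a, b: a * b,
        "/": lambda a, b: a / b,
    }
    stack = []
    for token in expr.split(","):
        if token in ops:
            b = stack.pop()          # right operand is on top of the stack
            a = stack.pop()
            stack.append(ops[token](a, b))
        elif token in values:
            stack.append(values[token])   # reference to another data source
        else:
            stack.append(float(token))    # numeric constant
    if len(stack) != 1:
        raise ValueError("malformed RPN expression: " + expr)
    return stack[0]


def update_row(row, computed):
    """Return a copy of `row` with each computed data source added.

    `computed` maps a new data source name to its RPN formula
    (a hypothetical configuration format, not an RRDtool one).
    """
    out = dict(row)
    for name, expr in computed.items():
        out[name] = eval_rpn(expr, out)
    return out
```

For example, update_row({"ds1": 4000, "ds2": 4331}, {"ds3": "ds1,ds2,+"}) stores ds3 alongside ds1 and ds2, so nothing has to be recomputed when the data is read.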

Now some of the functionality of a computed data source can be achieved
today by adding a new data source and having the data collection mechanism
do the computations before feeding the data to RRDtool. However, there are
several advantages that support for computed data sources within the tool
can provide:
(1) RRDtool is written in C, whereas the collection mechanism may be written
in an interpreted language such as Perl (e.g. Cricket).
(2) For COUNTER and DERIVE data sources, RRDtool already does some
processing (namely taking differences and handling wrap). A collection
mechanism is ignorant of this and therefore cannot apply a computed data
source formula to the values (the differences) of a COUNTER or DERIVE data
source.
(3) RRDtool already has an RPN calculator (although embedded in rrd_graph,
it could be turned into a separate function).
(4) RRDtool handles the temporal synchronization. Suppose I want to define
ds3 as ds1 + ds2 (ds1 and ds2 are both GAUGEs). If I call rrdtool update via
the command "rrdtool update dummy.rrd 957300600:4000:4331", it is clear that
the collection mechanism can easily compute the sum. On the other hand,
suppose the updates to ds1 and ds2 are slightly out of sync:
rrdtool update dummy.rrd -t ds1 957300601:4000
rrdtool update dummy.rrd -t ds2 957300607:4331
Now the data collection mechanism cannot compute the sum, but this can be done
within RRDtool because it synchronizes the primary data points (PDP).
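
As a rough illustration of point (4), the sketch below (Python, not RRDtool code) groups out-of-sync updates by step interval and only then applies the formula. It is a deliberate simplification: it assumes a 300 second step and simply assigns each sample to the interval that contains its timestamp, whereas RRDtool's actual PDP computation weights values over time. The names pdp_slot, synchronize and add_sum are hypothetical.

```python
STEP = 300  # assumed step size in seconds; not taken from dummy.rrd


def pdp_slot(timestamp, step=STEP):
    """Return the start of the step interval that contains `timestamp`."""
    return timestamp - timestamp % step


def synchronize(updates, step=STEP):
    """Group (timestamp, ds_name, value) updates into one row per interval."""
    rows = {}
    for timestamp, name, value in updates:
        rows.setdefault(pdp_slot(timestamp, step), {})[name] = value
    return rows


def add_sum(rows, a="ds1", b="ds2", target="ds3"):
    """Compute target = a + b for every row; None stands for 'unknown'."""
    for row in rows.values():
        if a in row and b in row:
            row[target] = row[a] + row[b]
        else:
            row[target] = None  # one input is missing for this interval
    return rows


# The two out-of-sync updates from the example above.
updates = [
    (957300601, "ds1", 4000),
    (957300607, "ds2", 4331),
]
rows = add_sum(synchronize(updates))
```

Both samples fall into the interval starting at 957300600, so the sum can be computed once they have been aligned, which the collection mechanism cannot do on its own when it sees the two values in separate update calls.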

I intend to look into implementing computed data sources. This will require
code changes to core modules such as rrd_update and rrd_create, but I could
roll it into my aberrant behavior detection patch. Tobi has already
expressed interest in including this patch in the next major version of
RRDtool (will that be 2.0?). Is there interest in a computed data source
feature?

FYI, I do intend to sync the aberrant behavior detection patch with
the latest rrdtool-1.0.x release at some point, but I am not certain I will
sync it with the pending 1.0.29 release (depends on how long 1.0.29 remains
the 'current' release).

As always, your comments are welcome.

Jake

Jake Brutlag
Network Analyst
Microsoft WebTV 
