[rrd-users] How to find first valid dp in rrd - repost - where are the experts?

Alex van den Bogaerdt alex at vandenbogaerdt.nl
Tue Nov 18 20:29:08 CET 2008


> yes, I've tried that, but no matter which way I'm doing it, it eventually
> ends up having to read the entire database ...

You are looking for the first entry which is not unknown. This means you
need to start at the beginning and work your way up.  This is no different
from starting at the end, working your way down and finding the last entry
which is not unknown.  I can see why you would expect 'rrdtool first' to do
the work for you, but I wonder if 'rrdtool last' does what you expect it to
do.
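
The forward scan described above can be sketched in shell.  This is only a
minimal sketch, assuming the rows come from 'rrdtool fetch' (lines of the
form "timestamp: value value ...", with unknowns printed as "nan" or
"-nan").  The function name first_known_ts is made up for this example and
is not part of rrdtool.

```shell
# first_known_ts: read 'rrdtool fetch'-style rows on stdin and print the
# timestamp of the first row that has at least one known (non-NaN) value.
# Hypothetical helper name; prints nothing if every row is unknown.
first_known_ts() {
    awk '
        /^[0-9]+:/ {
            for (i = 2; i <= NF; i++) {
                if (tolower($i) !~ /nan/) {
                    sub(/:$/, "", $1)   # strip the trailing colon
                    print $1
                    exit
                }
            }
        }
    '
}
```

It would be used along the lines of
"rrdtool fetch file.rrd AVERAGE -s START -e END | first_known_ts", and it
still has to read every row up to the first known one; there is no way
around that.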

An update with rate "U" (unknown)  *is*  a valid update!

Consider two updates, one happening on 2008-10-01 with some known rate, and
one on 2008-11-18 with either an explicit unknown rate or an unknown rate
due to the heartbeat value being lower than the interval size. What do you
expect to see when you run 'rrdtool last'?

> Quite expensive for a database that size, and you have to do multiple
> passes to narrow down to get to the values in the RRA with the lowest
> step size ...

The problem is, I imagine, that you create a database spanning, say, 5
years.  The range you scan then runs from 2003 to just before now, yet the
first non-NaN rate only turns up somewhere in January 2008. Cut a corner:
if you know the database was created in January 2008, and you know you
won't be storing historic data in it, then start looking from January 2008,
not 2003.  Indeed, you can't store this value somewhere in the database
(well, you can, but it would be silly).
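
A rough sketch of that shortcut, assuming you keep the creation time
somewhere outside the RRD and that the file has an AVERAGE RRA.  The helper
name first_known_since is hypothetical; only 'rrdtool fetch' with -s and -e
is real.  Note that fetch chooses the RRA itself, so a long range may well
be answered from a coarser archive.

```shell
# first_known_since FILE START: scan FILE from START (epoch seconds) up to
# now and print the timestamp of the first row holding a known value.
# Hypothetical helper; START is whatever you know about the file's
# creation, for example a timestamp noted when 'rrdtool create' was run.
first_known_since() {
    rrdtool fetch "$1" AVERAGE -s "$2" -e now |
        awk '/^[0-9]+:/ {
                 for (i = 2; i <= NF; i++)
                     if (tolower($i) !~ /nan/) {
                         sub(/:$/, "", $1)
                         print $1
                         exit
                     }
             }'
}
```

For a file created at the start of January 2008 this would be called as
"first_known_since file.rrd 1199145600" (2008-01-01 00:00 UTC), which skips
the five empty years in front of it.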


> And, even worse, if I'm using any graph/xport functions, I'm only looking
> at one of the values in a multiple DS database unless I add additional
> CDEF/VDEF/RPN to look at all the values ... that means I need a different
> function for differently defined rrds :-(

Different data sources can have different first and last anyway. You don't 
need any graph or xport functions for that.
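
One way to see this per data source without graph or xport is to look at
the columns of 'rrdtool fetch' output.  The sketch below assumes the usual
fetch layout (a header line with the DS names, then "timestamp: v1 v2 ..."
rows); the function name ds_ranges is invented for the example.

```shell
# ds_ranges: read 'rrdtool fetch'-style output on stdin and print, for each
# data source, "name first_known_ts last_known_ts" ("-" if never known).
# Hypothetical helper, not an rrdtool command.
ds_ranges() {
    awk '
        # Header line: DS names only, no leading timestamp.
        !seen_header && NF > 0 && $0 !~ /^[0-9]+:/ {
            for (i = 1; i <= NF; i++) name[i] = $i
            n = NF; seen_header = 1; next
        }
        /^[0-9]+:/ {
            ts = $1; sub(/:$/, "", ts)
            for (i = 2; i <= NF; i++) {
                if (tolower($i) ~ /nan/) continue
                if (!((i - 1) in first)) first[i - 1] = ts
                last[i - 1] = ts
            }
        }
        END {
            for (i = 1; i <= n; i++)
                print name[i], ((i in first) ? first[i] : "-"), ((i in last) ? last[i] : "-")
        }
    '
}
```

The same function works for any number of data sources, so differently
defined rrds do not need different code.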


> The bash function I'm currently using (to initially point my user to the
> entire existing data range in my graphical user interface) is this:
> (might be useful for others, too)

[snip]
>  # last is easy
>  rrdLast=$(rrdtool last $1)
[snip]

'rrdtool last' used to look at rrd.live_head->last_up.

Suppose you have updated the database, using an explicit unknown. This would 
increase rrdLast.

If this has changed, that is, if 'rrdtool last' now actively seeks the first
non-NaN rate, then I would expect to see a similar change in 'rrdtool first'.



What is the exact problem you are trying to solve, and do you really want to 
solve it using rrdtool? 


