[rrd-users] How are --start and --end supposed to behave?

Mon Oct 21 03:57:10 CEST 2013

> Tobias Oetiker <tobi <at> oetiker.ch> writes:
>>
>> Hi Peter,
>
>
> I hope it is not considered impolite to reply to this very old post,
> but I didn't find anything really applicable after this...
>
>> what Alex says about time stamps I agree with (I think), so if you
>> see data with a time stamp then this means that it is valid for
>> the interval ending at that point in time so if the time stamp says
>> 11:00 then the data associated with it is valid between 10:55 and
>> 11:00 (assuming a 300 second step)
>
> OK, if I get it the value returned by fetch or put in a graph at a
> point in time represents what happened in the previous interval up
> to that point in time. Makes sense.

It's not a value, and it's not a value at a point in time.

It is a rate, one which is valid during an interval with a specific
duration (step times steps) ending at the specified time.
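
To make the "interval ending at the timestamp" idea concrete, here is a minimal sketch in Python. The function name covered_interval and its parameters are made up for illustration; nothing here is RRDtool code, it only restates the 10:55 to 11:00 example quoted above.

```python
from datetime import datetime, timedelta


def covered_interval(timestamp, step_seconds, steps=1):
    """Return (begin, end) of the interval a rate stamped `timestamp` is valid for.

    The duration is step times steps; the interval ends at the timestamp.
    """
    duration = timedelta(seconds=step_seconds * steps)
    return timestamp - duration, timestamp


# A rate stamped 11:00 with a 300 second step and 1 step per row.
begin, end = covered_interval(datetime(2013, 10, 21, 11, 0), step_seconds=300)
print(begin.time(), "->", end.time())  # 10:55:00 -> 11:00:00
```

The timestamp only marks where the interval ends; the duration comes from the step and the number of steps, never from the timestamp itself.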

Timestamps themselves have no duration. I believe mixing up time and
duration is what caused (and still causes) the, IMHO, off-by-one errors in
the amount of data returned by RRDtool.

>> As for the amount of data fetch returns, the intended behavior is
>> for it to return enough data to 'cover' the requested time
>> interval at a resolution equal or better than requested ...
>
> I don't fully get it. Reading through some tutorials (e.g.
> http://www.vandenbogaerdt.nl/rrdtool/tutorial/rrdcreate.php) it
> seems that it would be wise to calculate the number of items to
> store in a RRA according to the number of pixels that we want in the
> final image. So, being able to calculate the exact number of items
> that a query would get can actually make a difference.

Example: The inner part of an image is 288 pixels wide, step equals 60
seconds, you graph 24 hours.

In this example every pixel covers 24*60*60/288=300 seconds. That is 5
steps (5 PDPs).  If available, RRDtool would select the RRA that has 5
PDPs per CDP *AND* covers at least 24 hours (288 rows) of data.
However, if that is not available but the RRD has an RRA with 1 PDP per CDP
that contains at least 1440 rows, RRDtool can still use it, do some
on-the-fly consolidation, and deliver what is requested.

That is what "resolution equal or better" means.
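
The arithmetic of the example can be sketched as follows. This is only my reading of "equal or better" under stated assumptions, not RRDtool's actual selection code: the RRA class and the pick_rra function are hypothetical names, and the real tool weighs more factors than these two.

```python
from dataclasses import dataclass


@dataclass
class RRA:
    pdp_per_cdp: int  # steps consolidated into one row
    rows: int         # number of rows kept


def pick_rra(rras, step, span_seconds, pixels):
    """Pick the coarsest RRA that is still fine enough and covers the span."""
    seconds_per_pixel = span_seconds // pixels
    wanted = max(1, seconds_per_pixel // step)  # PDPs per pixel
    candidates = [
        r for r in rras
        if r.pdp_per_cdp <= wanted                        # resolution equal or better
        and r.pdp_per_cdp * r.rows * step >= span_seconds  # covers the requested time
    ]
    return max(candidates, key=lambda r: r.pdp_per_cdp, default=None)


# 288 pixels, 60 second step, 24 hours: 300 seconds (5 PDPs) per pixel.
day = 24 * 60 * 60
print(pick_rra([RRA(1, 1440), RRA(5, 288)], 60, day, 288))  # RRA(pdp_per_cdp=5, rows=288)
print(pick_rra([RRA(1, 1440)], 60, day, 288))               # RRA(pdp_per_cdp=1, rows=1440)
```

In the second call the 1 PDP per CDP RRA is finer than needed, so five rows would have to be consolidated into each pixel on the fly.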

But then there's also "return enough data".

I think on several occasions "the intended behaviour" was not exactly "the
behaviour". Or my definition of "enough" is too strict.

When dealing with discrete steps, such as a graph in PNG format with a
specific number of pixels being used for the graph, I think it is
important to be able to rely on what is returned by fetch or graph.

If my workspace is 360 pixels, then I want 360 intervals, not 361.
Although technically speaking 361 is also enough, I reason that it is more
than enough, and more is not always better. I consider it to be an
off-by-one error which causes problems when showing minimum, maximum,
average, first, or last.

RRDtool is supposed to expand the requested amount of time only when that
is needed, for instance when asked for 299 seconds, which is not available,
while 300 seconds is.
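
What I mean by "enough, but not more than enough" can be sketched like this. It describes the behaviour I argue for, not what any particular RRDtool version does, and the function name covering_intervals is made up for the illustration.

```python
def covering_intervals(start, end, step):
    """Expand [start, end] to step boundaries only where needed.

    Returns (aligned_start, aligned_end, number_of_intervals).
    """
    aligned_start = start - start % step  # round down to a step boundary
    aligned_end = end + (-end) % step     # round up, unchanged if already aligned
    return aligned_start, aligned_end, (aligned_end - aligned_start) // step


# 299 seconds requested: expanded to the one 300 second interval covering it.
print(covering_intervals(1200, 1499, 300))        # (1200, 1500, 1)

# 360 aligned intervals requested: 360 expected, not 361.
print(covering_intervals(0, 360 * 300, 300)[2])   # 360
```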

> As I see it in 1.4.7, if I have:
>
> --start a multiple of $step
> --end=start+$(($N * $size))
>
> I will get $N+1 items, so I should plan my graph to actually
> be $N+1 pixels wide.
>
> Should I upgrade to something different?

If you ask for "--start {some number being N*300} --end start+300" then
one interval is enough and no more than one interval should be returned.
IMHO that is.  It used to work that way, and so far I have not seen any
good reason to depart from that.  Again IMHO it is a bug to return 2
intervals.

The workaround is to ask for an end of $N * $size and a start of
end - $size + 1. Thus, if your CDP size is 300 seconds, ask for "--end {some
number} --start end-299" and make sure there is no RRA that is a better
match than the one with 300 seconds per CDP.
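
As a sketch of that workaround, the two timestamps can be computed and turned into a fetch command line. The helper names, the file name example.rrd, the value of N and the AVERAGE consolidation function are all assumptions for the illustration; the list is only built and printed here, not executed, and how many rows come back depends on your RRDtool version.

```python
def workaround_window(n, size):
    """Return (start, end) with end = n * size and start = end - size + 1."""
    end = n * size
    start = end - size + 1
    return start, end


def fetch_args(rrd_path, start, end, resolution):
    """Build an argument list for 'rrdtool fetch' using numeric timestamps."""
    return [
        "rrdtool", "fetch", rrd_path, "AVERAGE",
        "--resolution", str(resolution),
        "--start", str(start),
        "--end", str(end),
    ]


# Hypothetical N; a CDP size of 300 seconds gives start = end - 299.
start, end = workaround_window(4610000, 300)
print(" ".join(fetch_args("example.rrd", start, end, 300)))
```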

just my 2cts