[rrd-users] trying to understand the relationship between source data, what's in rrd and what gets plotted
Simon Hobson
linux at thehobsons.co.uk
Wed Jul 25 15:19:22 CEST 2007
Mark Seger wrote:
>> That is because hh:mm:06, hh:mm:16, hh:mm:26 and so on are not a whole
>> multiple of 10 seconds.
>>
>> You have "n*step+offset", not "n*step". This is why normalization is
>> needed.
>>
>>
>>
>>> As I said above it sounds like if I conform my data to align to the time
>>> boundary conditions rrd requires it should work and if I don't conform it
>>> won't.
>>>
>>
>> No. Your step size is wrong, not your input. Change your step size
>> to 1,2,3 or 6 seconds
>>
>so if I understand what you're suggesting I should pick a start time and
>step size such that my data will align accordingly, right? Since I have
>samples at 00:01:06, 01:12, etc that would mean I should pick a time
>that lands on a minute boundary and a step of 2 because 00:01:02, 1:04,
>1:06, etc will still hit all my timestamps. 1 sec would work too but
>that would be overkill. I don't think 3 or 6 would do it because they
>would not all align. 00:01:06 would, but you'd never see 01:16.
Not quite - FORGET THE MINUTE BOUNDARIES
rrdtool uses samples that fall on a multiple of "step" seconds since the
Unix epoch - you can easily pick step sizes which do not fall on minute
boundaries (a step of 7 would not be very common, but most of the time
it would not fall on a minute boundary).
But you are correct that steps of 3 or 6 will not get you 10 second intervals.
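
To make the alignment rule concrete, here is a rough sketch (my own illustration, not anything taken from rrdtool) that checks which step sizes line up with timestamps of the form n*10+6 seconds since the epoch. The function name aligned_steps is made up for this example, and it assumes the samples really are at whole seconds.

```python
from math import gcd

def aligned_steps(interval, offset, candidates):
    """Return the candidate steps (in seconds) for which every timestamp
    of the form n*interval + offset is a whole multiple of the step."""
    # A step divides n*interval + offset for every n exactly when it
    # divides both interval and offset, i.e. when it divides their gcd.
    g = gcd(interval, offset)
    return [s for s in candidates if g % s == 0]

# Samples every 10 seconds, 6 seconds past each 10 second boundary.
print(aligned_steps(10, 6, [1, 2, 3, 6, 10]))  # [1, 2]
```

Only steps of 1 and 2 survive, which matches the conclusion above that 3 and 6 will not do.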
>so let's say I have 3 samples of 100, 1000 and 100 starting at
>00:01:06. since these are absolute numbers for 10 second intervals,
>they really represent rates of 10/sec, 100/sec and 10/sec. am I then
>correct in assuming that rrd will then normalize it into 15 slots with
>20/slot for the first 5, 200 for the next 5 and then 20 for the next 5,
>all aligned to 00:01:00.
Actually 10/s is 10/s - not 20/s! 10/s * 2s would get you 20 per 2 second slot.
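
The distinction between the rate and the amount per slot is just arithmetic, sketched below for the three example samples. This is only an illustration of the numbers in this exchange, not rrdtool's internal code; the helper name rate_and_per_step is hypothetical.

```python
def rate_and_per_step(count, interval, step):
    """Convert a count collected over `interval` seconds into a
    per-second rate and the amount that falls in one `step`-second slot."""
    rate = count / interval   # e.g. 100 in 10 s is 10/s
    per_step = rate * step    # e.g. 10/s over a 2 s slot is 20
    return rate, per_step

# The three example samples, each covering a 10 second interval,
# spread over 2 second slots.
for count in (100, 1000, 100):
    rate, per_step = rate_and_per_step(count, interval=10, step=2)
    print(f"count={count}: rate={rate}/s, per 2s slot={per_step}")
```

The rate stays 10/s or 100/s whatever the step size is; only the amount per slot changes with the step.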
> so starting at 01:00 the data would look like
>20 20 20 20 20 200 200 200 200 200 200 20 20 20 20 20. If I then wanted
>to see what the rate is at 01:06, rrd would see a value in that 2 second
>slot of 20 and treat it as a rate of 10/sec. the same would hold for
>any of the 200s which would be reported as 100/sec for the slots they
>occur in, right?
>
>this is certainly a lot closer to what I was looking for and gets back
>to really clarifying my original question which was the subject of this
>thread. I guess the negatives here are you have to be real careful to
>pick the right time and stepsize and if your samples don't land on
>integral time boundaries all bets are off (what if my samples were at
>00:01:06.5, 00:01:12.5, etc?). it would also make my rrd database 5
>times bigger and it's already over 10MB for 1 day's worth of data.
An alternative for handling your historical data might be to simply
'lie' about the timestamps! E.g., for your 00:01:06 sample, insert it
with a timestamp of 00:01:00, 00:01:16 as 00:01:10 and so on. You'll
have a slight blip as you change to actually collecting the data on
10s steps (instead of n*10+6 steps) but it would allow you to graph
your historical data without going to 2s steps.
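
A minimal sketch of that timestamp adjustment, assuming the historical samples are held as (epoch seconds, value) pairs. The name snap_to_step and the sample data are made up for illustration; the output is in the timestamp:value form that rrdtool update takes.

```python
def snap_to_step(timestamp, step=10):
    """Round an epoch timestamp down to the nearest multiple of step."""
    return timestamp - (timestamp % step)

# Hypothetical samples: (epoch seconds, value). 66 is 00:01:06 UTC on
# 1970-01-01, chosen only to keep the numbers readable.
samples = [(66, 100), (76, 1000), (86, 100)]

for ts, value in samples:
    # Emit the adjusted sample as timestamp:value.
    print(f"{snap_to_step(ts)}:{value}")
```

Every sample moves back by the same 6 seconds, so the spacing between samples is preserved and only the absolute times are slightly off.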
>btw - just to toss in an interesting wrinkle did you know if you sample
>network statistics once a second you will periodically get an invalid
>value because of the frequency at which linux updates its network
>counters? the only way I'm able to get accurate network statistics near
>that rate is to sample them every 0.9765 seconds. I can go into more
>detail if anyone really cares. 8-)
I'm curious ...