[rrd-users] More observations and questions on COUNTER
Philip Peake
philip at vogon.net
Fri Oct 22 21:27:06 CEST 2010
A while ago, I asked a question about how to avoid the problem of seeing
a huge spike when something being monitored as a counter gets restarted
(the jump from whatever the last reading was to a lesser value is seen
as a huge number of counts, as if the counter had rolled over past zero).
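To make the spike concrete, here is a rough sketch of how a decrease in a
counter can be read as a wrap-around. This is a simplified model, not
rrdtool's actual code: the function names and the 32-bit counter width are
assumptions chosen for illustration.

```python
def counter_delta(prev, curr, bits=32):
    """Counts between two readings, treating any decrease as a wrap.

    Simplified model: 'bits' is an assumed counter width.
    """
    if curr >= prev:
        return curr - prev
    # A smaller reading is assumed to mean the counter passed its maximum.
    return curr - prev + 2 ** bits


def rate(prev, curr, step_seconds, bits=32):
    """Per-second rate over one sampling interval."""
    return counter_delta(prev, curr, bits) / step_seconds
```

With a 30 second step, a restart that takes a reading from 4000000000 down
to 15 would come out as roughly 9.8 million counts per second in this
model, rather than as a reset.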
The fix I used was one suggested by Alex van den Bogaerdt, which was
essentially to insert a NaN to indicate that the counter is now in an
unknown state, followed by a zero, so that the next (real) value will be
represented correctly.
This worked for my tests, so I deployed the fix.
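As a sketch of what that fix looks like in practice, the helper below
builds the two update strings in the "timestamp:value:value..." form that
rrdtool update accepts, where U marks an unknown value. The function name,
the three data sources, and the timestamps are hypothetical; the only
assumption about rrdtool is that form and that timestamps must increase.

```python
def reset_updates(timestamp, n_sources=3, step=30):
    """Build update strings to send after a monitored counter restarts.

    First an unknown (U) for every data source, then a zero one step
    later, so the next real reading is measured from zero.
    """
    unknown = ":".join(["U"] * n_sources)
    zero = ":".join(["0"] * n_sources)
    return [
        f"{timestamp}:{unknown}",
        f"{timestamp + step}:{zero}",
    ]


# Hypothetical usage: pass each string to "rrdtool update <file>".
updates = reset_updates(1287775620)
```

Each string would be passed as one argument to rrdtool update, in order,
before normal sampling resumes.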
Now, I use a DB which actually holds one month (4 weeks) of data, with a
30 second sampling period.
I use this DB to display three graphs:
Last month
Last day
Last hour
I do this by just setting the start to the appropriate value from <now>.
Strangely, I have noticed that this fix doesn't always work.
What I see if I look back over the data is a sequence looking like this
(simplified, with three data sources):
T1 1000 1004 997
T2 1010 1020 1003
T3 NaN NaN NaN
T4 NaN NaN NaN
T5 0 0 0
T6 0 0 0
T7 0 0 0
T8 4E6 4E6 4E6
T9 15 12 10
No spike is displayed on the month or day graphs, but one is displayed
on the hour graph.
Two odd things (to me): why is rrd still recording a counter roll-over
value?
And why does the same data show a spike on one graph, but not on the other two?
I suppose the third question might be why the roll-over isn't recorded
with the first zero rather than the first non-zero value.