[rrd-users] Using RRD with service check monitors (Netsaint/Nagios, Mon, BB etc)
Stanley Hopcroft
Stanley.Hopcroft at IPAustralia.Gov.AU
Mon Aug 19 05:13:27 MEST 2002
Dear Ladies and Gentlemen,
I am writing to request your comments on using RRD as a means of
providing persistence to service checks scheduled by Service Monitors
[SM] such as Netsaint/Nagios, Mon, Big Brother etc.
I am finding that the SM schedules the checks to suit itself and, even
though the check interval may be the same as the RRD step size, the check
may try to update the RRD at any time in the interval [step_boundary -
STEP, step_boundary + STEP] (e.g. an RRD that expects to be updated every 5
minutes may in fact be updated at minutes 6, 12, 18 ...).
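To make the drift concrete, here is a minimal sketch (in Python; the same arithmetic carries over to Perl) of how far each update lands past the step boundary before it. The function name step_offset and the check times are illustrative assumptions, not taken from any monitor.

```python
STEP = 300  # RRD step size in seconds (5 minutes), as in the post


def step_offset(update_time, step=STEP):
    """Return how many seconds update_time lies past the preceding step boundary."""
    return update_time % step


# Hypothetical check times at minutes 6, 12 and 18, expressed in seconds.
for minute in (6, 12, 18):
    t = minute * 60
    print(f"update at {t:5d}s is {step_offset(t):3d}s past a step boundary")
```

With a 6 minute check interval against a 5 minute step, the offset grows by 60 seconds on every check, so the updates never settle on the boundaries.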
Even when the check forces the update onto the last step boundary
( $now = time(); $t = $now - ($now % STEP); RRDs::update ... $t .
':' . $update ; ), the data stored in the RRD can differ from what the
real values should be. Here is a fetch from the problem RRD:
                                             should be
Mon Aug 19 12:00:00 2002 1029722400  364.0               44.0
Mon Aug 19 12:05:00 2002 1029722700   18.0      365      27.0
Mon Aug 19 12:10:00 2002 1029723000  370.0               44.0
Mon Aug 19 12:15:00 2002 1029723300   20.0      371      27.0
Mon Aug 19 12:20:00 2002 1029723600   20.0      375      27.0
Mon Aug 19 12:25:00 2002 1029723900  377.0               44.0
Mon Aug 19 12:30:00 2002 1029724200   20.0      378      27.0
Mon Aug 19 12:35:00 2002 1029724500  378.0      378      44.0
Mon Aug 19 12:40:00 2002 1029724800  379.0               44.0
Mon Aug 19 12:45:00 2002 1029725100   20.0      380      27.0
Mon Aug 19 12:50:00 2002 1029725400  381.0               44.0
Mon Aug 19 12:55:00 2002 1029725700   21.0      381      27.0
Mon Aug 19 13:00:00 2002 1029726000  381.0               44.0
Mon Aug 19 13:05:00 2002 1029726300    0.0                0.0
I really would prefer not to see the 20.0 20.0 20.0 21.0 (unreal)
values.
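The step of forcing the update timestamp onto the last step boundary can be sketched as follows (in Python; the arithmetic is the same in Perl). The helper names snap_to_step and update_arg and the raw check time are assumptions made for illustration. The only RRD-specific detail relied on is the "timestamp:value:value" form of the update argument.

```python
STEP = 300  # RRD step size in seconds, as in the post


def snap_to_step(now, step=STEP):
    """Round a Unix timestamp down to the most recent step boundary."""
    return now - (now % step)


def update_arg(now, values, step=STEP):
    """Build a 'timestamp:v1:v2' update string with the timestamp snapped down."""
    return ":".join([str(snap_to_step(now, step))] + [str(v) for v in values])


# Hypothetical check that ran 67 seconds after the 12:00:00 boundary.
print(update_arg(1029722467, [364, 44]))  # prints 1029722400:364:44
```

One point to keep in mind with this approach: two checks that fall inside the same step snap to the same timestamp, and rrdtool refuses an update whose timestamp is not later than the previous one.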
This check is monitoring a slow-running process (not a router
interface), so a 300 second step is appropriate. Should the step be
reduced, the changes would only show over more samples (this is a
producer process where the first column represents 'Successfully
processed'). I may have to do this.
Also, having the service check sleep until the next step boundary does
not seem feasible, because the check could then run for up to 299 seconds
and this would require a change to the monitor (which usually kills checks
that don't return in a timely manner).
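The 299 second worst case follows directly from the step arithmetic. A minimal sketch in Python, with a hypothetical helper name, assuming a check that lands exactly on a boundary does not wait at all:

```python
STEP = 300  # RRD step size in seconds, as in the post


def seconds_until_next_boundary(now, step=STEP):
    """Seconds a check would have to sleep to reach the next step boundary.

    Returns 0 when now is already on a boundary.
    """
    return (-now) % step


# Worst case over one full step: a check that starts 1 second after a boundary.
boundary = 1029722400  # 12:00:00 in the fetch output
worst = max(seconds_until_next_boundary(boundary + s) for s in range(STEP))
print(worst)  # prints 299
```

Any such wait would have to fit inside the monitor's check timeout, which is the constraint described above.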
Thank you,
Yours sincerely.
--
------------------------------------------------------------------------
Stanley Hopcroft
------------------------------------------------------------------------
'...No man is an island, entire of itself; every man is a piece of the
continent, a part of the main. If a clod be washed away by the sea,
Europe is the less, as well as if a promontory were, as well as if a
manor of thy friend's or of thine own were. Any man's death diminishes
me, because I am involved in mankind; and therefore never send to know
for whom the bell tolls; it tolls for thee...'
from Meditation 17, J Donne.