[rrd-users] Re: Tracking uptime
Alex van den Bogaerdt
alex at ergens.op.het.net
Tue Sep 20 23:29:11 MEST 2005
On Tue, Sep 20, 2005 at 03:31:54PM -0400, Gregory (Grisha) Trubetskoy wrote:
> I am contemplating how best to track uptime in an RRD. For
> clarification - uptime in this case is not as reported by uptime(1), but
> rather a result of a periodic check and the goal is to get a percentage
> uptime over a period (e.g. 99.97% over past 30 days).
> So I'm thinking of setting up a GAUGE with values from 0 to 100, and
> setting it to 100 every time the check succeeds or 0 every time it fails.
> Then I should be able to get the average over a period of time which would
> result in 'percent up' and the new TREND function can give me a moving
> average of uptime over, say, past 30 days.
This will sort-of-work. Sometimes it is the best you can do. Sometimes
you can do better. Sometimes you can but it isn't worth it.
The problem with this: you are taking one point in time and declaring it
valid for an entire period (say 5 minutes). You are doing this, say,
8640 times (which covers 30 days at 5 minutes per interval). There are
thus 8640 chances to get it wrong. You need to judge how big that chance
is: will you always spot downtime if you take a sample every 300 seconds,
and is the resulting error small enough? If downtime usually lasts a long
time, the error will be small. If, on the other hand, downtime is usually
short, the error will be quite large and you may even miss downtime
completely.
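To get a feeling for the size of that error, here is a small Python
sketch. It is not RRDtool code, only arithmetic. The function names and
the outage lists are made up for illustration, and it assumes outages do
not overlap and that a check sees "down" only if it lands inside an outage.

```python
def sampled_uptime(outages, period, step):
    """Percent uptime as seen by one check every `step` seconds."""
    checks = range(0, period, step)
    up = sum(
        not any(start <= t < end for start, end in outages)
        for t in checks
    )
    return 100.0 * up / len(checks)


def true_uptime(outages, period):
    """Percent uptime from the real (non-overlapping) outage durations."""
    down = sum(end - start for start, end in outages)
    return 100.0 * (period - down) / period


PERIOD = 30 * 24 * 3600   # 30 days in seconds
STEP = 300                # one check every 5 minutes

# Hypothetical data: one 2-hour outage starting exactly on a check.
long_outage = [(3000, 3000 + 7200)]
# Hypothetical data: ten 60-second outages, each falling between checks.
short_outages = [(k * 86400 + 100, k * 86400 + 160) for k in range(10)]

print("samples:", len(range(0, PERIOD, STEP)))
for name, outages in (("long", long_outage), ("short", short_outages)):
    print(name, true_uptime(outages, PERIOD),
          sampled_uptime(outages, PERIOD, STEP))
```

With these made-up numbers the long outage is measured correctly, while
the ten short outages are missed entirely and the sampled figure stays at
100 although the real uptime is a little below that.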
> Am I on the right path here, or am I missing some other more obvious way
> to track uptime?
Apart from what I wrote above: the computations are sound.
No need to use 0 and 100; 0 and 1 will also do (0.9997 instead of 99.97%),
although 0 and 100 aren't bad either.
If at all possible, write a fraction of 100 (or 1) when you detect that
there was downtime somewhere in the interval, even though the service is
up at the moment of the check.
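A minimal Python sketch of that idea follows. It assumes your check can
somehow learn how many seconds of the interval were spent down (from the
service's own logs, for instance); how you obtain that number is up to
you. The name interval_value and its parameters are hypothetical.

```python
def interval_value(down_seconds, step=300, scale=100.0):
    """Value to store for one interval, given seconds of downtime in it."""
    down = min(max(down_seconds, 0), step)  # clamp to the range 0..step
    return scale * (step - down) / step


# 30 days of 5-minute intervals, ten of which contained 60 s of downtime.
values = [interval_value(60)] * 10 + [interval_value(0)] * 8630
average = sum(values) / len(values)

print(interval_value(60))             # on the 0..100 scale
print(interval_value(60, scale=1.0))  # on the 0..1 scale
print(average)                        # percent up over the 30 days
```

Averaging these per-interval fractions gives the same figure as dividing
total time up by total time, which a plain 0-or-100 sample cannot promise.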