[rrd-users] Re: Network downtime or uptime

Alex van den Bogaerdt alex at ergens.op.het.net
Sat Jan 31 17:10:05 MET 2004

On Fri, Jan 30, 2004 at 04:36:48PM -0800, jmosh at shaw.ca wrote:

> > > Perhaps this isn't as big a deal to most users, I don't know.
> > > But anyway, my boss would like some stats to show our overall
> > > campus network's health/connectivity rates... anyone tried to
> > > gather this kind of stats using RRDtool and if so, would you like
> > > to share your approach?
> > >
> > > I have a crude approach:
> > > 1. do an snmpget on a switch
> > > 2. based on the response, increment an uptime or downtime
> > >    counter (both are DS in an rrd)
> > > 3. generate graphs showing downtime (percentage) and uptime
> > 
> > Why two counters?  Isn't uptime+downtime equal to 100% ?
> > 
> > How are you going to determine the amount of up- and downtime in
> > each polling interval?  Perhaps device uptime is the best to look at.
> > 
> > If you have a redundant router (or so), when do you speak of downtime?
> > 
> > HTH
> > Alex
> > -
> Well, being new to RRDtool, I'm still figuring out what is and isn't
> possible with the tool... so anyway, I thought two counters, one for
> uptime and the other for downtime. If the switch responds to the
> snmpget, increment the uptime; if not, increment the downtime. Graphs
> could then be made using the two counters to show percentage up and
> percentage down... you're right, the sum of the two would always be
> 100... This way, at any given moment, the PDP can tell me the total
> time the device is up, and down...

RRDtool does rates.  Don't think in numbers, think numbers per interval.

> Now here's another thought... do you have to have consolidated data?
> Can you create an RRD that only stores 5-minute-interval data for a
> whole month? I don't want to average or consolidate the numbers...

Sure.  How are you going to view the uptime for a 30-day period?  Are you going to create
images that are 30 * 24 * 12 pixels wide?
(30 days times 24 hours a day times 12 intervals an hour -> 8640 pixels a month)
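
As a quick check of that arithmetic, here is a small Python sketch that counts
how many unconsolidated rows a given period needs.  The 300-second step and the
30-day span come from this thread; the function name is made up for illustration.

```python
def rows_needed(days, step_seconds=300):
    """Number of primary data points covering `days` at one point per step."""
    seconds = days * 24 * 60 * 60
    return seconds // step_seconds

print(rows_needed(30))  # 8640
```

With rrdtool create, that would correspond to an archive along the lines of
RRA:AVERAGE:0.5:1:8640 (one step per row, 8640 rows).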

And what's wrong with consolidating uptime?
If a device is down,down,down,down,down,down,up,up,up,up,up,up (total of 12 * 5 minutes) then
it is also down,up (total of 2 * 30 minutes).  In both cases, it was up during 50% of the time.
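
The same point as a minimal Python sketch, assuming each 5-minute sample is
stored as 1 (up) or 0 (down) and that consolidation is a plain average, as an
AVERAGE archive would do.  The helper names are illustrative, not part of RRDtool.

```python
def consolidate(samples, per_row):
    """Average consecutive groups of `per_row` samples.

    Assumes len(samples) is a multiple of per_row.
    """
    return [sum(samples[i:i + per_row]) / per_row
            for i in range(0, len(samples), per_row)]

def percent_up(samples):
    """Percentage of time up, given samples between 0 (down) and 1 (up)."""
    return 100.0 * sum(samples) / len(samples)

five_minute = [0] * 6 + [1] * 6               # down x6, then up x6
thirty_minute = consolidate(five_minute, 6)   # two 30-minute rows

print(thirty_minute)                                        # [0.0, 1.0]
print(percent_up(five_minute), percent_up(thirty_minute))   # 50.0 50.0
```

Averaging equal-sized groups does not change the overall average, so the
percentage survives consolidation.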

> As for the uptime/downtime during each interval, the key word here is
> crude :) approach... I guess I'm expecting actual downtime to be so
> small compared to uptime that if I don't get a response during one
> poll, I assume the device has been down for that time interval
> (5 minutes)... that'd be the resolution, I guess. It's not totally
> accurate, but heck, it's better than what we have in place now, which
> is no data, period.

Most devices have an uptime counter.  Use it if you can.
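
One way this could look, as a rough Python sketch rather than a finished
method: take the polled sysUpTime value (reported in hundredths of a second)
and estimate how much of the last polling interval the device was up.  The
function name and the convention of passing None for an unanswered poll are
assumptions made for this example.

```python
TICKS_PER_SECOND = 100  # sysUpTime counts hundredths of a second

def seconds_up_in_interval(uptime_ticks, interval_seconds=300):
    """Estimate seconds the device was up during the last polling interval.

    uptime_ticks is the polled sysUpTime value, or None if the poll got
    no answer (assumed here to mean: down for the whole interval).
    """
    if uptime_ticks is None:
        return 0.0
    uptime_seconds = uptime_ticks / TICKS_PER_SECOND
    # If the device restarted inside the interval, its uptime is shorter
    # than the interval; otherwise it was up for all of it.
    return min(uptime_seconds, float(interval_seconds))

print(seconds_up_in_interval(12000))    # 120.0 (restarted 2 minutes ago)
print(seconds_up_in_interval(8640000))  # 300.0 (up for a day)
print(seconds_up_in_interval(None))     # 0.0   (no response)
```

This only sees time since the last restart, so any time the device was up
earlier in the interval before a reboot is not counted.  A 32-bit counter in
hundredths of a second also wraps after roughly 497 days, which would look
like a restart here.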

begin  sig
This message was produced without any <iframe tags
