[rrd-users] Re: Network downtime or uptime

Joubin Moshrefzadeh joubinrc at yahoo.com
Tue Feb 10 17:28:07 MET 2004


Thanks,
I'll look into it...
 
Joubin.

Pierre Fagrell <pierre at syntaxis.se> wrote:
Hi

I think you should look into Nagios.

It's a daemon that runs with a MySQL database, if I remember correctly.
You enter the addresses of all your switches and servers and group them
in whatever way you find appropriate.
At specified intervals, Nagios checks all devices for uptime,
availability, or whatever else you specify. For a web server, for example,
you can have it check whether the httpd and the ftpd are up. You also
define warning and alarm levels: if fetching a web page takes 1 second,
it's a warning; if it takes 10 seconds or times out, it's an alarm.
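To make the warning/alarm idea concrete, here is a minimal sketch of the threshold logic, not Nagios code itself. The function name `classify` and its parameters are made up for illustration; the 1 second and 10 second defaults are the figures from the example above, and the numeric values follow the usual Nagios plugin exit codes (0 OK, 1 WARNING, 2 CRITICAL).

```python
from enum import Enum


class Status(Enum):
    # Numeric values follow the usual Nagios plugin exit codes.
    OK = 0
    WARNING = 1
    CRITICAL = 2


def classify(response_secs, warn_secs=1.0, crit_secs=10.0):
    """Map a measured response time to a status (hypothetical helper).

    response_secs is None when the check timed out.
    """
    if response_secs is None or response_secs >= crit_secs:
        return Status.CRITICAL
    if response_secs >= warn_secs:
        return Status.WARNING
    return Status.OK
```

With these defaults, a page fetched in 0.2 seconds is OK, one that takes 3 seconds is a warning, and a timeout is an alarm.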

Furthermore, you can define dependencies: if the switch that the
web server is connected to is down, Nagios reports only the switch as
faulty, not the web server.
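The dependency idea can be sketched in a few lines. This is only an illustration of the principle, not how Nagios implements it; the function name `hosts_to_report` and the host names are hypothetical. It assumes each device has at most one parent and that a device behind a dead parent also fails its own check.

```python
def hosts_to_report(down, parent_of):
    """Return the down hosts whose direct parent is not itself down.

    down: set of host names that failed their check.
    parent_of: dict mapping a host to the device it is connected through
               (hosts without an entry have no parent).
    """
    # A host with no parent maps to None, which is never in `down`,
    # so top-level devices are always reported when they fail.
    return {host for host in down if parent_of.get(host) not in down}


# Hypothetical topology: web1 hangs off switch1.
parent_of = {"web1": "switch1"}
both_down = hosts_to_report({"switch1", "web1"}, parent_of)
only_web = hosts_to_report({"web1"}, parent_of)
```

Here `both_down` contains only `switch1`, while `only_web` contains `web1`, because its switch is still reachable.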

And you can save the logs as long as you want.

If you want a graphical overview of your network, you could look at
SNMPc Network Manager. It polls servers, switches, routers, and so on;
you place them on a chart and link them with lines, and it raises an
alert when something is wrong, logs the information, and so on. However,
it is a commercial product.

Regards

Pierre Fagrell


Joubin Moshrefzadeh wrote:

>Thanks Don for your input. For some reason I didn't see your email till now.
> 
>I agree with everything you say, and we too are still working on the definition of uptime/downtime. We sometimes get calls at the helpdesk from clients who can't log in to the network. To them the network is down, but from our perspective the switch they're plugged into is responsive and we can see it (via telnet, snmpget), so as far as my group is concerned the network is up, and the call gets passed to the server group.
> 
>Anyway, for the time being all I'm concerned with is the individual switches that make up the network. I want to find a way to poll a switch every n minutes and, based on the result, keep track of the dates and times when the switch is not responsive. As far as my group's concerned, if the switch is unreachable, then any users downstream of that node are disconnected. Whether or not someone else in this group is doing something like this, and whether or not they're using RRDtool for it, I'd be interested to find out.
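One way to keep track of the dates and times is to collapse the raw poll results into outage intervals. The following is a rough sketch under assumptions: the function name `outages` and the sample data are invented, and how each poll decides up or down (snmpget, ping, something else) is left outside the sketch.

```python
from datetime import datetime


def outages(samples):
    """Collapse (timestamp, is_up) poll results into (start, end) outages.

    samples must be sorted by timestamp. An outage starts at the first
    failed poll and ends at the next successful one; an outage still open
    at the end of the data gets end=None.
    """
    result = []
    start = None
    for ts, is_up in samples:
        if not is_up and start is None:
            start = ts  # first failed poll opens an outage
        elif is_up and start is not None:
            result.append((start, ts))  # first good poll closes it
            start = None
    if start is not None:
        result.append((start, None))
    return result


# Hypothetical 5-minute polls of one switch.
polls = [
    (datetime(2004, 1, 5, 9, 0), True),
    (datetime(2004, 1, 5, 9, 5), False),
    (datetime(2004, 1, 5, 9, 10), False),
    (datetime(2004, 1, 5, 9, 15), True),
    (datetime(2004, 1, 5, 9, 20), False),
]
intervals = outages(polls)
```

With n-minute polling the start and end of each interval are only accurate to within n minutes, since the switch may have failed or recovered at any point between two polls.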
> 
>As this project evolves and I figure out more of what sort of info management would like to get, I'm thinking I'll have to use an actual database tool like MySQL... because they'd like to query things like "Which nodes were down during Jan. 2004 and for how long?"
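A question like "Which nodes were down during Jan. 2004 and for how long?" maps fairly directly onto SQL once outages are stored as rows. The sketch below uses Python's built-in sqlite3 module as a stand-in for MySQL; the table name `outage`, its columns, the helper names, and the sample rows are all assumptions made for illustration. Outages that straddle a month boundary are clipped to the month.

```python
import sqlite3
from datetime import datetime, timezone


def epoch(year, month, day, hour=0, minute=0):
    """UTC timestamp in whole seconds (hypothetical helper)."""
    dt = datetime(year, month, day, hour, minute, tzinfo=timezone.utc)
    return int(dt.timestamp())


def downtime_by_node(conn, lo, hi):
    """Seconds of downtime per node inside the window [lo, hi)."""
    rows = conn.execute(
        """
        SELECT node, SUM(MIN(end_ts, :hi) - MAX(start_ts, :lo)) AS secs
        FROM outage
        WHERE start_ts < :hi AND end_ts > :lo
        GROUP BY node
        ORDER BY node
        """,
        {"lo": lo, "hi": hi},
    )
    return rows.fetchall()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE outage (node TEXT, start_ts INTEGER, end_ts INTEGER)")
conn.executemany(
    "INSERT INTO outage VALUES (?, ?, ?)",
    [
        ("switch-a", epoch(2004, 1, 5, 9, 0), epoch(2004, 1, 5, 9, 30)),
        ("switch-a", epoch(2004, 1, 20, 14, 0), epoch(2004, 1, 20, 14, 10)),
        ("switch-b", epoch(2004, 1, 31, 23, 0), epoch(2004, 2, 1, 1, 0)),
        ("switch-c", epoch(2004, 2, 3, 8, 0), epoch(2004, 2, 3, 8, 5)),
    ],
)
january = downtime_by_node(conn, epoch(2004, 1, 1), epoch(2004, 2, 1))
```

In this made-up data, switch-a was down for 40 minutes in January and switch-b for the one hour of its outage that falls before February 1. In MySQL the two-argument MIN and MAX would be written LEAST and GREATEST.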
> 
>Also, a side note: does anyone know of a utility that can map out a network and create some kind of tree structure/diagram (which can then be integrated into a web page)?
> 
>thanks,
> 
>Joubin...
>Don wrote:
>Joubin,
>
>First you need to define network availability. In other words, what are you
>trying to measure? In a complicated, distributed network this is not easy.
>What does "down" mean? If one node is down, is the network down? I don't
>think so. If one port is down on a switch, is the switch down? This is not
>an easy problem. I don't know how big a network you have, but I don't
>believe that you will be able to automate it unless you try to ping every
>node in your network every n minutes, which your users will find
>unacceptable.
>
>Just my thoughts. We have actually defined what availability means for our
>network. But this is just our definition.
>
>Good Luck,
>
>don gallop
>----- Original Message ----- 
>From: "Joubin Moshrefzadeh" 
>To: 
>Sent: Wednesday, January 28, 2004 4:33 PM
>Subject: [rrd-users] Network downtime or uptime
>
>>Ok, here's a question... I've seen all sorts of applications of MRTG and
>>RRDtool to keep track of traffic patterns, but no mention of actual network
>>downtime or uptime.
>>
>>Perhaps this isn't as big a deal to most users, I don't know. But anyway,
>>my boss would like some stats to show our overall campus network's
>>health/connectivity rates... anyone tried to gather this kind of stats using
>>RRDtool and if so, would you like to share your approach?
>>
>>I have a crude approach:
>>1. do an snmpget on a switch
>>2. based on the response, increment an uptime or downtime counter (both
>>are DS in an rrd)
>>3. generate graphs showing downtime (percentage) and uptime
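The percentage in step 3 follows directly from the two counters in step 2. A minimal sketch, with a made-up function name and made-up figures, and without touching RRDtool itself:

```python
def availability_percent(up_polls, down_polls):
    """Share of polls that got a response, as a percentage.

    Returns None when no polls have been recorded yet.
    """
    total = up_polls + down_polls
    if total == 0:
        return None
    return 100.0 * up_polls / total


# Hypothetical month: a poll every 5 minutes for 30 days is 8640 polls,
# of which 12 went unanswered.
pct = availability_percent(8628, 12)
```

For these numbers `pct` rounds to 99.86. The figure is only as fine-grained as the polling interval, so a switch that drops out briefly between two polls does not show up at all.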
>>
>>your thoughts, comments?
>>
>>Thanks,
>>Joubin
>>
>>--
>>Unsubscribe mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe
>>Help mailto:rrd-users-request at list.ee.ethz.ch?subject=help
>>Archive http://www.ee.ethz.ch/~slist/rrd-users
>>WebAdmin http://www.ee.ethz.ch/~slist/lsg2.cgi






