[mrtg] Re: MRTG High Availability
Greg.Volk at edwardjones.com
Tue Apr 23 16:51:18 MEST 2002
>
> I've been working with MRTG for some time now, and we've come to rely on
> the output it produces. I've now been asked to move MRTG from one
> server onto two for high-availability reasons.
>
> Basically, I'm wondering if anyone has developed any sort of setup
> involving failover. i.e.: if one server running MRTG fails, another one
> takes up the slack. If anyone has a suggestion, I'm all ears.
>
I've put some idle (toilet) thought into this, as I suspect
I'll be asked to do the same as soon as my current platform
incurs a noticeable outage.
Boss:
"What do you mean there's no redundancy for this system??"
Me:
"Ummmm...well, I kind of deployed it with old spare hardware
and very little free time, so it never was exactly an 'approved,
official system' thus it wasn't identified as 'mission
critical'"
From a simple-failover point of view, if you deploy mrtg on two
separate boxes and have both act as data collectors (mrtg)
and data publishers (web servers), the goal of fault tolerance
is well within reach. Adding a stateful load balancer
(Cisco LocalDirector, Radware WSD, etc.) in front would probably
complete the redundancy package quite nicely.
There are at least two problems with the above statement:
First, suppose one machine crashes and you get it back online 30
minutes later. How do you sync all the RRDs from the machine
that stayed up so there are no gaps in the failed
server's data? While I'm sitting here, the only thing that
comes to mind is to copy the RRDs within <poll interval>
time, so the copy finishes before the next poll updates them.
This may be a problem if you're dealing with thousands
upon thousands of RRDs that take longer than <poll
interval> to replicate. Also, if the two servers are
separated by a relatively slow WAN link (for geographic
redundancy), copying that many small files in less than <poll
interval> will only exacerbate the time problem.
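To get a feel for whether a full copy could fit inside one poll interval, a rough estimate like the one below might help. It is only a sketch: the function names, the per-file overhead figure, the file size, the link speed and the 5-minute interval are all assumptions made up for illustration, not measurements from a real installation.

```python
# Back-of-the-envelope check: can a full RRD copy finish inside one poll interval?
# Every number used here is a made-up assumption, not a measurement.


def replication_seconds(num_rrds, rrd_bytes, link_bits_per_sec, per_file_overhead_sec):
    """Estimate the time to copy num_rrds files of rrd_bytes each over one link."""
    transfer = (num_rrds * rrd_bytes * 8) / link_bits_per_sec  # raw payload time
    overhead = num_rrds * per_file_overhead_sec  # per-file setup / round-trip cost
    return transfer + overhead


def fits_in_poll_interval(copy_seconds, poll_interval_sec=300):
    """True if the copy would complete before the next poll (assumed 5 minutes)."""
    return copy_seconds < poll_interval_sec


if __name__ == "__main__":
    # Hypothetical: 5000 RRDs of 100 kB each over a 1.5 Mbit/s WAN link,
    # with 50 ms of per-file overhead.
    secs = replication_seconds(5000, 100_000, 1_500_000, 0.05)
    print(f"estimated copy time: {secs:.0f} s")
    print("fits in one poll interval:", fits_in_poll_interval(secs))
```

If the estimate comes out well above the poll interval, a one-shot copy probably won't do, and some gap (or a smarter merge) has to be accepted.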
The other caveat that comes to mind is that with a redundant
polling server you multiply _all_ of your mrtg-related snmp
traffic by two. This is only an issue where you might be
dealing with small or congested wide area links.
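The doubling is easy to put a rough number on. The sketch below assumes a hypothetical average of 200 bytes of SNMP request plus response per target per poll and a 5-minute interval; both figures and the function name are illustrative guesses, so substitute your own.

```python
# Rough average SNMP polling load, to see what a second poller adds.
# bytes_per_poll is an assumed request+response size per target, not a measured one.


def snmp_bits_per_sec(num_targets, bytes_per_poll, poll_interval_sec, num_pollers=1):
    """Average polling traffic in bits per second, spread over the poll interval."""
    total_bytes = num_targets * bytes_per_poll * num_pollers
    return total_bytes * 8 / poll_interval_sec


if __name__ == "__main__":
    one = snmp_bits_per_sec(3000, 200, 300, num_pollers=1)
    two = snmp_bits_per_sec(3000, 200, 300, num_pollers=2)
    print(f"one poller:  {one:.0f} bit/s")
    print(f"two pollers: {two:.0f} bit/s")
```

Keep in mind this is an average; the polls actually arrive in a burst at the start of each interval, which is what hurts on a small link.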