Well, if communication between the two servers was fine at layer 3 but the name couldn't resolve (layer 7), then your problem was that the slave didn't know what the master's IP was.<br><br>You could raise the TTL to 4 hours, or 8 hours, etc., and it could have worked in that last scenario, since the caching server could still have held the record through a 3-hour outage.<br>
<br>For DNS on something like this I suggest you keep a long TTL on the record, say a week. If you know you're going to change it, lower the TTL to half an hour or a full hour a week in advance of the change. Then change the record to the new IP and put the TTL back to a week.<br>
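<br>As a rough sketch of that timing (Python, purely illustrative): the function name `ttl_change_plan`, its field names, and the example date are all made up here, and it only does the date arithmetic; it doesn't talk to any DNS server. The idea is that the TTL has to be lowered at least one full long TTL before the change, so every cache has already picked up the short TTL by the time the IP moves.<br>

```python
from datetime import datetime, timedelta


def ttl_change_plan(change_at,
                    long_ttl=timedelta(weeks=1),
                    short_ttl=timedelta(hours=1)):
    """Work out the dates for a planned IP change on a long-TTL record.

    All names and defaults are hypothetical, chosen to match the
    "one week" and "one hour" figures suggested above.
    """
    if short_ttl >= long_ttl:
        raise ValueError("short_ttl must be smaller than long_ttl")
    return {
        # Latest moment to lower the TTL: a cache that fetched the record
        # just before this point holds it for at most long_ttl.
        "lower_ttl_by": change_at - long_ttl,
        # At the change itself: publish the new IP and restore the long TTL.
        "change_ip_and_restore_ttl_at": change_at,
        # Caches still holding the old IP drop it within short_ttl.
        "old_record_gone_by": change_at + short_ttl,
    }


plan = ttl_change_plan(datetime(2009, 10, 1, 12, 0))
for step, when in plan.items():
    print(step, when.isoformat())
```

With the defaults, a change planned for noon on Oct 1 means lowering the TTL by noon on Sep 24, and the old IP should be out of caches by 1 PM on Oct 1 (assuming the caches honor the TTL).<br>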
<br clear="all">Josh Luthman<br>Office: 937-552-2340<br>Direct: 937-552-2343<br>1100 Wayne St<br>Suite 1337<br>Troy, OH 45373<br><br>"When you have eliminated the impossible, that which remains, however improbable, must be the truth."<br>
--- Sir Arthur Conan Doyle<br>
<br><br><div class="gmail_quote">On Thu, Sep 17, 2009 at 7:29 PM, David Rees <span dir="ltr"><<a href="mailto:drees76@gmail.com">drees76@gmail.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;">
<div class="im">On Thu, Sep 17, 2009 at 4:13 PM, Josh Luthman<br>
<<a href="mailto:josh@imaginenetworksllc.com">josh@imaginenetworksllc.com</a>> wrote:<br>
> To rule out DNS - are the boxes using a DNS cache server on themselves or<br>
> using a secondary server? What's the TTL on those A/CNAME records and how<br>
> long was your outage?<br>
<br>
</div>All the boxes use a caching DNS server - the TTL on the host that went<br>
down that the affected slaves were monitoring was 5 minutes - it was<br>
down for close to 3 hours.<br>
<br>
I've since changed my config to use IP addresses for the host config,<br>
but it'd be nice to not have to and for the slaves to cache the last<br>
lookup in case there is a DNS failure...<br>
<br>
-Dave<br>
</blockquote></div><br>