[smokeping-users] Slave gaps in all charts during outage

Tobias Oetiker tobi at oetiker.ch
Fri Sep 18 07:31:29 CEST 2009


Hi David,

there are a ton of unreleased changes, among them:

* fix bug in storage of slave updates: when the local smokeping daemon
  was running, only the first update could be read back; the others were
  hidden in the Storable file. --tobi
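
For reference, the RRD error quoted below is rrdtool refusing an update
whose timestamp is not newer than the last one already stored. A minimal
Python sketch of that rule follows. It is not smokeping's code: the
function name drop_stale_updates and the sample values are made up for
illustration.

```python
def drop_stale_updates(last_update, updates):
    """Keep only updates whose timestamp is strictly newer than the
    previously accepted one, mirroring rrdtool's one-second minimum step.

    last_update: timestamp of the newest sample already stored
    updates:     list of (timestamp, value) tuples queued by a slave
    """
    accepted = []
    newest = last_update
    # sorted() is stable, so among equal timestamps the first queued wins
    for timestamp, value in sorted(updates, key=lambda u: u[0]):
        if timestamp > newest:
            accepted.append((timestamp, value))
            newest = timestamp
    return accepted


# Hypothetical queue: two samples share the timestamp from the error message
queued = [(1253201797, "a"), (1253201797, "b"), (1253201857, "c")]
print(drop_stale_updates(1253201737, queued))
```

Filtering like this only avoids the error message; it does not recover
the samples that were dropped.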

I am looking at putting some effort into a new release next week.

cheers
tobi

Yesterday David Rees wrote:

> Hi,
>
> We use smokeping to monitor a number of hosts on various networks.  We
> have a master with a handful of slaves which monitor various sites.
>
> This morning we had an outage which affected one of those sites, but
> the slaves which were monitoring the site that went down, failed to
> report any data at all for any networks - even if they were reachable
> from that network.  Communications between the master/slaves were not
> affected.
>
> The affected slaves were reporting this message:
>
> WARNING Master said 500 read timeout
>
> While the master had messages like:
>
> RRDs::update ERROR: /var/lib/smokeping/rrd/slave/slave~site1.rrd:
> illegal attempt to update using time 1253201797 when last update time
> is 1253201797 (minimum one second step)
>
> All machines are running smokeping 2.4.2.  Any ideas?
>
> The only thing I can think of is that DNS for the site that went down
> was also down, so the master timed out trying to look up the site's
> IP address?
>
> Thanks
>
> Dave

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900


