[smokeping-users] Strange Alert Behaviour
John L Hoo
jhoo at antapex.ca
Tue Nov 1 21:30:01 MET 2005
I am getting gaps in my smokeping data when I have alerts configured for
targets. The same gaps appear for every target, whether or not an alert is
configured for that target. It is almost as if the server lost connectivity
to the network, but it did not.
If I remove the alerts from the target configuration, the gaps in the data
disappear.
Here is my Database Configuration
*** Database ***
step = 60
pings = 5
# consfn mrhb steps total
AVERAGE 0.5 1 1008
AVERAGE 0.5 12 10800
MIN 0.5 12 10800
MAX 0.5 12 10800
AVERAGE 0.5 144 24000
MAX 0.5 144 24000
MIN 0.5 144 24000
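For reference, a rough Python sketch of what the table above works out to, assuming the usual reading that each row of an archive covers step * steps seconds and that "total" is the number of rows kept. The names RRAS and rra_span are made up for this illustration and are not part of smokeping or rrdtool.

```python
# Resolution and retention implied by each RRA line, assuming
# seconds per row = step * steps and history kept = seconds per row * total.
STEP = 60  # seconds, from "step = 60"

# (consolidation function, steps per row, number of rows), copied from the config
RRAS = [
    ("AVERAGE", 1, 1008),
    ("AVERAGE", 12, 10800),
    ("MIN", 12, 10800),
    ("MAX", 12, 10800),
    ("AVERAGE", 144, 24000),
    ("MAX", 144, 24000),
    ("MIN", 144, 24000),
]

def rra_span(step, steps, total):
    """Return (seconds per row, seconds of history kept) for one RRA."""
    resolution = step * steps
    return resolution, resolution * total

for consfn, steps, total in RRAS:
    resolution, retention = rra_span(STEP, steps, total)
    print(f"{consfn:8s} {resolution:5d} s/row  {retention / 86400:8.1f} days")
```

Under that reading the 12-step archives hold one row per 720 seconds, which is the spacing of the rows in the fetch output further down.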
Here are my alerts....
+lossdetect-3m
type = loss
pattern = ==0%,==0%,==0%,==0%,>20%,>20%,>20%
comment = 3 minute packet loss detection
+latency-3m
type = rtt
pattern = <100,<100,<100,<100,>100,>100,>100
comment = 3 minute high latency of greater than 100ms
+latency-change
type = matcher
pattern = Avgratio(historic=>60,current=>60,comparator=>'>',percentage=>300)
comment = last hour latency 300% of previous hour
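To make the intent of the two pattern lines concrete, here is a simplified Python sketch of comparing such a pattern against the most recent samples. It is only an illustration, not smokeping's own matching code: it handles plain comparison tokens only (no wildcards), and parse_pattern, matches and the sample values are hypothetical.

```python
import operator

OPS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le,
       ">": operator.gt, "<": operator.lt}

def parse_pattern(pattern):
    """Turn '==0%,>20%' into a list of (compare_function, threshold) pairs."""
    conditions = []
    for token in pattern.split(","):
        token = token.strip().rstrip("%")
        # Try two-character operators first so '>=' is not read as '>'.
        for symbol in ("==", ">=", "<=", ">", "<"):
            if token.startswith(symbol):
                conditions.append((OPS[symbol], float(token[len(symbol):])))
                break
        else:
            raise ValueError(f"unsupported token: {token!r}")
    return conditions

def matches(pattern, samples):
    """True if the newest len(pattern) samples satisfy each condition in order."""
    conditions = parse_pattern(pattern)
    if len(samples) < len(conditions):
        return False
    recent = samples[-len(conditions):]
    return all(op(value, limit) for (op, limit), value in zip(conditions, recent))

LOSS_PATTERN = "==0%,==0%,==0%,==0%,>20%,>20%,>20%"
loss_history = [0, 0, 0, 0, 0, 40, 60, 40]  # made-up loss percentages, oldest first
print(matches(LOSS_PATTERN, loss_history))
```

With step = 60, the last three entries of each pattern cover three polling rounds, which is where the "3 minute" in the comments comes from.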
Here is my Fping config....
+ FPing
binary = /usr/local/sbin/fping
packetsize = 79
timeout = 0.2
Here is a sample target...
+++Bell
menu = Bell
title = Bell
probe = FPing
host = <<ip address deleted>>
alerts = lossdetect-3m,latency-3m,latency-change
Here is the RRD fetch...
[root@adelx02 etc]# rrdtool fetch /data/smokeping/var/Terago/Peers/Bell.rrd AVERAGE | grep "nan nan nan nan nan nan nan nan"
1130850720: nan nan nan nan nan nan nan nan
1130851440: nan nan nan nan nan nan nan nan
1130852160: nan nan nan nan nan nan nan nan
1130857920: nan nan nan nan nan nan nan nan
1130858640: nan nan nan nan nan nan nan nan
1130860080: nan nan nan nan nan nan nan nan
1130860800: nan nan nan nan nan nan nan nan
1130876640: nan nan nan nan nan nan nan nan
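A small Python sketch for grouping the all-nan rows above into contiguous gaps. The 720-second row spacing is an assumption read off the timestamps themselves, and nan_timestamps and group_runs are hypothetical helper names, not rrdtool functions.

```python
FETCH_OUTPUT = """\
1130850720: nan nan nan nan nan nan nan nan
1130851440: nan nan nan nan nan nan nan nan
1130852160: nan nan nan nan nan nan nan nan
1130857920: nan nan nan nan nan nan nan nan
1130858640: nan nan nan nan nan nan nan nan
1130860080: nan nan nan nan nan nan nan nan
1130860800: nan nan nan nan nan nan nan nan
1130876640: nan nan nan nan nan nan nan nan
"""

def nan_timestamps(text):
    """Return the timestamps of rows whose values are all 'nan'."""
    stamps = []
    for line in text.splitlines():
        head, sep, tail = line.partition(":")
        values = tail.split()
        if sep and head.strip().isdigit() and values and \
                all(v.lower() in ("nan", "-nan") for v in values):
            stamps.append(int(head))
    return stamps

def group_runs(stamps, resolution):
    """Group timestamps into runs of rows exactly `resolution` seconds apart."""
    runs = []
    for ts in stamps:
        if runs and ts - runs[-1][-1] == resolution:
            runs[-1].append(ts)
        else:
            runs.append([ts])
    return runs

# 720 s per row is assumed from the spacing of the rows above.
for run in group_runs(nan_timestamps(FETCH_OUTPUT), 720):
    print(f"first row {run[0]}, last row {run[-1]}, {len(run)} row(s)")
```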
And my syslogs
Oct 31 16:04:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Oct 31 16:14:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Oct 31 19:35:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Oct 31 21:51:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Oct 31 22:00:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Oct 31 23:34:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Nov 1 07:41:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Nov 1 07:46:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
Nov 1 11:20:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127 seconds to complete 1 round of polling. It should complete polling in 60 seconds. You may have unresponsive devices in your setup.
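A short Python sketch for pulling the numbers out of warnings like these and seeing how far each round overran the polling interval. The regular expression is written against the message text shown here and may not cover other smokeping versions; LOG holds two of the entries above as sample input, and parse_warnings is a hypothetical helper.

```python
import re

LOG = """\
Oct 31 16:04:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127
seconds to complete 1 round of polling. It should complete polling in 60
seconds. You may have unresponsive devices in your setup.
Nov 1 07:41:25 adelx02 smokeping[6615]: FPing: WARNING: smokeping took 127
seconds to complete 1 round of polling. It should complete polling in 60
seconds. You may have unresponsive devices in your setup.
"""

WARNING_RE = re.compile(
    r"(\w{3} \d+ \d\d:\d\d:\d\d) \S+ smokeping\[\d+\]: \w+: WARNING: "
    r"smokeping took (\d+) seconds to complete \d+ round of polling\. "
    r"It should complete polling in (\d+) seconds"
)

def parse_warnings(text):
    """Return (timestamp, seconds taken, seconds allowed) for each warning."""
    flat = " ".join(text.split())  # collapse line wraps and repeated spaces
    return [(stamp, int(took), int(limit))
            for stamp, took, limit in WARNING_RE.findall(flat)]

for stamp, took, limit in parse_warnings(LOG):
    print(f"{stamp}: {took} s taken, {took - limit} s over, "
          f"{took / limit:.2f}x the interval")
```

Each round reported here took a little more than two 60-second polling intervals.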
Thanks,
John.