[smokeping-users] continuous pinging revisited
Adam Spiers
smokeping at adamspiers.org
Fri May 27 16:26:59 CEST 2011
Hi all,
Firstly, thanks for this very useful software.
I'm trying to set up smokeping to run pings continuously, or as close
as possible to continuous. I see this has been attempted or at least
considered before:
http://thread.gmane.org/gmane.network.smokeping.user/4202
My motivation is the same as the original poster there - I need to
capture every single second where there might be network issues.
I was therefore very surprised to find that smokeping does not seem
to support this use case.
I found the --nosleep parameter which is mentioned very briefly in the
docs as being "for debugging". Looking at the main while loop in
Smokeping.pm it seems that this option eliminates all sleeps, so it's
not possible to have certain probes running continuously whilst others
sleep as normal.
Then I looked at setting the 'step' configuration parameter value to
the duration of the probe. The docs describe this parameter as
follows:
    Duration of the base operation interval of SmokePing in
    seconds. SmokePing will venture out every step seconds to ping your
    target hosts.
Looking at the source code, I see that the intention is that "every
step seconds" includes the runtime of the probe. For example, if you
have step=60 and pings=10, the probes are launched every 60 seconds,
not every 70 seconds. For continuous pinging, one might think that
setting step=10 and pings=10 would yield the desired results.
However, there are two problems with this. Firstly, the code in
question is:
    my $sleeptime = $step - (time-$offset) % $step;
    [...]
    sleep $sleeptime;
so if the probe takes even a fraction over 10 seconds, it ends up
sleeping for nearly another 10 seconds until the next 10 second
boundary. This means the pings are only happening roughly 50% of the
time, not continuously. A hack might be to set pings=9, but then the
pings are only happening 90% of the time. It's possible to get
arbitrarily close to 100% by making both values very high,
e.g. step=1000 and pings=999, but then you lose the granularity of RRD
results.
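As a rough illustration of that arithmetic, here is a small simulation of the sleep calculation, written in Python rather than Perl purely for brevity. It is only a sketch: the names sleeptime and duty_cycle are made up for this example, and it assumes that the clock starts on a step boundary, that each ping takes exactly one second, and that a round of polling has no other overhead.

```python
def sleeptime(now, step, offset=0):
    """Python rendering of: $step - (time - $offset) % $step."""
    return step - (now - offset) % step


def duty_cycle(step, pings, seconds_per_ping=1, rounds=50):
    """Fraction of wall-clock time spent pinging over several rounds.

    Assumptions for illustration only: the clock starts on a step
    boundary, and a round takes exactly pings * seconds_per_ping
    whole seconds with no other overhead.
    """
    now = 0    # simulated integer clock, like Perl's time()
    busy = 0   # total seconds spent pinging
    for _ in range(rounds):
        runtime = pings * seconds_per_ping
        now += runtime
        busy += runtime
        now += sleeptime(now, step)  # always between 1 and step seconds
    return busy / now


for step, pings in [(10, 10), (10, 9), (1000, 999)]:
    print(f"step={step} pings={pings}: {duty_cycle(step, pings):.1%}")
```

Under those assumptions the three cases come out at 50.0%, 90.0% and 99.9%. The expression can never evaluate to zero, so in this model a round that ends exactly on a step boundary still sleeps for a full step.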
The second issue is that there is a hardcoded expectation that the
probe runtime will be less than 80% of the polling cycle:
    elsif ($runtime > $step * 0.8) {
        my $warn = "NOTE: smokeping took $runtime seconds to complete 1 round of polling. ".
            "This is over 80% of the max time available for a polling cycle ($step seconds).\n";
        if (defined $myprobe) {
            $probes->{$myprobe}->do_log($warn);
        } else {
            do_log($warn);
        }
    }
So if you choose continuous pinging, which seems to me (and presumably
the original poster) to be a perfectly reasonable use case, your
logfiles get spammed with messages.
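To show where that threshold bites, the quoted condition can be rendered in Python as below. This is a sketch only: warns and max_quiet_pings are hypothetical names, the code covers just the elsif condition shown above (not whatever branch precedes it), and it assumes a round takes exactly pings * seconds_per_ping seconds.

```python
def warns(runtime, step, threshold=0.8):
    """Python rendering of the quoted condition: $runtime > $step * 0.8."""
    return runtime > step * threshold


def max_quiet_pings(step, seconds_per_ping=1):
    """Largest pings value that does not trip the condition, assuming
    (hypothetically) a round takes pings * seconds_per_ping seconds."""
    pings = 0
    while not warns((pings + 1) * seconds_per_ping, step):
        pings += 1
    return pings


for pings in (8, 9, 10):
    print(f"step=10 pings={pings}: warning={warns(pings, 10)}")
print("largest quiet pings value for step=10:", max_quiet_pings(10))
```

With step=10, the condition is already true at pings=9 under these assumptions, so backing off by one ping does not silence the message; the largest value that stays quiet is 8.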
My workaround for now is as follows:
1. Configure step=10 and pings=10
2. Comment out the lines causing warnings above
3. Invoke with --nosleep
I had a couple of other minor questions:
1. I see mentions of an svn repository on the mailing list, but
nothing is published. Where is the latest code available, and
what's the standard procedure for submitting patches?
2. Why does --debug exit immediately after the first iteration?
What if you want to debug the sleep cycle?
Many thanks!
Adam