[smokeping-users] Re: Looking for a good way of timing Smokeping

Thu Jan 15 17:18:25 MET 2004

Today Simon Westlake wrote:

Hi Simon,

try this patch:

--- Smokeping.pm.orig   Thu Jan 15 17:14:34 2004
+++ Smokeping.pm        Thu Jan 15 17:15:55 2004
@@ -1899,6 +1899,7 @@
     do_log("Launched successfully");
     report_probes($probes);
     while (1) {
+        my $now = time;
        if ($opt{debug}) {
                map { $probes->{$_}->debug(1) if $probes->{$_}->can('debug') }
                        keys %$probes;
@@ -1906,6 +1907,10 @@
        run_probes $probes;
        update_rrds $cfg, $probes, $cfg->{Targets}{probe}, $cfg->{Targets}, $cfg->{General}{datadir};
        exit 0 if $opt{debug};
+        my $runtime = time - $now;
+        warn "WARNING: smokeping took $runtime seconds to complete 1 round of polling. ".
+             "It should complete polling in $cfg->{Database}{step} seconds. ".
+             "You may have unresponsive devices in your setup.\n" if $runtime > $cfg->{Database}{step};
        sleep $cfg->{Database}{step} - time % $cfg->{Database}{step};
     }
 }


Now smokeping will complain when it is taking too long to complete a round.
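
To make the arithmetic in the patch concrete, here is a small stand-alone shell sketch of the same two calculations: the runtime check and the sleep up to the next step boundary. The function name round_report and the sample timestamps are made up for illustration; only the formulas come from the patch.

```shell
#!/bin/sh
# Sketch of the timing arithmetic used in the patch above.
# round_report and the sample numbers are hypothetical; the formulas
# (runtime = end - start, sleep = step - end % step) mirror the patch.

# usage: round_report START END STEP   (all values in seconds)
round_report() {
    start=$1
    end=$2
    step=$3
    runtime=$((end - start))
    # Same expression as "sleep $step - time % $step" in the main loop:
    # wait until the next multiple of the step.
    pause=$((step - end % step))
    if [ "$runtime" -gt "$step" ]; then
        echo "overrun runtime=$runtime sleep=$pause"
    else
        echo "ok runtime=$runtime sleep=$pause"
    fi
}

# A round that fits inside a 300 second step.
round_report 1000 1250 300
# A round that takes longer than the step.
round_report 1000 1420 300
```

In the second call the round runs past the step boundary at 1200, and the next round only starts at 1500, so no update lands in one whole step. That looks like how an overlong round ends up as a gap in the graph.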

How is the load on your machine while smokeping is polling?

The reason for the gaps when you widen the step is that your RRDs store
a maximum acceptable interval between updates (the heartbeat) internally.
You can use rrdtool tune to change that.
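
As a rough sketch of what that could look like, the script below only prints the rrdtool tune commands instead of running them, so you can check them first. The file path, the data source names and the choice of twice the step as the new value are assumptions for illustration; rrdtool info on one of your files lists the real data source names and their current minimal_heartbeat.

```shell
#!/bin/sh
# Dry-run sketch: print the "rrdtool tune" commands that would raise the
# heartbeat of the given data sources in one RRD file.
# Assumptions (not from this mail): the file path, the data source names,
# and using 2 * step as the new heartbeat.

# usage: heartbeat_cmds FILE STEP DS...
heartbeat_cmds() {
    file=$1
    step=$2
    shift 2
    heartbeat=$((step * 2))
    for ds in "$@"; do
        # rrdtool tune takes --heartbeat ds-name:seconds
        echo "rrdtool tune $file --heartbeat $ds:$heartbeat"
    done
}

# Example: a 10 minute step (600 s) and two hypothetical data source names.
heartbeat_cmds /path/to/target.rrd 600 loss median
```

Once the printed commands look right for one file, they can be piped to sh, and the same loop can be repeated over the other RRD files in the data directory.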

tobi



> Hi,
>
> A few weeks ago I posted about gaps in Smokeping graphs, and the eventual conclusion was that it was simply taking too long for Smokeping to run.
>
> I tried running it at 10 minutes rather than 5 but, strangely, there were more gaps at 10 minutes than at 5 (this always seems to be the case for me.. I tried it again recently and had the same problem.)
>
> My previous solution was to remove devices from Smokeping that were regularly unresponsive - their removal seemed to resolve the problem.
>
> So, I'm stuck running at 5 minutes, as anything above that seems to produce more gaps. However, I'm adding 20+ devices a week to Smokeping and I'm starting to get gaps again. I'm assuming it's taking too long to run again, but I only have ~15 unresponsive devices at a time (out of 1300) so it doesn't seem to be a problem with excessive timeouts.
>
> I measure the amount of time MRTG takes to run for monitoring purposes by doing:
>
> x=`date +%s`;z=`date`;/usr/local/mrtg-2/bin/mrtg /usr/local/mrtg.cfg;y=`date +%s`;runtime=`expr $y - $x`;echo "$z runtime was $runtime seconds" >>/home/simon/runtime
>
> I can't, however, think of a good way to do this for Smokeping.
>
> So, two quick questions..
>
> Can anyone think of a way of doing something similar for Smokeping?
> Does anyone have an example of a relatively aggressive probe configuration for fping for monitoring large numbers of devices? I did try modifying the parameters passed to fping as specified in the Smokeping documentation, but I think I must be reading it incorrectly, as I couldn't get the syntax right. I'd be happy to increase the wait by a very slight increment for successive timeouts and to give up at ~800ms or so.
>
> The eventual solution is going to be to split the polling over a large number of servers, but for the time being, I'm stuck running it on a single box. I'm 99% sure it's a case of excessive wait time, as the server is relatively powerful.
>
> Thanks for any help you can provide.
>
>
> --
> Unsubscribe mailto:smokeping-users-request at list.ee.ethz.ch?subject=unsubscribe
> Help        mailto:smokeping-users-request at list.ee.ethz.ch?subject=help
> Archive     http://www.ee.ethz.ch/~slist/smokeping-users
> WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi
>

-- 
 ______    __   _
/_  __/_  / /  (_) Oetiker @ ISG.EE, ETZ J97, ETH, CH-8092 Zurich
 / // _ \/ _ \/ /  System Manager, Time Lord, Coder, Designer, Coach
/_/ \.__/_.__/_/   http://people.ee.ethz.ch/~oetiker   +41(0)1-632-5286
