[smokeping-users] Alerting when a Slave stops sending data

Bill Houle bhoule at siliconexus.com
Sun Apr 22 23:47:26 CEST 2018


As someone who recently had to implement monitoring of non-smokeping processes, might I suggest “monit”? It is a fairly mainstream package that is readily available in the yum and apt-get repos.

Monit is a locally installed (i.e. per-slave) daemon that can monitor files (by timestamp or checksum), processes (by PID), programs (by exit code), and the system (by resource consumption). It has a flexible config language that can alert/start/stop/exec based on those monitor conditions.

I could see monit being used to watch each slave and alert and/or auto-restart the data collection. 
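
For the master-side check discussed further down the thread (looking at the modification times of the slave data files), something like the following Python might be a starting point. It is only a rough sketch under assumptions that are not confirmed in this thread: that the master stores slave data as RRD files named like `Target~slavename.rrd`, and that the data directory is `/var/lib/smokeping`. The function name `stale_slaves` and the one-hour threshold are made up for illustration.

```python
import time
from pathlib import Path


def stale_slaves(data_dir, max_age_seconds, now=None):
    """Return {slave_name: age_in_seconds} for every slave whose most
    recently modified data file is older than max_age_seconds.

    Assumption: slave data files are named "<target>~<slave>.rrd"
    somewhere below data_dir. Adjust the pattern if your layout differs.
    """
    now = time.time() if now is None else now
    newest = {}  # slave name -> newest mtime seen so far
    for rrd in Path(data_dir).rglob("*~*.rrd"):
        # Everything after the last "~" in the file stem is taken as the slave name.
        slave = rrd.stem.rsplit("~", 1)[1]
        mtime = rrd.stat().st_mtime
        newest[slave] = max(newest.get(slave, 0.0), mtime)
    # A slave counts as stale only if even its newest file is too old.
    return {s: now - m for s, m in newest.items() if now - m > max_age_seconds}


if __name__ == "__main__":
    # Hypothetical path and threshold; both need adapting to the real setup.
    for slave, age in sorted(stale_slaves("/var/lib/smokeping", 3600).items()):
        print(f"{slave}: no update for {age / 60:.0f} minutes")
```

Run from cron on the master, the printed lines could be piped into whatever mailer is already in use. The restart and the two-level Notice/Warning escalation would still need to be added on top.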

—bill



> On Apr 22, 2018, at 11:29 AM, Gregory Sloop <gregs at sloop.net> wrote:
> 
> This is an awesome idea - and one I've wished for in the past - but never got around to working on.
> Checking the slave data files' modification times seems plausible as a way to check for updates - but you'd have to test to be sure. [IIRC that will work though.]
> 
> Personally, I'd probably try to write it in bash - or something completely external to smokeping. [Bash because of few dependencies - though you'll probably want/need something like sendemail for email notifications...]
> 
> If slaves are behind NAT or something similar, you'll have to have a way to get to the slave for handling a restart, but that's really outside the scope of what you're doing. 
> 
> Honestly, simply getting notification that a slave is not pushing updates would be more than enough - even without the restart.
> 
> Sounds fab to me. And I can't think of a better way, offhand.
> 
> -Greg
> 
> 
> Hello,
> 
> I have a Debian Jessie box with Smokeping 2.6 installed on it.
> 
> It receives data from Slaves over the Internet (10 slaves or so).
> Each Slave roughly monitors xDSL or fiber links.
> 
> Every Monday, I can see that data from one or two slaves is missing.
> Then I remotely restart the smokeping service on the slave whose data is missing.
> 
> I would like to implement something like:
> 
> - if no data at all from Slave for a given period of time, then restart Slave's smokeping service and send a Notice email
> 
> - if no data at all from Slave for a longer period of time and Slave's restart already attempted, then send a Warning email
> 
> As the Slaves' data is stored in a known directory on the Master's filesystem, I think I can detect when data from a slave has not been modified lately by reading the modification times of the files in those directories.
> 
> Is there a better way to do so? In my case, the Alerts settings seem more appropriate for detecting when WAN links are slower.
> 
> Best regards
> 
> 
> -- 
> Gregory Sloop, Principal: Sloop Network & Computer Consulting
> Voice: 503.251.0452 x82
> EMail: gregs at sloop.net
> http://www.sloop.net
> ---