[smokeping-users] Alert script, with MTR
gregs at sloop.net
Thu Jun 26 21:58:39 CEST 2014
So, I've changed the thread title...
A few updates.
I didn't think it was load, so I tried running the Alert/MTR script *by hand/manually*, while smokeping and nagios are doing their thing - just to test what load was and what the effect was.
I ran about 15 alerts/MTR runs in quick succession - all while smokeping and nagios were also running doing their work.
Load does peak higher than I suspected - at ~2 for the 1 min average - but those queries complete fairly quickly and load drops back to around 0.3-0.4. [and this is way more load from the MTR script than should have been occurring in the automated runs I was doing before.]
However, even with the much higher load, there were no drops in writing the smokeping RRD's and Nagios doesn't complain about them.
So, I think it's safe to say that it's not a load issue - for some reason, when smokeping runs the "alert" script, it has to wait for that script to finish before it goes on to do anything more, and this causes the other issues.
So, I also tried appending a "&" to the smokeping alert line in the config - in the hopes that it would run the process in the background. No luck. [I'd guess it places the "&" before the passed arguments and the script doesn't get any of the passed arguments it needs.]
I thought about creating a script that would run a second script and append the "&" to it, and run it.
"MTR-Create" [a (bash?) script] - would take the arguments it was passed from smokeping [you'd call MTR-Create from the smokeping alert]
MTR-Create would simply take its arguments and call the "regular" MTR/Alert, passing along those arguments and appending "&" at the end to run it in the background.
I suspect I can struggle my way through doing that - but does any BASH guru know how best to do that, offhand? It could save me a lot of poking, trial and error! :)
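A minimal sketch of what such an "MTR-Create" wrapper might look like, for illustration only. The path in ALERT_SCRIPT and the helper name run_detached are assumptions, not anything taken from smokeping or from the script linked in this thread. The idea is just to pass every argument through unchanged with "$@", start the real script in the background, and detach its stdin/stdout/stderr so the caller has nothing left to wait on.

```shell
#!/bin/bash
# MTR-Create: hypothetical wrapper that hands its arguments to the real
# alert/MTR script and returns immediately.

# ASSUMPTION: location of the real alert script; adjust to your install.
ALERT_SCRIPT="${ALERT_SCRIPT:-/usr/local/bin/smokemtr.py}"

run_detached() {
    # "$@" expands to the command plus each argument as its own word,
    # so arguments containing spaces arrive intact.
    # stdin/stdout/stderr are pointed at /dev/null so no pipe back to the
    # caller stays open; the trailing & puts the job in the background.
    nohup "$@" </dev/null >/dev/null 2>&1 &
}

# Only dispatch when we were actually given arguments (i.e. called by an alert).
if [ "$#" -gt 0 ]; then
    run_detached "$ALERT_SCRIPT" "$@"
fi
```

Saved as an executable file and called from the smokeping alert in place of the real script, the wrapper should exit as soon as the background job is started, while the MTR run and the mail carry on by themselves. Whether that is enough to stop smokeping from blocking depends on how it invokes the alert program, so treat this as something to test rather than a confirmed fix.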
How many alerts are firing when your box starts to bog? I have been running my fork of the mtr script for several months now with no issues. Matter of fact, I am now working on an expanded version that will dump the mtr's into mysql for easy access for our NOC. Currently, I just have the script appending a file in /var/log with each mtr.
Could you be pushing the box you are running from too hard?
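For readers wondering what "appending a file in /var/log with each mtr" could look like, here is a rough sketch. It is not the fork described above; the log path, the variable names MTR_BIN and MTR_LOG, the function name log_mtr, and the cycle count of 5 are all assumptions. It relies only on mtr's --report and --report-cycles options.

```shell
#!/bin/bash
# Hypothetical helper: append one timestamped mtr report per call to a log file.

# ASSUMPTIONS: mtr is on PATH and the log path is writable by the caller.
MTR_BIN="${MTR_BIN:-mtr}"
MTR_LOG="${MTR_LOG:-/var/log/smokeping-mtr.log}"

log_mtr() {
    target="$1"
    {
        # Header line so individual reports can be told apart in the log.
        printf '=== %s mtr to %s ===\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$target"
        # --report prints a summary and exits; 5 cycles is an arbitrary choice.
        "$MTR_BIN" --report --report-cycles 5 "$target"
    } >> "$MTR_LOG" 2>&1
}
```

Calling log_mtr with the alerting host as its argument adds one report to the end of the log; fewer cycles make each run shorter, which matters if many alerts fire at once.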
On Wed, Jun 25, 2014 at 8:12 PM, Gregory Sloop <gregs at sloop.net> wrote:
FP> On 21.02.2014 06:42, Philip Wehunt wrote:
>> I could hackishly work around this in my python but I wanted to
>> identify if I am doing something wrong on the SP side or if it is a
>> bug. Mainly in the spirit of KISS. I don't like to let hackish
>> scripts linger.
FP> You probably found the same script on gist, but here's my version
FP> which doesn't fail when the 6th arg is missing. It will not add "
FP> cleared" to the subject without the arg, but it will send you the report.
FP> : https://git.server-speed.net/users/flo/bin/tree/smokemtr.py
FP> From the documentation in smokeping_config I'd say this is a bug, but
FP> given I get my mails I didn't bother fixing it yet.
First, thanks for the script. I've had to mod it a bit - my MTR isn't quite the same as yours and I want to use a non-local SMTP server and port - but those were easy mods. [MTR is in a different spot too, again easy mod.]
So, I'm very excited about the prospects of automated mtr stats when a smokeping alert gets triggered - however I run into a substantial snag.
I use a 60s poll in smokeping, and if I get a bunch of [smokeping] alerts that kick off, then, when each MTR takes a while to run, it stalls smokeping.
This causes a ripple-effect, and a raft of nagios alerts...since I use a smokeping nagios plug-in. When SP stalls [running the mtr's] the RRD's go dry, and then nagios starts alerting on an "unknown" target state. ["This RRD hasn't been written to in 180s" etc.]
So, is there some way I can fork off the mtr script, and allow smokeping to continue while the mtr stats are gathered and a report sent?
[This is something I'm woefully un-knowledgeable about...]
Gregory Sloop, Principal: Sloop Network & Computer Consulting
Voice: 503.251.0452 x82
EMail: gregs at sloop.net