[smokeping-users] Edgetrigger and repeating alert notifications
Gregory Sloop
gregs at sloop.net
Tue Feb 23 23:16:37 CET 2016
Hi, I've had problems with repeating alert notifications for a target that has been completely down the whole time; it is shut down.
My Alerts pattern section is below. I get repeated 'someloss' triggers for this particular node although I've set edgetrigger, so I shouldn't have got a second notification; it seems to just continue.
I've read the Smokeping config manual, but I'm not sure whether my patterns below are wrong or whether I am toggling the pattern unexpectedly.
I have restarted my Smokeping with the init script. I'm not sure if that means that all states are forgotten and the email will be resent?
The alert emails are all for the same target and look the same. I get:
pattern: >0%,>5%,>=5%
Loss: S, 100%, 100%, 100%
RTT: S, U, U, U
Please advise on this edge trigger and on whether I have some faulty configuration below.
Standard Smokeping 2.6.8 Ubuntu 14.04 package
+someloss
type = loss
edgetrigger = yes
pattern = >0%,>5%,>=5%
comment = We've got loss 3 times in a row over the past 15min
+hostdown
type = loss
edgetrigger = yes
pattern = ==0%,==0%,==0%, ==U
comment = host down!
Thank you, William
1)
>pattern = >0%,>5%,>=5%
>comment = We've got loss 3 times in a row over the past 15min
That's a *greater* than 0% loss sample, followed by one that's greater than 5%, and then a third immediately following that's >= 5%.
It's not 3 samples of >=5% over 15m, as your comment line says. I suspect it's just the comment that's wrong, and the pattern is actually as you intend.
Wouldn't this produce continuous matches? Against a host sitting at 100% loss, every new window of three samples satisfies it.
Do you perhaps mean something like <5%,>5%,>5%? [I tend to use < and >, not ==, since I'd consider it "up" if it were 5% or less, not just zero percent loss.]
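To make the difference between the two patterns concrete, here is a rough Python sketch. It is a simplified model and not Smokeping's actual matcher: it assumes each pattern element is compared against one sample, with the last element lined up on the newest sample, and it only understands plain "operator, number, %" elements (no U, S or wildcards). The function names are made up for this illustration.

```python
import operator
import re

# Comparison operators supported by this simplified model.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "==": operator.eq}


def parse_pattern(text):
    """Turn '>0%,>5%,>=5%' into [('>', 0.0), ('>', 5.0), ('>=', 5.0)]."""
    elements = []
    for raw in text.split(","):
        m = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*(\d+(?:\.\d+)?)%\s*", raw)
        if m is None:
            raise ValueError(f"unsupported element: {raw!r}")
        elements.append((m.group(1), float(m.group(2))))
    return elements


def match_series(pattern_text, losses):
    """For each sample, report whether the pattern matches the window
    of most recent samples ending at that sample."""
    elements = parse_pattern(pattern_text)
    n = len(elements)
    results = []
    for i in range(len(losses)):
        window = losses[max(0, i - n + 1):i + 1]
        ok = len(window) == n and all(
            OPS[op](loss, limit)
            for (op, limit), loss in zip(elements, window))
        results.append(ok)
    return results


# Two good samples, then the host goes down and stays down.
losses = [0, 0, 100, 100, 100, 100, 100]
print(match_series(">0%,>5%,>=5%", losses))
print(match_series("<5%,>5%,>5%", losses))
```

In this model the first pattern keeps matching for as long as the host stays down, so only edgetrigger stands between it and a notification on every polling round. The second pattern matches once, at the transition. Note that it needs one good sample first, so a host that was never up would not match it at all.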
---
2) >I have restarted my Smokeping with the init script. I'm not sure if that means that all states are forgotten and the email will be resent?
I don't believe Smokeping keeps alert state between restarts. [Nagios does, I think.] So a restart of SP may well generate a new set of alerts, depending on your alert conditions, even though the state hasn't changed.
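A small Python sketch of why that would happen. It models my belief above, not Smokeping's verified behaviour: the edge-trigger state lives only in memory, so a restart starts again from "not matched". It only models the rising edge (whether a notification also goes out when the alert clears is not covered), and the class and function names are hypothetical.

```python
class EdgeTrigger:
    """Notify only when the pattern goes from 'no match' to 'match'."""

    def __init__(self):
        # Kept in memory only; a fresh instance has forgotten everything.
        self.previous = False

    def update(self, matched):
        notify = matched and not self.previous
        self.previous = matched
        return notify


def notifications(match_states, restart_before=()):
    """Return, per polling round, whether a notification would be sent.

    restart_before lists the round indexes before which the daemon is
    restarted (simulated by throwing the trigger state away).
    """
    trigger = EdgeTrigger()
    sent = []
    for i, matched in enumerate(match_states):
        if i in restart_before:
            trigger = EdgeTrigger()  # simulated restart: state is lost
        sent.append(trigger.update(matched))
    return sent


always_matching = [True] * 6  # host down, pattern matches every round
print(notifications(always_matching))
print(notifications(always_matching, restart_before={3}))
```

Without a restart there is a single notification at the start. With a restart before round 3, the still-matching pattern looks like a fresh edge and a second notification goes out.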
---
I've always been fairly frustrated with Smokeping's alerts. [It's just not that good at alerts, but it does do its core work really well, so I'll live with the alert failings.] What is good at alerts is Nagios.
I know I say this in almost every discussion about SP, but really consider using Nagios to handle alerts. There's a Smokeping plug-in for Nagios, and Nagios can monitor a bunch of other things well too.
I use the Smokeping alert pipe to run an MTR on the target to document the whole chain, where the problems are, and their severity. In that case I don't find it screwing up the edgetrigger operation and generating too many MTRs, so I'm 99% sure edgetrigger is operating as designed. [I'm using the Ubuntu/Debian packaged version too.]
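For anyone wanting to try the pipe approach, here is a rough Python sketch of such a script; it is not the one I actually run. It assumes the alert's "to" setting points at it as a pipe target and that Smokeping passes alert name, target, loss pattern, RTT pattern and hostname as the first five arguments. Please check that order against the smokeping_config documentation for your version. The log path and function names are placeholders.

```python
#!/usr/bin/env python3
import subprocess
import sys
import time

LOG_PATH = "/tmp/smokeping-mtr.log"  # placeholder path, adjust to taste


def build_mtr_command(host, cycles=10):
    """Build an mtr command line that prints a one-shot report."""
    return ["mtr", "--report", "--report-cycles", str(cycles),
            "--no-dns", host]


def main(argv):
    # ASSUMED argument order; verify against your Smokeping version.
    alert_name, target, loss_pattern, rtt_pattern, hostname = argv[:5]
    try:
        result = subprocess.run(build_mtr_command(hostname),
                                capture_output=True, text=True, timeout=120)
        report = result.stdout or result.stderr
    except (OSError, subprocess.TimeoutExpired) as exc:
        report = f"mtr could not be run: {exc}\n"
    with open(LOG_PATH, "a") as log:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S")
        log.write(f"{stamp} alert={alert_name} target={target} "
                  f"loss={loss_pattern} rtt={rtt_pattern}\n{report}\n")
    return 0


# Only act when called with the arguments Smokeping is assumed to pass.
if __name__ == "__main__" and len(sys.argv) >= 6:
    sys.exit(main(sys.argv[1:]))
```

Because the script runs once per notification, the number of MTR reports in the log is also a cheap way to check how often the alert really fires.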
When you have problems with alerts try:
Simpler alert patterns. It's terribly easy to get "tricky" with patterns and then find they don't work the way you expected. Use greater-than/less-than expressions rather than equality. Keep them short and as uncomplicated as possible. A simple pattern is a lot easier to troubleshoot.
HTH
-Greg