[mrtg] Re: MRTG Threshold Alerts over Time

Cornwell, Eric J. EJCORNWELL at coopertire.com
Mon Jun 16 15:10:29 MEST 2003


Dan,

Thanks for posting that script; it's exactly what I've been looking for.  I
do have one question, though.  You said, "as soon as I enabled ThreshDir in
the mrtg.cfg file, MRTG no longer triggered ThreshProgO when a threshold was
exceeded."  Why do you have to use ThreshDir?  You are using ThreshProgO
successfully without it, correct?  If that's the case, why can't you just add
the ThreshProgOKO value to your config and let it call the reset batch
script?

I'm also trying to use the batch file you sent but I'm running into
problems.  It seems to die after the first if statement.
Here is my test batch file:
---------------------------------------------
@echo off
if exist c:\test\thres.txt (
	goto code
) else ( 
	echo 0 > c:\test\thres.txt
)
:code
for /f %%a in (c:\test\thres.txt) do (
if "%%a"=="3" echo 0 > c:\test\thres.txt
if "%%a"=="3" echo Test is done!
if "%%a"=="2" echo 3 > c:\test\thres.txt
if "%%a"=="1" echo 2 > c:\test\thres.txt
if "%%a"=="0" echo 1 > c:\test\thres.txt
)
---------------------------------------------
The code at the top just verifies and creates the file if it's not there.  I
thought that might be something you were looking for.

Eric

-----Original Message-----
From: Dan Lowry [mailto:dan.lowry at attbi.com]
Sent: Saturday, June 14, 2003 7:24 PM
To: mrtg at list.ee.ethz.ch
Subject: [mrtg] MRTG Threshold Alerts over Time


There are times when utilization of a CPU, link, etc. spikes over a
threshold and you would prefer not to send an email right away, but instead
wait for x consecutive 5-minute poll cycles and then send the alert.
With the help of the batchworld group and windowsntscripting.com, we have come
up with something that seems to work on NT (see configuration below).

This example looks at CPU utilization on a Cisco router and sends an email
via blat after four 5-minute poll cycles (20 minutes) of router CPU
utilization over 70%.

There's just one issue with the program.
I tried to add a ThreshProgOKO batch file to reset the value in the file
to 0 when utilization drops back below the threshold before 4 contiguous
breaches.
Unfortunately, as soon as I enabled ThreshDir in the mrtg.cfg file, MRTG no
longer triggered ThreshProgO when a threshold was exceeded.
Looking in the docs, it seems to be related to MRTG only triggering one
threshold every hour. Is that correct? If so, is there any workaround?
I disabled ThreshDir, and now there's a possibility of having 1, 2, or 3
polls over threshold and then a return to normal, leaving a 1, 2, or 3
in the threshold.txt file.
The problem is that the next time CPU utilization is over threshold,
ThreshProgO will trigger sendmail.bat, which will send an email after 5, 10,
or 15 minutes, depending on the stale 3, 2, or 1 in the threshold.txt
file.
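
For reference, a reset hook of the kind described above would be configured
roughly as follows. This is only a sketch: reset.bat is a hypothetical script
name and the ThreshDir path is an assumption. As the MRTG reference describes
it, ThreshProgOKO relies on the state files kept in ThreshDir, and once
ThreshDir is set MRTG runs ThreshProgO only when the threshold boundary is
crossed rather than on every poll. That would fit the behaviour reported
here, and it would also mean a per-poll counter cannot advance while
ThreshDir is enabled.

```text
# Global keyword; the directory must be writable (path is an assumption).
ThreshDir: e:\mrtg\thresholds

ThreshMaxO[bu_2500_cpu]: 70
ThreshProgO[bu_2500_cpu]: e:\mrtg\thresholds\sendmail.bat
# reset.bat is hypothetical: it would write 0 back to threshold.txt.
ThreshProgOKO[bu_2500_cpu]: e:\mrtg\thresholds\reset.bat
```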

(FYI for anyone wanting to try this batch file: I had to create the
threshold.txt and threshold.tmp files initially and put a "0" in each.
I'm not sure why, but after that they worked fine.)

Here's the code that's working well (minus the ability to reset when the
threshold goes back to normal). Any suggestions to simplify the code, or to
get it to work without having to create the initial threshold.txt and
threshold.tmp files, are appreciated:

Here's the configuration in the MRTG file to trigger the alert after four MRTG
poll cycles of >70% CPU utilization
----------------------------------------------------------------------------
Target[bu_2500_cpu]: 1.3.6.1.4.1.9.2.1.58.0&1.3.6.1.4.1.9.2.1.58.0:x at x
WithPeak[bu_2500_cpu]: wmy
YLegend[bu_2500_cpu]: CPU Utilization
ShortLegend[bu_2500_cpu]: %
MaxBytes[bu_2500_cpu]: 100
Options[bu_2500_cpu]: nopercent, gauge,noo
Unscaled[bu_2500_cpu]: dwmy
Title[bu_2500_cpu]: Backup_2500 CPU Utilization
Colours[bu_2500_cpu]: GREEN#00eb0c,BLUE#1000ff,BLUE#1000ff,VIOLET#ff00ff
Legend1[bu_2500_cpu]: Average 1 minute CPU Utilization
Legend2[bu_2500_cpu]: Average 5 minute CPU Utilization
LegendI[bu_2500_cpu]:  CPU:
LegendO[bu_2500_cpu]:
ThreshMaxO[bu_2500_cpu]:70
ThreshDesc[bu_2500_cpu]: CPU Utilization on Backup 2500
ThreshProgO[bu_2500_cpu]:e:\mrtg\thresholds\sendmail.bat


Send an email using Blat if CPU utilization remains above 70% for 20 minutes
----------------------------------------------------------------------------

:sendmail.bat
@echo off
rem Arguments supplied by MRTG when the threshold is breached.
set mrtg_name=%1
set breach_value=%2
set actual_value=%3

rem threshold.txt holds the count of consecutive breaches so far (0-3).
rem On the fourth breach (count is 3) the mail is sent and the count resets.
for /f %%a in (e:\mrtg\thresholds\threshold.txt) do (
if "%%a"=="3" ECHO CPU Utilization on %1 has been %2 Percent for the last 20 Minutes! > e:\mrtg\thresholds\cpu_util.txt
if "%%a"=="3" ECHO Next EMAIL in 20 Minutes if problem persists. >> e:\mrtg\thresholds\cpu_util.txt
if "%%a"=="3" e:\mrtg\sendmail\blat.exe e:\mrtg\thresholds\cpu_util.txt -s "CPU Utilization on %1 has been %2 Percent for 20 minutes!" -t bozo at clown.com
if "%%a"=="3" echo 0 >e:\mrtg\thresholds\threshold.tmp
if "%%a"=="3" del e:\mrtg\thresholds\cpu_util.txt
if "%%a"=="2" echo 3 >e:\mrtg\thresholds\threshold.tmp
if "%%a"=="1" echo 2 >e:\mrtg\thresholds\threshold.tmp
if "%%a"=="0" echo 1 >e:\mrtg\thresholds\threshold.tmp
)
rem Write the new count back only after the loop has finished reading the file.
copy e:\mrtg\thresholds\threshold.tmp e:\mrtg\thresholds\threshold.txt
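
On the request for simplifications, one possible approach is sketched below.
It has not been tested on NT and is not a drop-in replacement. It keeps a
single counter file, creates it when it is missing, and increments the count
with set /a instead of one if line per value. The for /f loop in the script
above never runs its body when threshold.txt does not exist, which is
probably why the files had to be created by hand. The names counter_file,
limit and count and the labels :alert and :save are illustrative. The sketch
does not address resetting the count when utilization returns to normal.

```bat
@echo off
setlocal
rem Sketch only: the path and the limit of 4 polls mirror the script above.
set counter_file=e:\mrtg\thresholds\threshold.txt
set limit=4

rem Create the counter on first use so no manual setup is needed.
if not exist %counter_file% echo 0 > %counter_file%

rem Read the current count (first token of the line in the file).
set count=0
for /f %%a in (%counter_file%) do set count=%%a

rem This call is one more consecutive breach.
set /a count=%count%+1
if %count% GEQ %limit% goto alert
goto save

:alert
rem Send the mail here, for example with the blat call from sendmail.bat.
set count=0

:save
rem Store the new count for the next poll cycle.
echo %count% > %counter_file%
endlocal
```

Because the file is read completely before it is rewritten, the separate
threshold.tmp copy step is not needed in this variant.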
--
Unsubscribe mailto:mrtg-request at list.ee.ethz.ch?subject=unsubscribe
Archive     http://www.ee.ethz.ch/~slist/mrtg
FAQ         http://faq.mrtg.org    Homepage     http://www.mrtg.org
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi
