I was running an older version, and updating has definitely improved the speed. I also changed the other options everyone mentioned. <br><br>That still doesn't seem to address the vulnerability to changes made out in the field when the configs aren't updated immediately. I just wish there were some way to make the system more fault-tolerant, instead of having to split it into more, smaller configs to work around the problem.<br>
<br>I'll keep working on it, and if I find anything I'll report back here. =)<br><br>Thanks for the help.<br><br>Brad<br><br><div class="gmail_quote">On Thu, Apr 17, 2008 at 12:38 PM, Daniel J McDonald &lt;<a href="mailto:dan.mcdonald@austinenergy.com">dan.mcdonald@austinenergy.com</a>&gt; wrote:<br>
<blockquote class="gmail_quote" style="border-left: 1px solid rgb(204, 204, 204); margin: 0pt 0pt 0pt 0.8ex; padding-left: 1ex;"><div class="Ih2E3d">On Thu, 2008-04-17 at 11:39 -0500, Brad Lodgen wrote:<br>
> Hi everyone,<br>
><br>
> I'm running a master config with hundreds of include lines and<br>
> thousands of targets.<br>
<br>
</div>Ditto.<br>
<div class="Ih2E3d"><br>
> This type of setup is vulnerable to errors in config files and/or<br>
> changes made in the field not being immediately updated within the<br>
> configs. If there are a few errors or changes out in the field to<br>
> ports causing them to become 'unpollable', it causes the MRTG polling<br>
> interval to go over five minutes because it's retrying those<br>
> interfaces.<br>
<br>
</div>What version are you running? Dead host detection got noticeably better<br>
in 2.15.1.<br>
<div class="Ih2E3d"><br>
<br>
> At the moment, with only about 30 error lines in my log(equating to<br>
> about 15 interfaces/targets), it's causing MRTG to take 7-9 minutes to<br>
> complete polling.<br>
<br>
</div>How many forks are you running? More forks will help. I also limit<br>
retries, e.g.:<br>
Target[random-router.example.com.cpu1]: cpmCPUTotal5secRev.1&amp;cpmCPUTotal1minRev.1:public@random-router.example.com::2:1:1:3<br>
<br>
::2:1:1 reads as "try twice; wait 1 second after the first attempt, and<br>
add a second for each subsequent attempt", so I wait a maximum of 3<br>
seconds per target. The default is 3 polls with a 10-second timer, or 30 seconds.<br>
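<br>
A rough sketch of that arithmetic in Python. It follows the reading of the fields given in this message, not the MRTG reference, so treat the field meanings as an assumption. The name worst_case_wait and its parameters are made up for illustration, the 15 dead targets come from the original post, and it assumes dead targets are retried one after another in a single process.<br>
<br>
```python
def worst_case_wait(attempts: int, first_wait: int, step: int) -> int:
    """Seconds spent on one dead target: each attempt waits `step` longer than the last."""
    return sum(first_wait + i * step for i in range(attempts))


# "::2:1:1" as read above: two attempts, 1 second first, one more second each time.
tuned = worst_case_wait(attempts=2, first_wait=1, step=1)

# Default as described above: three polls with a flat 10 second timer.
default = worst_case_wait(attempts=3, first_wait=10, step=0)

dead_targets = 15  # figure quoted from the original post

print("per dead target:", tuned, "s tuned,", default, "s default")
print("all dead targets:", dead_targets * tuned, "s tuned,", dead_targets * default, "s default")
```
<br>
Under those assumptions, 15 dead targets cost about 450 seconds at the default settings and about 45 seconds with the shorter ones. Running more forks should cut that further.<br>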
<div class="Ih2E3d"><br>
<br>
> As this is a very small percentage compared to the total amount of<br>
> targets being polled, I'm trying to figure out a way to get around<br>
> this, if possible, or at least to minimize the effects.<br>
><br>
> Is anyone else running a system like this or does anyone have<br>
> suggestions to try?<br>
<br>
</div>Yes. Current code. Plentiful forks. Short timeouts.<br>
<br>
None of that helps with one other problem I have. If an Include: line<br>
points to a file that doesn't exist (it happens, particularly since I generate the<br>
master file from a script reading a database...) then the whole thing<br>
just stops. I would like a "try to include" option that looks for the<br>
file but, if it can't find it, still processes the other 471 include<br>
files...<br>
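<br>
Until something like a "try to include" option exists, one possible workaround is to have the generating script skip missing files itself. A minimal Python sketch, assuming the script already has the list of candidate file paths. The names split_includes and render_master are hypothetical; only the Include: keyword comes from MRTG.<br>
<br>
```python
import os


def split_includes(paths):
    """Split candidate include files into (present, missing) lists."""
    present, missing = [], []
    for path in paths:
        (present if os.path.isfile(path) else missing).append(path)
    return present, missing


def render_master(paths):
    """Return the Include: lines for files that exist, plus the skipped paths."""
    present, missing = split_includes(paths)
    text = "".join(f"Include: {path}\n" for path in present)
    return text, missing
```
<br>
The caller can write the returned text into the master file and log the skipped paths, so one missing file no longer stops the whole run.<br>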
<br>
I know, I know, I should just write it and submit the code.... Maybe in<br>
August I might have a few days...<br>
<br>
--<br>
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX<br>
Austin Energy<br>
<a href="http://www.austinenergy.com" target="_blank">http://www.austinenergy.com</a><br>
<br>
_______________________________________________<br>
mrtg mailing list<br>
<a href="mailto:mrtg@lists.oetiker.ch">mrtg@lists.oetiker.ch</a><br>
<a href="https://lists.oetiker.ch/cgi-bin/listinfo/mrtg" target="_blank">https://lists.oetiker.ch/cgi-bin/listinfo/mrtg</a><br>
</blockquote></div><br>