[mrtg] Re: bug corrupting logfiles?

Paul C. Williamson pwilliamson at mandtbank.com
Mon Jul 2 17:07:57 MEST 2001


Say you've got mrtg running on 3 configs...

MRTG starts up and updates cfg 1.  It takes about 3 minutes. 
MRTG starts to update cfg 2.
2 minutes pass by...
MRTG is called from cron to start again, so it updates cfg 1.
MRTG (second instance) sees there is a lockfile on cfg 2, so it moves on to cfg 3.  It updates cfg 3 and exits out.
MRTG (first instance) has no idea that the second instance has already run, so it tries to update cfg 3.  But cfg 3 has already been updated by the second instance, so the timestamp the first instance holds is earlier than the one in the log file.  It decides something is wrong, spits out the error below and exits out.

Sounds like you need to do something to split the load or just plain decrease it somehow.  Either switch to RRDTool, split up the cfg files and run them at different times, figure out how long each cfg file takes to update and adjust the cron schedule accordingly, or look in the archives for other solutions not mentioned here...
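
One more way to keep two runs from stepping on each other is to have cron call a small wrapper that refuses to start while the previous run is still going.  The sketch below is only an illustration of that idea, not something MRTG ships: run_exclusively and update_all are made-up names, the lock path and cfg paths are placeholders, and it assumes a Unix system where Python's fcntl.flock is available.

```python
import fcntl
import subprocess


def run_exclusively(lock_path, action):
    """Run action() only if nobody else holds a lock on lock_path.

    Returns True if action ran, False if another run was still in progress.
    """
    with open(lock_path, "w") as lock_file:
        try:
            # Non-blocking exclusive lock: give up at once instead of queueing.
            fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return False
        action()
        return True  # leaving the with-block closes the file and drops the lock


def update_all(configs, mrtg="mrtg"):
    """Run MRTG once per cfg file, one after another (hypothetical helper)."""
    for config in configs:
        subprocess.run([mrtg, config], check=False)


# Example entry point for cron (all paths are placeholders):
# run_exclusively("/tmp/mrtg-cron.lock",
#                 lambda: update_all(["/etc/mrtg/cfg1.cfg", "/etc/mrtg/cfg2.cfg"]))
```

With this, a run that overlaps a slow one is skipped entirely, so a second instance never updates cfg 3 behind the first one's back.  It does not make the slow run any faster, so splitting or reducing the load is still worth doing.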

Or, I could be barking up the wrong logical tree....

Paul

>>> Håkan Lindholm <hakan at spray.se> 06/26/01 11:50AM >>>

I often get this kind of error from MRTG (version 2.9.10 on Solaris 8):

Rateup ERROR: /usr/local/mrtg-2/bin/rateup found cat5500.net.458's log file was corrupt
          or not in sorted order:
time: 991123200.Rateup WARNING: /usr/local/mrtg-2/bin/rateup could not read the primary log file for cat5500.net.458
Rateup ERROR: /usr/local/mrtg-2/bin/rateup found cat5500.net.458's log file was corrupt
          or not in sorted order:
time: 991123200.Rateup WARNING: /usr/local/mrtg-2/bin/rateup The backup log file for cat5500.net.458 was invalid as well
WARNING: rateup died from Signal 0
 with Exit Value 1 when doing router 'cat5500.net.458'
 Signal was 0, Returncode was 1


Until now, I have solved them by deleting the corrupt logfiles, but today I
took a deeper look into the problem. Look what I found:

While looking through a log file (not the same one as mentioned above), I
bumped into this strange line:

993452700 150 36 151 37
993452400 152 35 238 53
993452 77 496 79
993451500 90 26 161 58
993451200 165 58 348 66


I think this command will be a correct syntax checker.

# egrep -n -v "^9[2-9][0-9][0-9][0-9][0-9][0-9]00 [0-9]+ [0-9]+ [0-9]+ [0-9]+$" cat5500.net.458.log

The output should be just the first three rows from the top of the file, but
now I get things like this:

1:993559039 103540428 33025905
2:993559039 75 89 75 89
3:993558243 0 0 0 0
729:993151800993153600 55 13 62 21

... and ...

1:993559041 1703910136 2262815225
2:993559041 2083019 2766277 2083019 2766277
3:993558223 0 0 0 0
246:993485400 76002 75884 83442 758964800 70862 71265 83480 71931
1245:9919872001600 0 0 0 0



Also, sometimes, it looks like this, but I think there is a fix for this
case in 2.9.11:

1:993558863 -1 -1
2:993558863 0 0 0 0
3:993558263 0 0 0 0
2537:
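
The same kind of check can be written as a short script that also looks at ordering, since rateup complains about files that are "not in sorted order".  This is only a rough sketch: check_log is a made-up helper, not part of MRTG, and the layout it assumes (line 1 is a timestamp plus two counters, every later line is a timestamp plus four values, newest entry first) is inferred from the samples in this message.

```python
import re

# Assumed layout, inferred from the samples above:
#   line 1:      "timestamp counter counter"
#   later lines: "timestamp value value value value", newest first
HEADER = re.compile(r"[0-9]+ [0-9]+ [0-9]+")
RECORD = re.compile(r"[0-9]+ [0-9]+ [0-9]+ [0-9]+ [0-9]+")


def check_log(lines):
    """Return (line_number, reason) for every suspicious line."""
    problems = []
    previous = None  # timestamp of the last line that looked sane
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        pattern = HEADER if number == 1 else RECORD
        if not pattern.fullmatch(line):
            problems.append((number, "malformed"))
            continue
        timestamp = int(line.split()[0])
        # Entries run newest to oldest, so a timestamp must never increase.
        if previous is not None and timestamp > previous:
            problems.append((number, "out of order"))
            continue
        previous = timestamp
    return problems


sample = [
    "993559039 103540428 33025905",
    "993559039 75 89 75 89",
    "993558243 0 0 0 0",
    "993452700 150 36 151 37",
    "993452400 152 35 238 53",
    "993452 77 496 79",
    "993451500 90 26 161 58",
]
print(check_log(sample))  # prints [(6, 'malformed')]
```

Unlike the egrep pattern, this does not flag the first three lines, and a fused timestamp such as 993151800993153600 shows up as "out of order" rather than as a pattern mismatch.  To check a real file, pass the open file object to check_log.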


Thanks in advance for any input on this!

/H

-- 
                                          SPRAY NETWORK SERVICES AB
       System, Network and Security Architecture and Administration
              for Lycos Europe (http://pressroom.lycos.de/english/)
*  S o l a r i s  *  I O S  *  L i n u x  *  W i n d o w s   N T  *

--
Unsubscribe mailto:mrtg-request at list.ee.ethz.ch?subject=unsubscribe 
Archive     http://www.ee.ethz.ch/~slist/mrtg 
FAQ         http://faq.mrtg.org    Homepage     http://www.mrtg.org 
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi 


