[mrtg] Re: ODD spike

Tue Mar 20 04:08:54 MET 2001

> Tuc wrote:
> 
> > 	We just had a really odd spike, wondered if this meant anything to
> > anyone.  It all seems to be from the same Cisco 2924M switch.  The
> > link back to the core router showed :
> > 
> > 985049700 339992 4378958 354785 4611961
> > 985049400 354107 4601147 354785 4611961
> > 985049100 346494 4476797 348507 4476797
> > 985048800 349676 4495566 363122 4711412
> > 985048500 1707548 5510773 16496234 14303751
> > 985048200 16496234 14303751 16496234 14303751
> > 985047900 12999523 12229704 16496234 14303751
> > 985047600 357571 4731227 357571 4731227
> > 985047300 356917 4731227 357571 4731227
> > 985047000 350143 4759958 358655 5090368
> > 985046700 359633 5081847 370392 5090368
> 
> Bummer you left out the first line of the log.  You have a rather
> steady 350kBps input and 460kBps output.  
>
	Yea, about that...
>
> Check if the current
> counter values (1st line) are roughly the same as
> ((current time as found on line 1) - 985048200) * 350000
> ((current time as found on line 1) - 985048200) * 460000
> 
Currently it's:

985056925 2804989262 2360130861
985056925 281777 3728789 281777 3728789
985056625 286803 3717211 286803 3717211
985056600 286082 3725401 286803 3815500
985056300 278959 3815500 287741 3815500
985056000 288540 3829267 297335 3980709

usage is down.... 
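
For what it's worth, here is a rough sketch of the suggested check in Python, using the numbers above. The 350000 and 460000 bytes/s rates are simply the estimates from the quoted reply (the log rows show output averages in the millions, so the output figure may be off), and the helper name is made up for illustration. It assumes the counter restarted from zero at the suspected reset time and ignores 32-bit counter wraparound.

```python
# Hypothetical helper: expected counter value if the counter restarted
# from zero at reset_time and then grew at a steady rate (bytes/s).
def expected_counter(now, reset_time, rate):
    return (now - reset_time) * rate

NOW = 985056925                           # timestamp on line 1 of the log
RESET = 985048200                         # suspected reset time (quoted reply)
IN_NOW, OUT_NOW = 2804989262, 2360130861  # raw counters on line 1

# Rates below are the estimates from the quoted reply, not measured here.
exp_in = expected_counter(NOW, RESET, 350000)
exp_out = expected_counter(NOW, RESET, 460000)

print("in : expected", exp_in, "actual", IN_NOW)
print("out: expected", exp_out, "actual", OUT_NOW)
```

If the actual counters land in the same ballpark as the expected ones, that would point toward a counter reset; if they are far off, some other explanation is more likely.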
>
> If so: counter reset (perhaps switch reset).
>
	Switch didn't reset :

	SWITCH uptime is 51 weeks, 2 days, 35 minutes
> 
> Time slot 985048200 to 985048500 is partially damaged, time slot
> 985047900 to 985048200 completely, and time slot 985047600 to 985047900
> partially.
>
	What would cause that?
> 
> Time slot 985047600 to 985047900 will be built partially from
> a proper rate around 357000 and partially from a bad rate 16496234.
> The parts are roughly 12999523/16496234*300 = 236 seconds bad and
> 300-236 = 64 seconds good. The spike is visible in three time slots
> so this indicates that you either missed two polls or that you are
> monitoring every 15 minutes on purpose.  The high rate was set
> somewhere near 985048560 and the last good value was near 985047660.
> This translates into 00:21:00 UTC time for the last good value and
> in 00:36:00 UTC time for the bad value.
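
The good/bad split of the 985047600 to 985047900 slot can be redone as a small Python sketch. The first function is the approximation used in the quoted reply (it ignores the contribution of the good rate); the second is a slightly refined variant that weights both rates, which is an addition here and not something from the thread. Function names are hypothetical, and 357571 is taken as the good rate from the preceding log row.

```python
SLOT = 300  # seconds per log slot

def bad_seconds_simple(avg, bad_rate, slot=SLOT):
    # Approximation from the quoted reply: treats the good rate as zero.
    return avg / bad_rate * slot

def bad_seconds_weighted(avg, good_rate, bad_rate, slot=SLOT):
    # Solve avg*slot = bad_rate*t + good_rate*(slot - t) for t.
    return slot * (avg - good_rate) / (bad_rate - good_rate)

avg, good, bad = 12999523, 357571, 16496234

simple = bad_seconds_simple(avg, bad)
weighted = bad_seconds_weighted(avg, good, bad)

print("simple  :", round(simple), "s bad,", round(SLOT - simple), "s good")
print("weighted:", round(weighted), "s bad,", round(SLOT - weighted), "s good")
```

Both variants put the change roughly a minute into the slot, so the conclusion in the reply does not depend on which one is used.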
> 
> Are you monitoring at hh:05, hh:20, hh:35, hh:50 ?
>

0,5,10,15,20,25,30,35,40,45,50,55 * * * * /usr/local/etc/mrtg/bin/mrtg /usr/local/etc/mrtg/mrtg.cfg 1> /dev/null 2>/dev/null

What's up with the missing polls?  I upgraded recently, and it seems noticeable
for those items where my target is something like:

Target[fred-cpu]: `/usr/local/bin/hostmon2mrtg fred.ttsg.com CPUidle %idle`

It misses polls and records "0" instead of just carrying the last
number forward.  I think this has always happened, but I never knew it.

Anyway.... 3 missed polls?!
> 
> > 	And 3 systems showed :
> > 
> > 985048500 10 68 10 69
> > 985048200 859289 466313 10741005 5828159
> > 985047900 8771822 4759676 10741005 5828159
> > 985047600 12 74 12 74
> 
> > 985048500 8 68 15 72
> > 985048200 824350 902356 10304208 11278624
> > 985047900 8758578 9586841 10304208 11278624
> > 985047600 14 76 14 76
> 
> > 985048500 42679 21037 43941 21870
> > 985048200 87357 927589 760779 11458268
> > 985047900 676256 10122472 760779 11458268
> > 985047600 36302 8592 36302 8592
> 
> These get monitored every 5 minutes... The reset happened around
> TZ="" perl -e 'use POSIX;print ctime((985048200+985047900)/2);'
> Tue Mar 20 00:27:30 2001
>
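
The same timestamp conversion as the Perl one-liner above, sketched in Python for anyone who wants to check the other timestamps mentioned in this thread. The helper name is made up; only the standard library is used.

```python
from datetime import datetime, timezone

def utc(ts):
    # Render a Unix timestamp as a UTC date and time string.
    return datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d %H:%M:%S")

# Midpoint of the two damaged 5-minute slots, as in the Perl one-liner.
midpoint = (985048200 + 985047900) // 2

print("midpoint   :", utc(midpoint))
print("last good  :", utc(985047660))
print("first bad  :", utc(985048560))
```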
	As I said, "SWITCH uptime is 51 weeks, 2 days, 35 minutes".
> 
> > 	Any ideas/thoughts??
> 
> This ought to be enough :)
> Hope I didn't make a mistake, it's rather early here (1 hour east of UTC)
> 

	Thanks, Tuc/TTSG Internet Services, Inc.
