[mrtg] Strange phantom spikes in graphs

Ed LaFrance edl at colocationonline.info
Wed Sep 30 16:52:48 CEST 2009


Hello all -

I've got MRTG v2.15.0 running on various *nix (RH9, or CentOS 4.x / 
5.x, depending upon the monitoring station), polling switches and 
routers in two different data centers. In each DC, there is a core 
router which takes a WAN connection to the Internet, and provides a 
trunk uplink to an L2 switch, with user connections distributed from 
the switch.

I set up MRTG monitoring of one such installation a few days ago, 
and right away I began noticing strange traffic spikes appearing on 
the WAN and trunk router interfaces: sharp, narrow spikes of 60Mbps 
or more once or twice per day, while the router is routinely doing 
< 5Mbps of traffic. I'd paste an image of the graph here, but I 
expect the list server would strip it out. Here are a couple of 
examples of the spike data from the log file:

1254271800 18829 59497 20672 60179
1254271500 5842811 55413 14227585 58490
1254271200 8450080 54782 14227585 60320
1254270900 48750 62805 90386 66483
....
1254231000 4863 2866 5591 3438
1254230700 4679 5802650 5959 14265849
1254230400 26779 8508793 57157 14265849
1254230100 36583 65895 57157 109155
1254229800 7168 3566 8047 4718
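
To put those rows in the same units as the graphs, here is a minimal Python sketch that converts them to Mbps and flags the samples above the usual traffic level. It assumes the standard MRTG log layout (timestamp, average in, average out, max in, max out, with rates in bytes per second). The 5 Mbps threshold and the names parse_row, to_mbps and find_spikes are illustrative only and are not part of MRTG.

```python
# Rows copied from the log excerpt above (the "...." separator is omitted).
LOG_ROWS = """\
1254271800 18829 59497 20672 60179
1254271500 5842811 55413 14227585 58490
1254271200 8450080 54782 14227585 60320
1254270900 48750 62805 90386 66483
1254231000 4863 2866 5591 3438
1254230700 4679 5802650 5959 14265849
1254230400 26779 8508793 57157 14265849
1254230100 36583 65895 57157 109155
1254229800 7168 3566 8047 4718
"""


def parse_row(line):
    """Split one log row into named integer fields.

    Assumed column order: timestamp, avg in, avg out, max in, max out,
    with all rates in bytes per second.
    """
    ts, avg_in, avg_out, max_in, max_out = (int(x) for x in line.split())
    return {"ts": ts, "avg_in": avg_in, "avg_out": avg_out,
            "max_in": max_in, "max_out": max_out}


def to_mbps(bytes_per_sec):
    """Convert bytes per second to megabits per second."""
    return bytes_per_sec * 8 / 1_000_000


def find_spikes(text, threshold_mbps=5.0):
    """Return (timestamp, direction, Mbps) for each average rate above the threshold."""
    spikes = []
    for line in text.splitlines():
        if not line.strip():
            continue
        row = parse_row(line)
        for direction in ("in", "out"):
            rate = to_mbps(row["avg_" + direction])
            if rate > threshold_mbps:
                spikes.append((row["ts"], direction, round(rate, 2)))
    return spikes


if __name__ == "__main__":
    for ts, direction, rate in find_spikes(LOG_ROWS):
        print(ts, direction, rate, "Mbps")
```

Run against the excerpt, this flags four samples: two consecutive inbound samples in the first group and two consecutive outbound samples in the second, each in one direction only.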

The spikes are sometimes inbound, sometimes outbound, but only in one 
direction at a time. They only appear on the graphs for the physical 
Ethernet ports on the router; none of the router VLAN graphs for 
users show corresponding spikes. Furthermore, the L2 managed switch 
that is serviced by this router is monitored by the same MRTG 
install; its uplink trunk port, which is connected to one of the 
ports on the router, does not show spikes which correspond to the 
spikes on the router port. This is what's led me to call these spikes 
'phantom' - there's no evidence that they are real, beyond their 
appearance specifically on the router Ethernet port graphs.

This router (and its switch, for that matter) is identical in hardware 
and firmware to the two others that are being monitored, which show 
no such behavior, so it seems safe to rule out counter or 
communications issues on the router side for the moment. These 
routers also have a built-in stats and graphing feature very similar 
to MRTG. I enabled that on the router in question, and discovered the 
router's own data shows no such corresponding spikes.

All three monitoring stations monitor all three router installations, 
and only the data for this particular router has this issue. The only 
known differences between this router and the other two: 1) it's in a 
different facility/network (the other two are in the same 
facility/network), and 2) this one sits inline between the WAN 
connection and the L2 switch, while the other two have more of a 
'router-on-a-stick' arrangement in relation to the switch.

I've spent some time Googling on this, and found a couple of 
other people who were experiencing similar phenomena, but no 
answers. Does anybody have any ideas?

Thanks in advance!

Ed
