[mrtg] Strange phantom spikes in graphs
Ed LaFrance
edl at colocationonline.info
Wed Sep 30 16:52:48 CEST 2009
Hello all -
I've got MRTG v2.15.0 running on various *nix (RH9, or CentOS 4.x /
5.x, depending upon the monitoring station), polling switches and
routers in two different data centers. In each DC, there is a core
router which takes a WAN connection to the Internet, and provides a
trunk uplink to an L2 switch, with user connections distributed from
the switch.
I set up MRTG monitoring of one such installation a few days ago,
and right away I began noticing strange traffic spikes appearing on
the WAN and trunk router interfaces: sharp, narrow spikes of 60Mbps
or more once or twice per day, while the router is routinely doing
< 5Mbps of traffic. I'd paste an image of the graph here, but I
expect the list server would strip it out. Here are a couple of
examples of the spike data from the log file:
1254271800 18829 59497 20672 60179
1254271500 5842811 55413 14227585 58490
1254271200 8450080 54782 14227585 60320
1254270900 48750 62805 90386 66483
....
1254231000 4863 2866 5591 3438
1254230700 4679 5802650 5959 14265849
1254230400 26779 8508793 57157 14265849
1254230100 36583 65895 57157 109155
1254229800 7168 3566 8047 4718
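For anyone who wants to check the numbers, the rows above can be converted to Mbps with a short script. This is only a minimal sketch, assuming the five columns follow the usual MRTG log layout (timestamp, average in, average out, maximum in, maximum out, with rates in bytes per second). The function names and the 40 Mbps threshold are made up for illustration and are not part of MRTG.

```python
# Sketch: convert MRTG log rows to Mbps and flag spike rows.
# Assumption: columns are timestamp, avg in, avg out, max in, max out,
# with the four rate columns in bytes per second.

SAMPLE = """\
1254271800 18829 59497 20672 60179
1254271500 5842811 55413 14227585 58490
1254271200 8450080 54782 14227585 60320
1254270900 48750 62805 90386 66483
"""

RATE_KEYS = ("avg_in", "avg_out", "max_in", "max_out")


def to_mbps(bytes_per_sec):
    """Convert bytes per second to megabits per second."""
    return bytes_per_sec * 8 / 1_000_000


def parse_rows(text):
    """Parse log rows, skipping anything that is not five integers."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 5 or not all(p.isdigit() for p in parts):
            continue
        ts, avg_in, avg_out, max_in, max_out = map(int, parts)
        rows.append({
            "ts": ts,
            "avg_in": avg_in,
            "avg_out": avg_out,
            "max_in": max_in,
            "max_out": max_out,
        })
    return rows


def find_spikes(text, threshold_mbps=40.0):
    """Return rows where any rate column exceeds the (hypothetical) threshold."""
    return [
        row for row in parse_rows(text)
        if max(to_mbps(row[k]) for k in RATE_KEYS) > threshold_mbps
    ]


if __name__ == "__main__":
    for row in find_spikes(SAMPLE):
        rates = ", ".join(f"{k}={to_mbps(row[k]):.1f}" for k in RATE_KEYS)
        print(row["ts"], rates)
```

Run against the first excerpt, this flags the two middle rows, whose average inbound rates work out to roughly 46.7 and 67.6 Mbps.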
The spikes are sometimes inbound, sometimes outbound, but only in one
direction at a time. They only appear on the graphs for the physical
Ethernet ports on the router; none of the router VLAN graphs for
users show corresponding spikes. Furthermore, the L2 managed switch
that is serviced by this router is monitored by the same MRTG
install; its uplink trunk port, which is connected to one of the
ports on the router, does not show spikes which correspond to the
spikes on the router port. This is what's led me to call these spikes
'phantom' - there's no evidence that they are real, beyond their
appearance specifically on the router Ethernet port graphs.
This router (and switches for that matter) is identical in hardware
and firmware to the two others that are being monitored, which show
no such behavior, so it seems safe to rule out counter or
communications issues on the router side for the moment. These
routers also have a built-in stats and graphing feature very similar
to MRTG. I enabled that on the router in question, and discovered the
router's own data shows no such corresponding spikes.
All three monitoring stations monitor all three router installations,
and only the data for this particular router has this issue. The only
known differences between this router and the other two: 1) it's in a
different facility/network (the other two are in the same
facility/network), and 2) this one sits inline between the WAN
connection and the L2 switch, while the other two have more of a
'router-on-a-stick' arrangement with relation to the switch.
I've spent some time Googling on this, and found a couple of other
people who were experiencing similar phenomena, but no answers. Does
anybody have any ideas?
Thanks in advance!
Ed