[mrtg] Odd Behavior with mrtg + rrdtool

Steve Shipway s.shipway at auckland.ac.nz
Tue Jun 3 00:40:13 CEST 2008


Jonathan Williams spake thusly:
> A traffic generator is set to produce unidirectional traffic at a
> constant rate of 11.6Mbps, fed in to eth0 on my router, and looped back
> to the generator on eth1. I am polling ifInOctets and ifOutOctets on
> both interfaces on 2 second intervals through the management port of the
> router.
>
> The graph I would expect to see would be a flat line for ifInOctets on
> eth0 at 11.6Mbps, and a correlated flat line at 11.6Mbps for ifOutOctets
> on eth1.
>
> Predominately, this is what I see. However, once every 3-4 minutes there
> is a dip in the graph. The dip generally spans 1 or 2 intervals in
> length, and is on the order of magnitude of approximately 1/2 to 1/10th
> of the expected value. I have verified that this is not a dip in the
> actual traffic by querying the hardware directly on the router and
> observing that across the dips in the graph, the delta between the
> direct queries remains constant.

A two second interval is extremely short! 8-o

I would suggest you check the obvious first:
- Are you using SNMPv2?  If not, do so, if possible.
- Are you generating so much test traffic that the SNMP packets are being dropped?
- With a 2sec interval, the polling interval may well be shorter than the SNMP timeout plus retry time.  Any delay would then cause data to be skipped, and possibly interpolated or set to zero (do you have unknaszero set?)
- Maybe your MRTG server has slow disks that cannot keep up with the IO stream, so it needs to freeze occasionally to flush the output buffer and misses data polls?
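
To put a number on the timeout point above, here is a rough Python sketch that compares the worst-case time a single SNMP query can block against the polling interval.  The function names are made up for illustration, and the model (each attempt waits timeout * backoff**i seconds, with "attempts" counting every try including the first) is an assumption, not a description of what MRTG's SNMP code actually does.  The 2 second timeout and 5 attempts used in the example are illustrative values; check what your own configuration really uses.

```python
def worst_case_poll_seconds(timeout: float, attempts: int, backoff: float = 1.0) -> float:
    """Upper bound on how long one SNMP query can block.

    Assumed model: attempt i (starting at 0) waits timeout * backoff**i
    seconds before it is abandoned, and `attempts` counts every try.
    """
    return sum(timeout * backoff ** i for i in range(attempts))


def fits_in_interval(interval: float, timeout: float, attempts: int,
                     backoff: float = 1.0) -> bool:
    """True if even a fully timed-out query finishes before the next poll is due."""
    return worst_case_poll_seconds(timeout, attempts, backoff) < interval


if __name__ == "__main__":
    interval = 2.0  # the 2 second polling interval from the original post
    # Illustrative settings only: 2 s timeout, 5 attempts, no backoff.
    worst = worst_case_poll_seconds(timeout=2.0, attempts=5)
    print(f"worst case {worst:.1f}s against a {interval:.1f}s interval")
    print("fits" if fits_in_interval(interval, 2.0, 5) else "does not fit")
```

Under these assumed values a single lost packet already costs a whole polling interval, so one dropped query is enough to leave a gap in the data.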

I am guessing that the odd dips occur when the counter wraps around, or rather when the MRTG or RRD code thinks it /might/ have wrapped around.  Switching to SNMPv2 (which lets you poll the 64-bit counters) will make this less frequent and less likely, although with a 2sec poll a 32-bit counter will not genuinely wrap within one interval until you reach some crazy number of gigabits per second.  Maybe the MRTG wrap detection code gets a bit dodgy at these high poll frequencies?
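
The wraparound arithmetic is easy to check.  The Python sketch below (helper names are my own, and this is not MRTG's or RRD's actual code) works out how long an octet counter takes to wrap at a given bit rate, what rate would be needed to wrap within one polling interval, and how a simple wrap-aware delta is computed.  For a 32-bit counter at 11.6Mbps it gives roughly 49 minutes between genuine wraps, and about 17 gigabits per second to wrap inside a 2 second interval, so a real wrap every 3-4 minutes does not fit the numbers.

```python
COUNTER32_MOD = 2 ** 32  # ifInOctets / ifOutOctets are 32-bit counters
COUNTER64_MOD = 2 ** 64  # the ifHC* counters available with SNMPv2 are 64-bit


def seconds_to_wrap(rate_bps: float, modulus: int = COUNTER32_MOD) -> float:
    """Seconds for an octet counter to wrap at a constant bit rate."""
    return modulus * 8 / rate_bps


def wrap_rate_bps(interval_s: float, modulus: int = COUNTER32_MOD) -> float:
    """Bit rate needed for an octet counter to wrap once per interval."""
    return modulus * 8 / interval_s


def counter_delta(old: int, new: int, modulus: int = COUNTER32_MOD) -> int:
    """Octets counted between two samples, assuming at most one wrap."""
    return (new - old) % modulus


if __name__ == "__main__":
    rate = 11.6e6  # the constant test traffic rate from the original post
    print(f"32-bit wrap every {seconds_to_wrap(rate) / 60:.1f} minutes")
    print(f"wrap within 2s needs {wrap_rate_bps(2.0) / 1e9:.1f} Gbit/s")
    # A sample taken just after a wrap still yields the right delta.
    print(counter_delta(4294967290, 10))
```

Note that the modulo trick only gives the right answer if the counter wrapped at most once between samples, and it cannot tell a wrap from a counter reset, which is exactly the sort of ambiguity wrap-detection code has to guess about.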

If using SNMPv2 makes the dips disappear or occur less often, then it is probably a wraparound-detection error.  Similarly, if the dips disappear at lower poll frequencies, then it might be because the normalisation routines get upset when the buckets are so small.  I'd need to pore over the code for hours to deduce any possible misbehaviour when the interval is so small.

Hope this helps,

Steve


