[rrd-users] bogus spikes with Infiniband traffic

Andy Riebs andy.riebs at hp.com
Mon Oct 23 21:47:58 MEST 2006

Our Voltaire Infiniband switches (and perhaps others?) don't wrap their
traffic counters back to zero when they hit their maximum value.
Instead, they simply report "overflow" until a procedure is run to reset
them to zero.

On one of our mid-sized clusters, a handful of port counters are already
overflowing after a couple of hours, and more enter that state as the
day goes on.

To keep useful statistics generally available, we have a procedure that
resets the counters to zero once a day. Unfortunately, when we reset the
counters, rrdtool assumes that the counters have wrapped, and infers
very large traffic bursts through little-used ports.
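
To make the arithmetic behind those bursts concrete, here is a minimal Python sketch of wrap-assuming delta logic. It is modelled on the documented behaviour of rrdtool's COUNTER type (a decrease is treated as a wrap at the 32-bit boundary, then at the 64-bit boundary if the result is still negative); it is not rrdtool's actual code. The function names and sample values are made up for illustration.

```python
# Illustrative only: a simplified model of wrap-assuming counter deltas.
WRAP_32 = 2**32
WRAP_64 = 2**64


def counter_delta(prev, curr):
    """Return the increase from prev to curr, treating any decrease as a wrap."""
    delta = curr - prev
    if delta < 0:
        # First guess: the counter wrapped at the 32-bit boundary.
        delta += WRAP_32
    if delta < 0:
        # Still negative: assume a 64-bit wrap instead.
        delta += WRAP_64 - WRAP_32
    return delta


def counter_rate(prev, curr, seconds):
    """Per-second rate over an interval of the given length."""
    return counter_delta(prev, curr) / seconds


if __name__ == "__main__":
    # A little-used port: the counter sat at 4,000,000,000 before the daily
    # reset, and reads 1,000 at the first poll afterwards (300 s later).
    print(counter_delta(4_000_000_000, 1_000))
    print(counter_rate(4_000_000_000, 1_000, 300))
```

With these sample numbers the port really moved 1,000 units, but the inferred delta is 294,968,296, or roughly 983,000 units per second over a 300-second step.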

I've been contemplating adding an `rrdtool reset` command that would
replace the last_ds value and timestamp with specified values, but would
prefer to learn that a better way to handle this already exists :)
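
As a rough sketch of what I have in mind, the Python model below shows the intended effect of such a command. Note that `rrdtool reset` does not exist; the `CounterSource` class, its method names, and the sample values are all hypothetical, and only a 32-bit wrap is modelled to keep it short.

```python
# Hypothetical model of the proposed "reset" semantics; not an rrdtool API.
class CounterSource:
    """Tracks the last raw counter value (last_ds) and its timestamp."""

    def __init__(self, last_ds, last_ts):
        self.last_ds = last_ds
        self.last_ts = last_ts

    def update(self, ts, value):
        """Store a new reading and return the per-second rate since the last one."""
        delta = value - self.last_ds
        if delta < 0:
            # A decrease is assumed to be a 32-bit wrap.
            delta += 2**32
        rate = delta / (ts - self.last_ts)
        self.last_ds, self.last_ts = value, ts
        return rate

    def reset(self, ts, value):
        """Proposed operation: replace the baseline without computing a rate."""
        self.last_ds, self.last_ts = value, ts


if __name__ == "__main__":
    # Without a reset, the zeroed hardware counter looks like a wrap.
    plain = CounterSource(last_ds=4_000_000_000, last_ts=0)
    print(plain.update(600, 1_500))

    # With a reset issued when the switch counters are zeroed (t=300),
    # the next poll is measured from the new baseline.
    fixed = CounterSource(last_ds=4_000_000_000, last_ts=0)
    fixed.reset(300, 0)
    print(fixed.update(600, 1_500))
```

In this model the second case yields 5.0 units per second (1,500 units over 300 seconds), while the first reports a rate in the hundreds of thousands.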


Andy Riebs
     HP -- Better Together
High Performance Computing -- XC Linux Software
(w) +1.603.884.1521
    andy.riebs at hp.com

My opinions are not necessarily those of HP
