[rrd-users] RRDTOOL and MRTG - Data out of bounds problems

Jason Frisvold Jason.Frisvold at corp.ptd.net
Tue Mar 26 16:06:14 MET 2002


I'm not sure where to begin with this one.  I'm completely puzzled, so
here goes:

We monitor several thousand nodes with MRTG and RRDtool, all running on
a single box, and it seems to keep up quite well.  I've run into a
problem, however, and I can't figure out what's happening.  Every once
in a while (especially during a network outage) the graphs spike to
ridiculous levels.  While I'd love for my OC-3 circuits to be capable of
carrying 4+ Gb/s, it just isn't possible.  I can't figure out where
these spikes come from, and they throw off the scale of the graphs, not
to mention severely skewing the data.

I'm running RRDtool 1.0.33 and MRTG 2.9.17.  I looked through the latest
release notes for both and nothing jumped out at me as a fix, so I'm
asking here.

Here's a snippet from an rrd dump.  (Note: these values are in cells per
second (cps); multiply by 424 to get bps.)

         <!-- 2002-03-25 22:55:00 EST / 1017114900 --> <row><v> 9.8783947044e+03 </v><v> 8.2932120773e+02 </v></row>
         <!-- 2002-03-25 23:00:00 EST / 1017115200 --> <row><v> 8.8742477109e+03 </v><v> 8.1046839506e+02 </v></row>
         <!-- 2002-03-25 23:05:00 EST / 1017115500 --> <row><v> 9.5606750720e+03 </v><v> 8.8151010879e+02 </v></row>
         <!-- 2002-03-25 23:10:00 EST / 1017115800 --> <row><v> 9.3830737949e+03 </v><v> 7.6028447393e+02 </v></row>
         <!-- 2002-03-25 23:15:00 EST / 1017116100 --> <row><v> 9.6443699550e+03 </v><v> 6.4293195884e+02 </v></row>
         <!-- 2002-03-25 23:20:00 EST / 1017116400 --> <row><v> 2.7414486911e+04 </v><v> 1.7387151058e+04 </v></row>
         <!-- 2002-03-25 23:25:00 EST / 1017116700 --> <row><v> 4.9489987708e+04 </v><v> 3.8462694445e+04 </v></row>
         <!-- 2002-03-25 23:30:00 EST / 1017117000 --> <row><v> 4.9586948803e+04 </v><v> 3.6820430671e+04 </v></row>
         <!-- 2002-03-25 23:35:00 EST / 1017117300 --> <row><v> 4.7220395444e+04 </v><v> 4.4622033222e+04 </v></row>
         <!-- 2002-03-25 23:40:00 EST / 1017117600 --> <row><v> 4.1974732778e+04 </v><v> 3.9794486778e+04 </v></row>
         <!-- 2002-03-25 23:45:00 EST / 1017117900 --> <row><v> 9.6909213444e+06 </v><v> 6.9863593690e+06 </v></row>
         <!-- 2002-03-25 23:50:00 EST / 1017118200 --> <row><v> 7.3506765068e+05 </v><v> 5.2989949105e+05 </v></row>
         <!-- 2002-03-25 23:55:00 EST / 1017118500 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:00:00 EST / 1017118800 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:05:00 EST / 1017119100 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:10:00 EST / 1017119400 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:15:00 EST / 1017119700 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:20:00 EST / 1017120000 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:25:00 EST / 1017120300 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>
         <!-- 2002-03-26 00:30:00 EST / 1017120600 --> <row><v> 0.0000000000e+00 </v><v> 0.0000000000e+00 </v></row>

Notice the huge jump at 23:45?  That's what I'm talking about.  The
zeros later on were due to a network outage (DDoS attack).  I'm not sure
what caused the traffic to spike so high there!  If anyone has any
ideas, please let me know.  Email is best, but I do monitor the list!
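
To put numbers on the spike, here is a minimal sketch that converts a few
of the ds0 values from the dump to bits per second using the 424 factor
above and compares them with the nominal OC-3 line rate.  The 155.52
Mbit/s figure and the helper names (cps_to_bps, exceeds_line_rate) are
assumptions added for illustration; they are not part of the MRTG or
RRDtool setup.

```python
# ds0 values copied from the dump above, in cells per second (cps).
CELL_BITS = 424                 # bits per cell, per the note above
OC3_LINE_RATE_BPS = 155.52e6    # assumed nominal OC-3 line rate

samples = {
    "23:15": 9.6443699550e+03,
    "23:40": 4.1974732778e+04,
    "23:45": 9.6909213444e+06,
    "23:50": 7.3506765068e+05,
}


def cps_to_bps(cps):
    """Convert cells per second to bits per second."""
    return cps * CELL_BITS


def exceeds_line_rate(cps, line_rate_bps=OC3_LINE_RATE_BPS):
    """True when the rate is higher than the circuit could physically carry."""
    return cps_to_bps(cps) > line_rate_bps


for stamp, cps in samples.items():
    flag = "IMPOSSIBLE" if exceeds_line_rate(cps) else "ok"
    print(f"{stamp}  {cps_to_bps(cps) / 1e6:10.2f} Mbit/s  {flag}")
```

On these numbers the 23:45 row works out to roughly 4.1 Gbit/s, and the
23:50 row is also above what an OC-3 can carry.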

The header of this RRD is below:

<rrd>
   <version> 0001 </version>
   <step> 300 </step> <!-- Seconds -->
   <lastupdate> 1017153907 </lastupdate> <!-- 2002-03-26 09:45:07 EST -->

   <ds>
      <name> ds0 </name>
      <type> COUNTER </type>
      <minimal_heartbeat> 600 </minimal_heartbeat>
      <min> 0.0000000000e+00 </min>
      <max> 1.9440000000e+07 </max>

      <!-- PDP Status -->
      <last_ds> 2486343273 </last_ds>
      <value> 3.7712792642e+05 </value>
      <unknown_sec> 0 </unknown_sec>
   </ds>

   <ds>
      <name> ds1 </name>
      <type> COUNTER </type>
      <minimal_heartbeat> 600 </minimal_heartbeat>
      <min> 0.0000000000e+00 </min>
      <max> 1.9440000000e+07 </max>

      <!-- PDP Status -->
      <last_ds> 3092888405 </last_ds>
      <value> 2.8103890301e+05 </value>
      <unknown_sec> 0 </unknown_sec>
   </ds>

<!-- Round Robin Archives -->
   <rra>
      <cf> AVERAGE </cf>
      <pdp_per_row> 1 </pdp_per_row> <!-- 300 seconds -->
      <xff> 5.0000000000e-01 </xff>

      <cdp_prep>
         <ds><value> NaN </value>  <unknown_datapoints> 0 </unknown_datapoints></ds>
         <ds><value> NaN </value>  <unknown_datapoints> 0 </unknown_datapoints></ds>
      </cdp_prep>
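
One thing that may be worth checking in this header is the unit of the
<max> value.  RRDtool stores values outside the min/max range as unknown,
so a <max> that is too high lets spikes through.  The sketch below is
only a units check under the assumption that the data sources really are
in cells per second and that OC-3 runs at a nominal 155.52 Mbit/s; the
variable names are illustrative and it is not a confirmed diagnosis.

```python
import math

CELL_BITS = 424                 # bits per cell, as used for the dump values
OC3_LINE_RATE_BPS = 155.52e6    # assumed nominal OC-3 line rate

ds_max = 1.9440000000e+07       # <max> from the header above
spike_cps = 9.6909213444e+06    # ds0 at 23:45 in the dump

# Read as bytes per second, <max> matches the OC-3 line rate.
max_as_bytes_bps = ds_max * 8

# Read as cells per second, <max> would allow far more than OC-3.
max_as_cells_bps = ds_max * CELL_BITS

# Hypothetical ceiling if <max> were expressed in cells per second.
max_cps_for_oc3 = OC3_LINE_RATE_BPS / CELL_BITS

print(f"max as bytes/s : {max_as_bytes_bps / 1e6:.2f} Mbit/s")
print(f"max as cells/s : {max_as_cells_bps / 1e6:.2f} Mbit/s")
print(f"OC-3 in cells/s: {max_cps_for_oc3:.0f} cps")
print(f"spike under current max: {spike_cps < ds_max}")
print(f"spike under cps ceiling: {spike_cps < max_cps_for_oc3}")
```

If that assumption holds, the 23:45 spike sits below the current <max>
but well above the cells-per-second ceiling, so a lower limit would
probably have turned it into an unknown value instead of a spike.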

Thanks!
---------------------------
Jason H. Frisvold
Senior ATM Engineer
Engineering Dept.
Penteledata
CCNA Certified - CSCO10151622
friz at corp.ptd.net
---------------------------
"I love deadlines.  I especially like the whooshing sound they make as
they go flying by." -- Douglas Adams [1952-2001]


