[mrtg] MRTG graphs suddenly stopped updating

Brian Rapp brian.rapp at noaa.gov
Fri Mar 25 17:08:14 CET 2016


Greetings,

I use MRTG 2.17.4 to monitor a system with several thousand nodes. It has
been running perfectly for nearly a year. Yesterday, the graphs suddenly
stopped displaying information for every single node. Dumping the rrd
files shows that every entry since 1700Z contains NaN, as shown here:

<!-- 2016-03-24 16:35:00 GMT / 1458837300 --> 
<row><v>1.1346932622e+05</v><v>4.4329144411e+05</v></row>
<!-- 2016-03-24 16:40:00 GMT / 1458837600 --> 
<row><v>1.1090133815e+05</v><v>4.4215029817e+05</v></row>
<!-- 2016-03-24 16:45:00 GMT / 1458837900 --> 
<row><v>1.1164798165e+05</v><v>4.2520479138e+05</v></row>
<!-- 2016-03-24 16:50:00 GMT / 1458838200 --> 
<row><v>1.1138191129e+05</v><v>4.3855997644e+05</v></row>
<!-- 2016-03-24 16:55:00 GMT / 1458838500 --> 
<row><v>1.1547433278e+05</v><v>4.5473071311e+05</v></row>
<!-- 2016-03-24 17:00:00 GMT / 1458838800 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:05:00 GMT / 1458839100 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:10:00 GMT / 1458839400 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:15:00 GMT / 1458839700 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:20:00 GMT / 1458840000 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:25:00 GMT / 1458840300 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:30:00 GMT / 1458840600 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:35:00 GMT / 1458840900 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:40:00 GMT / 1458841200 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:45:00 GMT / 1458841500 --> 
<row><v>NaN</v><v>NaN</v></row>
<!-- 2016-03-24 17:50:00 GMT / 1458841800 --> 
<row><v>NaN</v><v>NaN</v></row>
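
For anyone wanting to check where the data stops across many rrd files, the scan above can be automated. The following is only a minimal sketch: it assumes the text comes from `rrdtool dump` in the layout shown above (a timestamp comment followed by a `<row>`), and the function name `first_nan_row` is made up for illustration.

```python
import re

# Timestamp comment written before each row, e.g.
# <!-- 2016-03-24 17:00:00 GMT / 1458838800 -->
STAMP = re.compile(r"<!--\s*(\S+ \S+) \S+ / (\d+)\s*-->")
# Individual data-source values inside a <row>.
VALUE = re.compile(r"<v>\s*([^<\s]+)\s*</v>")


def first_nan_row(dump_text):
    """Return (label, epoch) of the first row whose values are all NaN, else None."""
    stamp = None
    for line in dump_text.splitlines():
        m = STAMP.search(line)
        if m:
            stamp = (m.group(1), int(m.group(2)))
        # The row may sit on the same line as its comment or on the next one.
        values = VALUE.findall(line)
        if values and stamp and all(v.lower() == "nan" for v in values):
            return stamp
    return None


sample = """\
<!-- 2016-03-24 16:55:00 GMT / 1458838500 -->
<row><v>1.1547433278e+05</v><v>4.5473071311e+05</v></row>
<!-- 2016-03-24 17:00:00 GMT / 1458838800 -->
<row><v>NaN</v><v>NaN</v></row>
"""
print(first_nan_row(sample))  # ('2016-03-24 17:00:00', 1458838800)
```

Running this over the dump of each rrd file would show whether every node lost data at the same step, which is what the excerpt above suggests.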


I am able to use snmpget by hand:

$ snmpget -cmycommstr -v2c myhost ifDescr.4 ifInOctets.4 ifDescr.4 ifOutOctets.4
IF-MIB::ifDescr.4 = STRING: eth0
IF-MIB::ifInOctets.4 = Counter32: 981294193
IF-MIB::ifDescr.4 = STRING: eth0
IF-MIB::ifOutOctets.4 = Counter32: 1271239781


Enabling all debugging flags for mrtg produced this, which shows some 
obvious errors:

2016-03-25 15:16:17 -- --fork: start clearing confcache on first entry 
for target mycommstr at myhost_
2016-03-25 15:16:17 -- --coca: clear confcache mycommstr at myhost_
2016-03-25 15:16:17 -- --fork: finished clearing confcache
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ 
Descr eth0 --> 4
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ 
Descr eth1 --> 5
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ 
Descr eth2 --> 2
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ 
Descr eth3 --> 3
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ 
Descr lo --> 1
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Eth  
--> 1
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Eth 
00-15-17-9f-93-26 --> 2
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Eth 
00-15-17-9f-93-27 --> 3
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Eth 
00-22-19-98-9e-4f --> 4
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Eth 
00-22-19-98-9e-51 --> 5
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Ip 
10.0.6.1 --> 5
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Ip 
127.0.0.1 --> 1
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Ip 
165.92.30.101 --> 4
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Ip 
165.92.30.51 --> 4
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Name 
eth0 --> 4
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Name 
eth1 --> 5
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Name 
eth2 --> 2
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Name 
eth3 --> 3
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Name 
lo --> 1
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Type 
24 --> 1
2016-03-25 15:16:17 -- --coca: store in confcache mycommstr at myhost_ Type 
6 --> Dup
.
.
.
2016-03-25 15:32:36 -- --base: Check for Thresholds
2016-03-25 15:32:36 -- --base: Act on Router/Target myhost_eth0
2016-03-25 15:32:36 -- 2016-03-25 15:32:35: ERROR: 
Target[myhost_eth0][_IN_] ' $target->[77]{$mode} ' did not eval into 
defined data
2016-03-25 15:32:36 -- 2016-03-25 15:32:35: ERROR: 
Target[myhost_eth0][_OUT_] ' $target->[77]{$mode} ' did not eval into 
defined data
2016-03-25 15:32:36 -- --base: Get Current values: in:undef, out:undef, 
up:undef, name:undef, time:1458918977
2016-03-25 15:32:36 -- --base: Create Graphics
2016-03-25 15:32:36 -- --base: start RRDtool section
2016-03-25 15:32:36 -- --base: maxi:125000000, maxo:125000000
2016-03-25 15:32:36 -- --log: 
RRDs::tune(/netmon/mrtg/rrd/myhost_eth0.rrd -a ds0:125000000 -a 
ds1:125000000 -d ds0:COUNTER -d ds1:COUNTER)
2016-03-25 15:32:36 -- --log: 
RRDs::update(/netmon/mrtg/rrd/myhost_eth0.rrd, '1458918977:U:U')
2016-03-25 15:32:36 -- --log:  got: ???/???
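
With several thousand targets, it may help to list which ones report the "did not eval into defined data" error. The following is a rough sketch, not part of MRTG: it assumes the debug output has been saved to a text file in the format shown above, and the helper name `failed_targets` is hypothetical. The pattern tolerates the line wrapping seen in this message.

```python
import re
from collections import defaultdict

# Matches the ERROR entries shown above, even when wrapped across lines.
PATTERN = re.compile(
    r"ERROR:\s+Target\[([^\]]+)\]\[_(IN|OUT)_\]\s+'[^']*'\s+"
    r"did\s+not\s+eval\s+into\s+defined\s+data"
)


def failed_targets(log_text):
    """Map each failing target name to the sorted directions (IN/OUT) that failed."""
    failures = defaultdict(set)
    for name, direction in PATTERN.findall(log_text):
        failures[name].add(direction)
    return {name: sorted(dirs) for name, dirs in failures.items()}


sample = """\
2016-03-25 15:32:36 -- 2016-03-25 15:32:35: ERROR:
Target[myhost_eth0][_IN_] ' $target->[77]{$mode} ' did not eval into
defined data
2016-03-25 15:32:36 -- 2016-03-25 15:32:35: ERROR:
Target[myhost_eth0][_OUT_] ' $target->[77]{$mode} ' did not eval into
defined data
"""
print(failed_targets(sample))  # {'myhost_eth0': ['IN', 'OUT']}
```

Comparing the resulting names against the full target list would confirm whether the failure really affects every target or only a subset.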

This appears to be the case with every data point.  Does anyone know 
what might be causing this and how to correct it?

Regards,
Brian Rapp



