[rrd-users] Inconsistent results when polling lots of devices

Simon Mullis Simon.Mullis at equinoxsolutions.com
Tue Aug 5 17:39:58 MEST 2003


Hello All,

I wonder if anyone can offer any insight...

I am polling around 1000 devices on a dual Xeon system running Linux
(2.4.18-14smp).  The polling is split across 40 separate mrtg configuration
files that run concurrently, using RRD as the logging format.

I have a web frontend on a similar spec system that mounts the generated
rrds over read-only NFS, and on it I use a heavily customised version of
mrtg-rrd.cgi to display the data.  (Actually, if anyone's interested, using
FastCGI (www.fastcgi.com) with mrtg-rrd.cgi increases the speed of image
creation by a factor of 20!)

Each MRTG process is set to fork 16 times.  The mrtg config is automatically
generated (using /bin/sh shell scripts) from a database holding all of the
config data for all of the Edge devices.  There's all sorts of error checking
built in to keep the data consistent and to ensure the right processes are
running, and every 24 hours it actively checks the config of the edge
devices.  I'm using mrtg-2.9.29 and rrdtool-1.0.43.

It all works beautifully.

Except: 

Some devices just do not display data when polled with MRTG.  Yet when I
snmpwalk (or get, or getnext) the same devices from the data collection
system, using the correct community, target IP and port, with both SNMPv1
and v2c (ifInOctets & ifHCInOctets), I get a valid response every time.
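
For anyone who wants to repeat the command-line check in a loop rather than
by hand, here is a rough sketch.  It assumes net-snmp's snmpget is on the
PATH; the function names (snmpget_cmd, success_rate) and the default of 20
attempts are made up for illustration and are not part of my setup.

```python
import subprocess


def snmpget_cmd(host, community, oid, version="2c", timeout=5, retries=3):
    """Build a net-snmp snmpget command line (hypothetical helper)."""
    return ["snmpget", "-v", version, "-c", community,
            "-t", str(timeout), "-r", str(retries), host, oid]


def success_rate(cmd, attempts=20, run=subprocess.run):
    """Run the command `attempts` times and return the fraction that exit 0."""
    ok = 0
    for _ in range(attempts):
        result = run(cmd, capture_output=True)
        if result.returncode == 0:
            ok += 1
    return ok / attempts
```

Something like success_rate(snmpget_cmd("X.X.X.X", "COMMUNITY", "ifInOctets.24"))
would then give a crude figure to compare against what MRTG manages for the
same device.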

I've tried using SNMPv1 and v2c (by appending :::::2 to the target).
I have also tried reducing the forking and increasing the SNMP timeout and
retries (to 5 and 3 respectively).  That got me a single point of data on
one of the missing graphs, but that's all.
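
To make the target syntax concrete, here is a small sketch of how such a
target string could be assembled.  It assumes the colon-separated options
after the host come in the order port, timeout, retries, backoff, version
(my reading of the MRTG target syntax); the helper name mrtg_target is
hypothetical and is not how my shell scripts actually do it.

```python
def mrtg_target(oids, community, host, port="", timeout="",
                retries="", backoff="", version=""):
    """Build an MRTG target string (hypothetical helper).

    Empty option fields are left blank so MRTG falls back to its defaults;
    trailing empty fields are dropped entirely.
    """
    opts = [str(f) for f in (port, timeout, retries, backoff, version)]
    while opts and opts[-1] == "":
        opts.pop()
    suffix = "".join(":" + o for o in opts)
    return f"{oids}:{community}@{host}{suffix}"


# SNMP v2c only: five colons, then the version, as described above.
print(mrtg_target("ifInOctets&ifOutOctets", "COMMUNITY", "X.X.X.X", version=2))
# Timeout 5, retries 3 and v2c together.
print(mrtg_target("ifInOctets&ifOutOctets", "COMMUNITY", "X.X.X.X",
                  timeout=5, retries=3, version=2))
```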

The weird thing is the consistency of the problem: 
	It doesn't work 99% of the time with MRTG.
	It works every time with command line snmp tools...

From the mrtg output:

2003-08-05 16:09:35 -- WARNING: skipping because at least the query for ifDescr.24 on  X.X.X.X did not succeed
2003-08-05 16:09:35 -- WARNING: no data for ifInOctets&ifOutOctets:COMMUNITY@X.X.X.X. Skipping further queries for Host X.X.X.X in this round.
2003-08-05 16:09:35 -- ERROR: Target[X.X.X.X_PORT][_IN_] ' $$target[13]{$mode} ' evaluated to 'DEADHOST' instead of a number
2003-08-05 16:09:35 -- ERROR: Target[X.X.X.X_PORT][_OUT_] ' $$target[13]{$mode} ' did not eval into defined data
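
With 40 config files it helps to see which hosts fail most often.  Below is
a minimal sketch that tallies the "did not succeed" warnings per host.  It
assumes each log entry sits on a single line (the mail client wraps them),
and the names FAILED_QUERY and failed_hosts are made up for illustration.

```python
import re
from collections import Counter

# Matches the first WARNING shown above; group 1 is the OID, group 2 the host.
FAILED_QUERY = re.compile(
    r"WARNING: skipping because at least the query for (\S+) on\s+(\S+) "
    r"did not succeed"
)


def failed_hosts(lines):
    """Count 'did not succeed' warnings per host in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = FAILED_QUERY.search(line)
        if match:
            counts[match.group(2)] += 1
    return counts
```

Feeding it the captured output of every mrtg run, for example
failed_hosts(open("mrtg.log")) with a hypothetical log file name, should show
whether the failures cluster on a few devices or are spread evenly.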


Packet Loss? Bug in 2.9.29? (Sorry, Tobi)

Thanks in advance for any ideas.

Kind Regards

SM


------------------------------------------------------------------------------------------
Equinox Converged Solutions Limited.
Tel: +44 (0)1252 405 600
http://www.equinoxsolutions.com

------------------------------------------------------------------------------------------
