[smokeping-users] unusual slave behavior

Bill Simon bills at psu.edu
Fri Dec 7 16:50:10 CET 2007


Me again... Tobi helped me solve the issue I was having with slave_cache 
files not being read due to file locking problems.  This was a 
Solaris-specific problem where perl's flock() function behaves 
differently than on Linux.

NOW - a new issue.  I have a master and three slaves set up.  No errors 
are reported in the logs on the master or on any slave, either at 
startup or while running.  But some slave RRDs are not updating, and 
which ones seems random: if I shut down all smokeping processes on the 
four servers and restart, a different slave fails.  See these 
timestamps for example:

Target = "USBMCS"

-rw-r--r--   1 apache   other    2986708 Dec  7 10:32 USBMCS.rrd
-rw-r--r--   1 apache   other    2986708 Dec  7 10:16 USBMCS~sp-tb-analog1.phone.psu.edu.rrd
-rw-r--r--   1 apache   other    2986708 Dec  7 10:32 USBMCS~sp-usb2-core.phone.psu.edu.rrd

Note that slave "sp-tb-analog1.phone.psu.edu" is stuck at 10:16, but the 
other two nodes are updating.

But on a different target = "Harmony"

-rw-r--r--   1 apache   other    2986708 Dec  7 10:34 Harmony.rrd
-rw-r--r--   1 apache   other    2986708 Dec  7 10:34 Harmony~sp-tb-analog1.phone.psu.edu.rrd
-rw-r--r--   1 apache   other    2986708 Dec  7 10:10 Harmony~sp-tb-core.phone.psu.edu.rrd

In this case the master and "sp-tb-analog1" are updating OK, but 
"sp-tb-core" is not!

What could be going on here?
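
In case it helps anyone reproduce this, here is a rough sketch of how the 
stuck files could be found without reading ls output by hand.  It is not 
part of smokeping: the function names and the 600-second threshold are my 
own assumptions, and it only relies on the "Target~slave.rrd" naming 
visible in the listings above.  It compares each RRD's modification time 
against the current time and reports the ones that have fallen behind.

```python
import os
import time


def find_stale_rrds(directory, max_age_seconds=600, now=None):
    """Return (filename, age_in_seconds) for .rrd files not modified recently.

    max_age_seconds is an assumed threshold; pick something larger than
    the configured smokeping step.
    """
    now = time.time() if now is None else now
    stale = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".rrd"):
            continue  # ignore anything that is not an RRD file
        age = now - os.path.getmtime(os.path.join(directory, name))
        if age > max_age_seconds:
            stale.append((name, int(age)))
    return stale


def split_slave(filename):
    """Split 'Target~slave.rrd' into (target, slave).

    The slave part is None when there is no '~', i.e. the master's own file.
    """
    base = filename[:-len(".rrd")]
    target, sep, slave = base.partition("~")
    return target, (slave if sep else None)
```

Calling find_stale_rrds() on the directory that holds the files listed 
above, and passing each returned name through split_slave(), would show 
which target/slave pairs have stopped updating after each restart.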



