[smokeping-users] unusual slave behavior
Bill Simon
bills at psu.edu
Fri Dec 7 16:50:10 CET 2007
Me again... Tobi helped me solve the issue I was having with slave_cache
files not being read due to file locking problems. This was a
Solaris-specific problem: Perl's flock() function behaves differently
on Solaris than on Linux.
NOW - a new issue. I have a master and three slaves set up. No errors
are reported in the logs on the master or any slave at startup or during
a run, but some slave RRDs are not updating. It seems random: if I shut
down all smokeping processes on the four servers and restart, a
different slave fails. See these timestamps for example:
Target = "USBMCS"
-rw-r--r-- 1 apache other 2986708 Dec 7 10:32 USBMCS.rrd
-rw-r--r-- 1 apache other 2986708 Dec 7 10:16 USBMCS~sp-tb-analog1.phone.psu.edu.rrd
-rw-r--r-- 1 apache other 2986708 Dec 7 10:32 USBMCS~sp-usb2-core.phone.psu.edu.rrd
Note that slave "sp-tb-analog1.phone.psu.edu" is stuck at 10:16, but the
other two nodes are updating.
But on a different target = "Harmony"
-rw-r--r-- 1 apache other 2986708 Dec 7 10:34 Harmony.rrd
-rw-r--r-- 1 apache other 2986708 Dec 7 10:34 Harmony~sp-tb-analog1.phone.psu.edu.rrd
-rw-r--r-- 1 apache other 2986708 Dec 7 10:10 Harmony~sp-tb-core.phone.psu.edu.rrd
In this case the master and "sp-tb-analog1" are updating OK, but
"sp-tb-core" is not!
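For anyone who wants to repeat the timestamp comparison above without
eyeballing ls output, here is a minimal Python sketch. It is not part of
smokeping. It assumes only the file naming visible in the listings
(Target.rrd for the master, Target~slave.rrd for each slave). The function
name find_stale_rrds, the 600-second threshold, and the directory path in
the usage line are all made up for illustration.

```python
import os
from collections import defaultdict


def find_stale_rrds(data_dir, max_lag_seconds=600):
    """List RRD files that lag behind the newest RRD of the same target.

    Returns a sorted list of (target, filename, lag_in_seconds) tuples.
    The 600-second default is an arbitrary assumption, not a smokeping value.
    """
    groups = defaultdict(list)
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if not name.endswith(".rrd"):
                continue
            # "Target~slave.rrd" and "Target.rrd" both belong to "Target".
            target = name[:-len(".rrd")].split("~", 1)[0]
            mtime = os.path.getmtime(os.path.join(root, name))
            # Group per directory so equal target names elsewhere do not mix.
            groups[(root, target)].append((name, mtime))

    stale = []
    for (_root, target), entries in groups.items():
        newest = max(mtime for _name, mtime in entries)
        for name, mtime in entries:
            lag = newest - mtime
            if lag > max_lag_seconds:
                stale.append((target, name, int(lag)))
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical path; point this at the directory holding the RRD files.
    for target, name, lag in find_stale_rrds("/path/to/smokeping/data"):
        print(f"{target}: {name} is {lag} seconds behind the newest file")
```

This only compares modification times within one target, so it will not
notice a target where the master and every slave stopped at the same time.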
What could be going on here?