[rrd-users] rrdupdate corruption on Mac Snow Leopard
JohnGarney
public at garneys.com
Thu May 17 20:45:42 CEST 2012
I've been using RRDtool for years on a Linux box (Red Hat) to collect various
house sensor data. That system just went belly up, so I moved the
collection applications to a Mac mini. I used a bundled version of rrdtool
1.4.5 that packages the other libraries it depends on, so that the Mac build
can be statically linked. After sorting out some minor build problems, I have
rrdtool working.
I have two C apps that receive transmissions from my (temperature) sensors
in the house and log each one to a GAUGE rrd. The databases are on a Linux
server that the Mac accesses via NFS. This is the same NFS setup that I used
with the Linux box that died.
Frequently, the two apps (using different radio receivers) will receive the
same sensor "report" and (nearly) simultaneously attempt to rrdupdate (via C
call) the value for that time. One of them will typically get either a
"could not lock" or an "illegal attempt to update using time" rrd error, which
is fine in my case: I just ignore those errors, call rrd_clear_error() and go
on. I use two apps (each with its own receiver) because of the limited range
of the sensor transmitters. With only one receiver, I don't always get all the
sensor reports, so I can't just use one receiver/app long term.
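
A minimal sketch of the "ignore only the expected errors" pattern described above. It is an illustration, not the code from the apps: is_expected_duplicate_error is a hypothetical helper, and the two substrings are simply the error fragments quoted above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the error text contains one of the two
 * messages that are expected when both apps log the same sensor report,
 * and 0 for anything else (including a NULL message). */
static int is_expected_duplicate_error(const char *msg)
{
    if (msg == NULL)
        return 0;
    return strstr(msg, "could not lock") != NULL ||
           strstr(msg, "illegal attempt to update using time") != NULL;
}
```

In an app, this check would presumably be applied to the string returned by rrd_get_error() after a failed update and before rrd_clear_error(), so that any other error message gets logged instead of being silently dropped.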
On my old Linux box, data would get stored just fine. On the Mac
mini, however, I see sporadic cases where the data that is retrieved is not
the data that was stored. For example, temperatures might be in the 65 degree
range, yet a value in the 15 degree range shows up. When I print out the
actual rrdupdate strings that the apps use while running, I only see
the normal expected temperatures being rrdupdated. But when I do an
rrdfetch or display a graph, I find these "out of expected range" values.
There are several a day, but usually hours apart for a given sensor. I log
different sensors to different rrds and see these errors on different rrds.
I get maybe a dozen such errors across all the sensors during a day.
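
To make the "print out the update strings" check concrete, here is a sketch of formatting and logging one update argument in the timestamp:value form that rrdupdate accepts. The names format_update and log_update are hypothetical, and the two-decimal precision is an assumption rather than what the apps actually use.

```c
#include <stdio.h>
#include <time.h>

/* Hypothetical helper: write "timestamp:value" into buf.
 * Returns 0 on success, -1 if buf is too small or formatting fails. */
static int format_update(char *buf, size_t size, time_t when, double value)
{
    int n = snprintf(buf, size, "%ld:%.2f", (long)when, value);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Hypothetical helper: record which rrd file is about to receive which
 * update argument, so it can later be compared with rrdfetch output.
 * Returns 0 on success, -1 if writing to the log stream fails. */
static int log_update(FILE *out, const char *rrd_path, const char *arg)
{
    return fprintf(out, "%s %s\n", rrd_path, arg) < 0 ? -1 : 0;
}
```

Logging the exact argument next to the target file name gives one line per attempted update, which can then be matched by timestamp against what rrdfetch returns for that rrd.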
If I only let one of the two logging apps run, I don't see these "glitches".
So I suspect some sort of locking problem is happening.
I am using the same app code, same sensors/receivers, same rrd databases.
I didn't re-create the rrds, just used the ones that I have been using. But
the platform is different (Mac OS X Snow Leopard vs. Red Hat), and so is the
rrdtool version (1.4.5 vs. some older version). I can't check which older
version it was because of the Red Hat system's disk failure. That system had
been running for years with no rrdtool problems, and I believe rrdtool was a
binary install via yum.
I looked at the source and don't see any obvious errors in the fcntl call
that does the rrd_lock.
I tried (on a whim) changing the fcntl to instead do a flock(), but that made
no difference.
I verified that the rrd_lock() code branch being compiled is not the one
that uses _locking().
I also verified that I am not using the _rrd_update() code branch(es) that
depend on HAVE_MMAP.
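
For context, a non-blocking, whole-file write lock taken with fcntl, which is the kind of lock discussed above, can be sketched as follows. This is a generic POSIX illustration, not a copy of the rrdtool source; try_lock_whole_file and unlock_whole_file are hypothetical names.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to take an exclusive advisory lock on the whole
 * file without blocking. Returns 0 on success, -1 on failure; errno is
 * EACCES or EAGAIN when another process holds a conflicting lock. */
static int try_lock_whole_file(int fd)
{
    struct flock lock;
    memset(&lock, 0, sizeof lock);
    lock.l_type = F_WRLCK;     /* exclusive (write) lock */
    lock.l_whence = SEEK_SET;  /* l_start is relative to start of file */
    lock.l_start = 0;
    lock.l_len = 0;            /* 0 means "through end of file" */
    return fcntl(fd, F_SETLK, &lock);
}

/* Hypothetical helper: release the lock taken above. Closing any
 * descriptor for the file in this process also drops the lock. */
static int unlock_whole_file(int fd)
{
    struct flock lock;
    memset(&lock, 0, sizeof lock);
    lock.l_type = F_UNLCK;
    lock.l_whence = SEEK_SET;
    lock.l_start = 0;
    lock.l_len = 0;
    return fcntl(fd, F_SETLK, &lock);
}
```

Locks of this kind are advisory and belong to the process rather than the descriptor, so they only protect against other processes that take the same lock. Over NFS they also depend on the client's and server's lock support working correctly.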
Any ideas of what to look for or try to eliminate this problem? This makes
it really hard to get useful/reliable data that I have come to depend upon!
Thanks!