[rrd-developers] Re: Multi-Threaded JNI code
Robert Halstead
badbeeker at gmail.com
Thu Aug 3 19:58:30 MEST 2006
Hi Peter,
It's been a while since we last exchanged emails. I'm getting some
weirdness from my multi-threaded Java program.
I'm running 10 threads that update 24,000+ rrd files. Every once in a
while, I can't close the program because one of my threads is hanging on a
lock of some kind, and I'm not exactly sure why. First off, I'm updating rrd
files over an NFS mount, and this really started happening after I made the
switch to NFS from local disk. NFS is a must, so I need to get this
working. While I was encountering this problem, I noticed that when I ran
lsof, I saw the same file listed multiple times, anywhere from 3 to 8
times during the same period. The way I programmed the Java side is to
assign all the updates for 1 rrd file to 1 thread. That is, each rrd file
should only ever be accessed by the same thread, never by different
threads. Now, just to debug, I limited myself to 1 thread.
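For illustration, the one-file-to-one-thread assignment described above could be sketched roughly as follows. This is not the actual program: the class name `RrdUpdateRouter`, the hash-based routing, and the use of single-threaded executors are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: pins every rrd path to exactly one worker thread.
class RrdUpdateRouter {
    private final List<ExecutorService> workers = new ArrayList<>();

    RrdUpdateRouter(int threadCount) {
        for (int i = 0; i < threadCount; i++) {
            // One single-threaded executor per worker keeps the updates
            // for any given file sequential and in submission order.
            workers.add(Executors.newSingleThreadExecutor());
        }
    }

    // Maps a path to a worker index; the same path always yields the same index.
    static int workerIndex(String rrdPath, int threadCount) {
        return Math.floorMod(rrdPath.hashCode(), threadCount);
    }

    // Queues an update on the worker that owns this path.
    void submit(String rrdPath, Runnable update) {
        workers.get(workerIndex(rrdPath, workers.size())).execute(update);
    }

    // Returns true only if every worker finished within the timeout.
    boolean shutdown(long timeoutSeconds) throws InterruptedException {
        boolean clean = true;
        for (ExecutorService w : workers) {
            w.shutdown();
        }
        for (ExecutorService w : workers) {
            clean &= w.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        }
        return clean;
    }
}
```

With a layout like this, a `false` result from `shutdown` would indicate that at least one worker is still stuck inside an update when the timeout expires.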
Here's a dump of: `lsof -a -N -u rhalstead -r 1`
COMMAND  PID USER      FD  TYPE DEVICE        SIZE/OFF NODE                 NAME
java    7873 rhalstead 12u VREG 255,117440514   375576 18446744073490300471 /mnt/nms2/test/hlnmt001acm1/hlnmt001acm1_errors_in_3820.rrd
java    7873 rhalstead 12u VREG 255,117440514   375576 18446744073490300471 /mnt/nms2/test/hlnmt001acm1/hlnmt001acm1_errors_in_3820.rrd
java    7873 rhalstead 12u VREG 255,117440514   375576 18446744073490300471 /mnt/nms2/test/hlnmt001acm1/hlnmt001acm1_errors_in_3820.rrd
COMMAND  PID USER      FD  TYPE DEVICE        SIZE/OFF NODE       NAME
java    8261 rhalstead txt VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
java    8261 rhalstead 4u  VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
java    8261 rhalstead txt VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
java    8261 rhalstead 4u  VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
java    8261 rhalstead txt VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
java    8261 rhalstead 4u  VREG 255,117440514   188744 1497859076 /mnt/nms2/test/codwy001hb6_traffic_in_33680.rrd
Now, on the Java side of things, I'm not opening the file at all. I do a
FILE.ifExists() before anything else to see if I need to call
rrd_create_r(), but that's it. The whole process of writing an rrd file is:
first I check whether I need to call rrd_create_r(), then I call
rrd_last_r(), and finally rrd_update_r(). My question: why am I seeing
multiple file handles to the same file? I'm not entirely sure how lsof
works, whether it takes a snapshot over a window of time or not (maybe you
know more about that), but shouldn't rrdtool only have one file handle open
at a time?
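The per-file write sequence described above (existence check, then create if needed, then last, then update) can be sketched as below. The `RrdBinding` interface and its method names are hypothetical stand-ins for the JNI wrapper, not the real binding's API, and skipping samples that are not newer than the last update is an assumption about why rrd_last_r() is called; rrdtool itself rejects updates whose timestamp is not later than the last update.

```java
import java.io.File;

// Hypothetical stand-in for the JNI wrapper; these names are assumptions.
interface RrdBinding {
    void create(String path);                               // would wrap rrd_create_r()
    long last(String path);                                 // would wrap rrd_last_r()
    void update(String path, long timestamp, double value); // would wrap rrd_update_r()
}

class RrdWriter {
    private final RrdBinding rrd;

    RrdWriter(RrdBinding rrd) {
        this.rrd = rrd;
    }

    // Returns true if the sample was written, false if it was skipped as stale.
    boolean write(String path, long timestamp, double value) {
        // Step 1: create the file only when it does not exist yet.
        if (!new File(path).exists()) {
            rrd.create(path);
        }
        // Step 2: ask for the time of the most recent update.
        long lastUpdate = rrd.last(path);
        if (timestamp <= lastUpdate) {
            return false; // assumed policy: drop samples that are not newer
        }
        // Step 3: write the new sample.
        rrd.update(path, timestamp, value);
        return true;
    }
}
```

Note that in this sketch the Java side never opens the rrd file itself; every file handle would be opened and closed inside the native calls.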
I coded the Java program to hang if it can't close any of its own threads,
and this is exactly what happens. When I encounter this, I run lsof and
see that I still have multiple listings for the same file, but I
can't really tell if it's waiting for another file lock or blocked on
something else. I'm going to run my program using 1 thread and see if it
hangs again. The hanging is entirely random, which is making it hard to
debug the cause.
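One way to narrow down where a stuck thread is blocked is to dump every thread's stack from inside the JVM when shutdown fails. The following is a minimal sketch using the standard `Thread.getAllStackTraces()` call; the class name `ThreadDump` is hypothetical. A thread blocked inside native JNI code is reported as RUNNABLE with a native method as its top frame, so this shows which Java-level call it entered but not what the native code is waiting on.

```java
import java.util.Map;

// Hypothetical helper: formats the name, state, and stack of every live thread.
class ThreadDump {
    static String dumpAll() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> e
                : Thread.getAllStackTraces().entrySet()) {
            Thread t = e.getKey();
            out.append('"').append(t.getName()).append("\" state=")
               .append(t.getState()).append('\n');
            for (StackTraceElement frame : e.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }
}
```

Printing `ThreadDump.dumpAll()` at the point where the program decides a thread cannot be closed would record which call each worker was in when the hang occurred.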
Could you give me any insight into why I'm seeing multiple entries in lsof?
Or, if rrdtool is opening multiple file handles to the same file, why? I am
also wondering about the file locks and how they are handled.
Thanks for your help Peter, this issue has been kicking my @ss all week!
--
"A fool acts, regardless; knowing well that he is wrong. The ignoramus acts
on only what he knows, but all that he knows.
The ignoramus may be saved, but the fool knows that he is doomed."
Robert Halstead