[rrd-developers] Strange lockup issue

Christopher Snell cjs+lists at aol.net
Wed Dec 5 21:42:47 MET 2001


  Hi All,

I'm having a problem with rrdgraph locking up in certain scenarios.
This happens when I am making a graph that pulls data from many
different RRDs.  The problem lies with a few of my RRD files (of which I
have 3,000+).  For some reason the lockups only happen when one of these
RRDs is part of the graph (i.e., in a DEF).  If I make the graph without
any of these "broken" RRDs, everything works properly.  If I include a
broken RRD, however, it locks up cold.  Strangely enough, if I graph that
supposedly broken RRD by itself, not part of a CDEF and without other
DEFs, it works just fine, with no lockups.  I tried replacing the broken
RRDs with fresh, clean, empty RRDs, and this fixed the problem, but I
would lose all of my historical data.  I also tried dumping the suspect
RRDs to XML and restoring them, but this did not stop the lockups.  Here
is an example of an rrdgraph call that locks up:

rrdtool graph \
  /mon/htdocs_devel/images/cached_graphs/aggregated_weekly_956_hw_3.png \
  --title "IO - Quantitative" \
  -a PNG --vertical-label "KBytes per Second" \
  --units-exponent -1 --width 500 --height 230 \
  --start 1006814095 --end 1007424055 \
  DEF:iowks_7563=/mon/rrd_files/stats_7563.rrd:iowks:AVERAGE \
  DEF:iowks_7556=/mon/rrd_files/stats_7556.rrd:iowks:AVERAGE \
  DEF:iowks_7564=/mon/rrd_files/stats_7564.rrd:iowks:AVERAGE \
  DEF:iowks_7565=/mon/rrd_files/stats_7565.rrd:iowks:AVERAGE \
  DEF:iowks_7557=/mon/rrd_files/stats_7557.rrd:iowks:AVERAGE \
  DEF:iowks_7558=/mon/rrd_files/stats_7558.rrd:iowks:AVERAGE \
  DEF:iowks_7559=/mon/rrd_files/stats_7559.rrd:iowks:AVERAGE \
  DEF:iowks_7560=/mon/rrd_files/stats_7560.rrd:iowks:AVERAGE \
  DEF:iowks_7561=/mon/rrd_files/stats_7561.rrd:iowks:AVERAGE \
  DEF:iowks_7562=/mon/rrd_files/stats_7562.rrd:iowks:AVERAGE \
  CDEF:total_iowks=iowks_7563,iowks_7556,iowks_7564,iowks_7564,iowks_7565,iowks_7557,iowks_7558,iowks_7559,iowks_7560,iowks_7561,iowks_7562,+,+,+,+,+,+,+,+,+,+,+ \
  LINE2:total_iowks#cc0000:"Total IO - KBytes Written/second"

[NOTE: This is a sample rrdgraph call that results in a lockup.  The
actual calls we are using are much longer than this one (too big to
e-mail) but do basically the same thing.]
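
With CDEFs this long it is easy to miscount operands and operators, so
it may be worth checking the RPN mechanically.  The following is a
minimal Python sketch, not part of RRDtool.  It assumes the expression
contains only variable names, numbers, and the binary operators + - * /,
and the function name rpn_stack_depth is made up for illustration.

```python
BINARY_OPS = {"+", "-", "*", "/"}  # assumption: no other operators appear


def rpn_stack_depth(expr):
    """Return the stack depth left after walking a comma-separated RPN string.

    Raises ValueError if an operator finds fewer than two values on the stack.
    """
    depth = 0
    for position, token in enumerate(expr.split(","), start=1):
        token = token.strip()
        if token in BINARY_OPS:
            if depth < 2:
                raise ValueError(
                    "stack underflow at token %d (%r)" % (position, token)
                )
            depth -= 1  # a binary operator pops two values and pushes one
        else:
            depth += 1  # anything else is treated as an operand
    return depth


if __name__ == "__main__":
    # Hypothetical short expression, not one of the real CDEFs.
    print(rpn_stack_depth("a,b,c,+,+"))
```

A well-formed expression should leave a depth of exactly 1.  Any other
result, or a ValueError, would suggest looking at the CDEF itself before
suspecting the RRD files.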

This happens when running RRDtool on a Solaris 2.8 machine.  I have not 
tested this on any other architecture.

To help debug this, I have two snippets of truss output.  The first was
captured from rrdgraph just before it locked up, while it was reading
the suspected broken RRD.  Note the many calls to brk().
The second capture is from a successful read of a non-broken RRD.

open("/mon/rrd_files/stats_7562.rrd", O_RDONLY) = 3
fstat64(3, 0xFFBEBE68)                          = 0
ioctl(3, TCGETA, 0xFFBEBDF4)                    Err#25 ENOTTY
read(3, " R R D\0 0 0 0 1\0\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0 @ "\0\0\0\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)     = 8192
brk(0x0017DCA8)                                 = 0  
brk(0x0019BCA8)                                 = 0  
llseek(3, 0, SEEK_CUR)                          = 24576
lseek(3, 979824, SEEK_SET)                      = 979824
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
lseek(3, 506952, SEEK_SET)                      = 506952
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)     = 8192
read(3, " @1FAEEEEEEEEEEF ?B1EB85".., 8192)     = 8192
read(3, " @ X @\0\0\0\0\0 @\0\0\0".., 8192)     = 8192 
read(3, "\0\0\0\0\0\0\0\0 ?C88888".., 8192)     = 8192
read(3, " @\0\0\0\0\0\0\0 @ ,\b88".., 8192)     = 8192
read(3, " @ Y85 U U U U U @1F ?FF".., 8192)     = 8192
read(3, " ?E8888888888889 @ X :AA".., 8192)     = 8192
read(3, " A 9 AC5 U U U U\0\0\0\0".., 8192)     = 8192
read(3, " @1D16 /C9 bFC97 @\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0 @ Y8AAA".., 8192)     = 8192
read(3, " @ \ h _92C5F9 + ?EBBBBB".., 8192)     = 8192
read(3, " ?B1 ~ K17E4B17F A 8E0 X".., 8192)     = 8192
llseek(3, 0xFFFFFFFFFFFFE238, SEEK_CUR)         = 605824
close(3)                                        = 0
brk(0x0019BCA8)                                 = 0
brk(0x0019DCA8)                                 = 0
brk(0x0019DCA8)                                 = 0
brk(0x0019FCA8)                                 = 0
brk(0x0019FCA8)                                 = 0
brk(0x001A1CA8)                                 = 0
brk(0x001A1CA8)                                 = 0
brk(0x001A3CA8)                                 = 0
brk(0x001A3CA8)                                 = 0
brk(0x001A5CA8)                                 = 0
brk(0x001A5CA8)                                 = 0
brk(0x001A7CA8)                                 = 0
brk(0x001A7CA8)                                 = 0
brk(0x001A9CA8)                                 = 0
brk(0x001A9CA8)                                 = 0
brk(0x001ABCA8)                                 = 0
brk(0x001ABCA8)                                 = 0
brk(0x001ADCA8)                                 = 0
brk(0x001ADCA8)                                 = 0
brk(0x001AFCA8)                                 = 0
brk(0x001AFCA8)                                 = 0
brk(0x001B1CA8)                                 = 0
brk(0x001B1CA8)                                 = 0
brk(0x001B3CA8)                                 = 0
brk(0x001B3CA8)                                 = 0
brk(0x001B5CA8)                                 = 0
brk(0x001B5CA8)                                 = 0
brk(0x001B7CA8)                                 = 0
brk(0x001B7CA8)                                 = 0
brk(0x001B9CA8)                                 = 0
brk(0x001B9CA8)                                 = 0
brk(0x001BBCA8)                                 = 0
brk(0x001BBCA8)                                 = 0
brk(0x001BDCA8)                                 = 0
brk(0x001BDCA8)                                 = 0
brk(0x001BFCA8)                                 = 0
brk(0x001BFCA8)                                 = 0
brk(0x001C1CA8)                                 = 0
brk(0x001C1CA8)                                 = 0
brk(0x001C3CA8)                                 = 0
brk(0x001C3CA8)                                 = 0
brk(0x001C5CA8)                                 = 0
brk(0x001C5CA8)                                 = 0
brk(0x001C7CA8)                                 = 0


And here is the successful read:

open("/mon/rrd_files/stats_7563.rrd", O_RDONLY) = 3
fstat64(3, 0xFFBEBE68)                          = 0
brk(0x00077CA8)                                 = 0
brk(0x00079CA8)                                 = 0
ioctl(3, TCGETA, 0xFFBEBDF4)                    Err#25 ENOTTY
read(3, " R R D\0 0 0 0 1\0\0\0\0".., 8192)     = 8192
brk(0x00079CA8)                                 = 0
brk(0x0007DCA8)                                 = 0
read(3, "\0\0\0\0 @ "\0\0\0\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)     = 8192
brk(0x0007DCA8)                                 = 0
brk(0x00099CA8)                                 = 0
llseek(3, 0, SEEK_CUR)                          = 24576
lseek(3, 1101952, SEEK_SET)                     = 1101952
lseek(3, 612352, SEEK_SET)                      = 612352
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "7FFFFFFFFFFFFFFF7FFFFFFF".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0 ?CAAAAA".., 8192)     = 8192
read(3, " @\0\0\0\0\0\0\0 @ *AEEE".., 8192)     = 8192
read(3, " @ Y80\0\0\0\0\0 @  84 D".., 8192)     = 8192
read(3, " ?ECCCCCCCCCCCCF @ X10\0".., 8192)     = 8192
read(3, " A 8F1B4AAAAAAAB\0\0\0\0".., 8192)     = 8192
read(3, " @   (1B N81B4E8 @\0\0\0".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0 @ Y80\0".., 8192)     = 8192
read(3, " @ ^CCB1 ~ K17E5 ?EDDDDD".., 8192)     = 8192
read(3, " ?A8BF %8BF2 XBF A 9 O T".., 8192)     = 8192
read(3, " ?A7E4B1 ~ K17E4 @1F x Q".., 8192)     = 8192
read(3, "\0\0\0\0\0\0\0\0\0\0\0\0".., 8192)     = 8192
llseek(3, 0xFFFFFFFFFFFFE390, SEEK_CUR)         = 727952
close(3)                                        = 0
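
To compare the two traces without reading every line, the brk()
addresses can be pulled out and differenced.  This is a rough Python
sketch that assumes truss output in the format shown above; the helper
names brk_addresses and heap_growth are hypothetical.

```python
import re

# Matches lines such as "brk(0x0019BCA8)      = 0" and captures the address.
BRK_RE = re.compile(r"^brk\((0x[0-9A-Fa-f]+)\)")


def brk_addresses(truss_text):
    """Return the addresses passed to brk(), in the order they appear."""
    addresses = []
    for line in truss_text.splitlines():
        match = BRK_RE.match(line.strip())
        if match:
            addresses.append(int(match.group(1), 16))
    return addresses


def heap_growth(addresses):
    """Return (total growth in bytes, sorted distinct non-zero step sizes)."""
    steps = [b - a for a, b in zip(addresses, addresses[1:]) if b != a]
    total = addresses[-1] - addresses[0] if addresses else 0
    return total, sorted(set(steps))


if __name__ == "__main__":
    # First four brk() lines that follow close(3) in the failing trace.
    sample = """\
brk(0x0019BCA8)                                 = 0
brk(0x0019DCA8)                                 = 0
brk(0x0019DCA8)                                 = 0
brk(0x0019FCA8)                                 = 0
"""
    total, steps = heap_growth(brk_addresses(sample))
    print(total, steps)  # 16384 [8192]
```

In the failing trace the break address rises by 0x2000 (8192) bytes at
every step after close(3).  That looks consistent with a loop that keeps
allocating memory, though the trace alone does not show where.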


Any ideas?  I'm not very good with gdb but if somebody can tell me what 
to do, I can run some gdb traces.

Chris 

