[rrd-developers] rrdcached problem complex RRD options on Solaris

Peter Jenkins peter.jenkins at csc.fi
Fri Jul 30 08:42:23 CEST 2010


Hi,

> Your client is receiving the status message back from the first RRD ("0 Successfully...").
> Then, for some reason, it's echoing this back to the server before the next command.
> You can see it on the read() for FD 6.
>
>     1401/19:       327      57     30 write(0x6, "0 Successfully flushed /opt/rrd/ganglia/Management/shango/load_one.rrd.\n\0", 0x48)              = 72 0
>       1401/19:       394      29     12 read(0x6, "0 Successfully flushed /opt/rrd/ganglia/Management/shango/load_one.rrd.\nflush /opt/rrd/ganglia/Management/shango/proc_run.rrd\na\320\0", 0x2000)     = 126 0
>       1401/19:       452      46     27 write(0x6, "-1 Unknown command: 0\n\0", 0x16)                = 22 0
>
> Could you try tracing on your client to see if it's actually sending the results back to the server as a command?  Then we'll know which side to look on.

Great debugging idea. The client (rrdtool) is the problem: it writes the
server's status line back to the socket before sending the next flush:

  19442/1:  resolvepath("/opt/rrd/ganglia/Management/__SummaryInfo__/load_one.rrd\0", 0xFFBFBA90, 0x400)          = 56 0
  19442/1:  fstat64(0x3, 0xFFBFB7F0, 0x1)                 = 0 0
  19442/1:  brk(0x62918)          = 0 0
  19442/1:  brk(0x64918)          = 0 0
  19442/1:  fstat64(0x3, 0xFFBFB698, 0xD9D64)             = 0 0
  19442/1:  ioctl(0x3, 0x5401, 0xFFBFB77C)                = -1 Err#22
  19442/1:  write(0x3, "flush /opt/rrd/ganglia/Management/__SummaryInfo__/load_one.rrd\n\0", 0x3F)      = 63 0
  19442/1:  read(0x3, "0 Successfully flushed /opt/rrd/ganglia/Management/__SummaryInfo__/load_one.rrd.\n\0", 0x2000)      = 81 0
  19442/1:  open("/opt/rrd/ganglia/Management/__SummaryInfo__/load_one.rrd\0", 0x0, 0x1B6)                = 5 0
  19442/1:  fstat(0x5, 0xFFBFCD08, 0x0)           = 0 0
  19442/1:  mmap(0x0, 0x5C18, 0x1)                = -31850496 0
  19442/1:  memcntl(0xFE1A0000, 0x5C18, 0x4)              = 0 0
  19442/1:  memcntl(0xFE1A0000, 0x78, 0x4)                = 0 0
  19442/1:  memcntl(0xFE1A0000, 0x78, 0x4)                = 0 0
  19442/1:  memcntl(0xFE1A0000, 0xF0, 0x4)                = 0 0
  19442/1:  memcntl(0xFE1A0000, 0x230, 0x4)               = 0 0
  19442/1:  memcntl(0xFE1A0000, 0x8, 0x4)                 = 0 0
  19442/1:  memcntl(0xFE1A0000, 0x5C18, 0x1)              = 0 0
  19442/1:  munmap(0xFE1A0000, 0x5C18)            = 0 0
  19442/1:  close(0x5)            = 0 0
  19442/1:  resolvepath("/opt/rrd/ganglia/Management/__SummaryInfo__/proc_run.rrd\0", 0xFFBFBA90, 0x400)          = 56 0
  19442/1:  write(0x3, "0 Successfully flushed /opt/rrd/ganglia/Management/__SummaryInfo__/load_one.rrd.\n\0", 0x51)     = 81 0
  19442/1:  write(0x3, "flush /opt/rrd/ganglia/Management/__SummaryInfo__/proc_run.rrd\n\0", 0x3F)      = 63 0
  19442/1:  read(0x3, "-1 Unknown command: 0\n/Management/__SummaryInfo__/proc_run.rrd\no__/load_one.rrd.\n\0", 0x2000)         = 22 0
  19442/1:  fstat64(0x2, 0xFFBFE780, 0xFE5303BC)          = 0 0
  19442/1:  write(0x2, "ERROR: \0", 0x7)          = 7 0
  19442/1:  write(0x2, "rrdc_flush (/opt/rrd/ganglia/Management/__SummaryInfo__/proc_run.rrd) failed with status -1.\0", 0x5C)            = 92 0
  19442/1:  write(0x2, "\n\0", 0x1)               = 1 0
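
To make the pattern in the trace explicit, here is a minimal Python sketch that flags any write whose payload is identical to the last thing read on the same file descriptor. It is not part of rrdtool: the function name find_echoed_reads and the event tuples are illustrative assumptions, with the payloads transcribed by hand from the truss output above and the trailing NUL dropped.

```python
def find_echoed_reads(events):
    """Return (index, payload) for every write that repeats the last read.

    events is a list of (syscall, fd, payload) tuples in trace order.
    A write counts as an echo when its payload equals the most recent
    payload read from the same file descriptor.
    """
    last_read = {}  # fd -> payload of the most recent read on that fd
    echoes = []
    for index, (syscall, fd, payload) in enumerate(events):
        if syscall == "read":
            last_read[fd] = payload
        elif syscall == "write" and last_read.get(fd) == payload:
            echoes.append((index, payload))
    return echoes


# Hand-transcribed from the client trace (fd 3 is the rrdcached socket).
BASE = "/opt/rrd/ganglia/Management/__SummaryInfo__/"
events = [
    ("write", 3, "flush " + BASE + "load_one.rrd\n"),
    ("read", 3, "0 Successfully flushed " + BASE + "load_one.rrd.\n"),
    ("write", 3, "0 Successfully flushed " + BASE + "load_one.rrd.\n"),
    ("write", 3, "flush " + BASE + "proc_run.rrd\n"),
    ("read", 3, "-1 Unknown command: 0\n"),
]

for index, payload in find_echoed_reads(events):
    print("event %d echoes the previous read: %r" % (index, payload))
```

Only the third event is flagged, which matches the stray write(0x3, "0 Successfully flushed ...") that appears just before the second flush command.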

I guess the bug is in rrd_client.c or rrd_graph.c, but I can't see why this 
would break on Solaris and not Linux.

Anything else I can test?

Many thanks,
Peter.


