[rrd-users] pipe mode deadlock

Damien BENOIST damien.benoist at mail.com
Mon Jan 29 17:12:59 CET 2007


Hello,

I'm using rrdtool pipe mode with a ksh script.
rrdtool and the script both end up blocked, writing to each other
through the pipe.

The script loops; in each iteration it:
- writes a command to the pipe (1 line)
- reads 1 line from the pipe

As the documentation on the web site says:
When a command is completed, RRDtool will print the string 'OK', ...
If an error occurs, a line of the form 'ERROR: Description of error'
will be printed instead

But it seems that rrdtool sometimes sends more than 1 line in response
to a command.
The extra lines accumulate in the pipe until it is full.
Then both the script and rrdtool block when they try to write more data
to the pipe.
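
A minimal sketch of how the extra lines pile up, assuming a failed command
produces two reply lines (an ERROR line and then an OK line). The reply text
is made up for illustration and no rrdtool process is involved; the loop reads
one line per command, as my script does.

```shell
# Simulated replies to 3 commands; the 2nd command fails and yields 2 lines.
# (Made-up text standing in for rrdtool's output.)
replies='OK u:0.01 s:0.02 r:0.10
ERROR: illegal attempt to update
OK u:0.01 s:0.02 r:0.11
OK u:0.01 s:0.02 r:0.12'

unread=$(
        printf '%s\n' "$replies" | {
                for cmd in 1 2 3
                do
                        IFS= read -r msg        # one read per command
                done
                wc -l | tr -d ' '               # lines still sitting in the pipe
        }
)
echo "lines left unread after 3 commands: $unread"
```

Each failed command leaves one more line behind, so over a long run the
backlog can only grow.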

If I remember my C correctly, an OK message is always sent after a command,
whether or not it previously sent an error message.

So what is the protocol to communicate with rrdtool in pipe mode?
Do I have to use a newer version of rrdtool?

Thanks for your help.


# rrdtool -v
RRDtool 1.0.48  Copyright 1997-2004 by Tobias Oetiker <tobi at oetiker.ch>

# uname -a
SunOS bdv02 5.8 Generic_108528-13 sun4u sparc SUNW,UltraAX-i2

The following function handles the communication with rrdtool:

rrdt()
{
        print -p "$@"
        read -p msg
        case $msg in
                OK*)            return 0
                                ;;
                ERROR*)         echo $msg "($@)"
                                return 1;
                                ;;
                *)              echo unhandled rrdtool message: $msg
                                ;;
        esac
        echo respawning rrdtool
        $rrdtool - |&
        return 1
}


state of the script:

# truss -p [scriptPid]
write(58, " u p d a t e   - t   k o".., 96) (sleeping...)

# pfiles [scriptPid]
[scriptPid]:   /bin/ksh stat.ksh
  Current rlimit: 1024 file descriptors
   0: S_IFIFO mode:0000 dev:254,0 ino:788 uid:1139 gid:200 size:9430
      O_RDWR
   1: S_IFIFO mode:0000 dev:254,0 ino:789 uid:1139 gid:200 size:0
      O_RDWR
   2: S_IFIFO mode:0000 dev:254,0 ino:790 uid:1139 gid:200 size:0
      O_RDWR
  58: S_IFIFO mode:0000 dev:254,0 ino:803 uid:1139 gid:200 size:0
      O_RDWR FD_CLOEXEC
  59: S_IFIFO mode:0000 dev:254,0 ino:802 uid:1139 gid:200 size:0
      O_RDWR FD_CLOEXEC
  60: S_IFIFO mode:0000 dev:254,0 ino:802 uid:1139 gid:200 size:9299
      O_RDWR FD_CLOEXEC
  61: S_IFREG mode:0600 dev:0,2 ino:6743166 uid:1139 gid:200 size:0
      O_RDWR|O_LARGEFILE FD_CLOEXEC
  62: S_IFREG mode:0744 dev:261,9 ino:789123 uid:1139 gid:200 size:8333
      O_RDONLY|O_LARGEFILE FD_CLOEXEC

# truss -x write -p [scriptPid]        # get the pointer to the message
write(58, 0x000532C0, 96)       (sleeping...)

# gcore -o ksh.core [scriptPid]        # dump the process
gcore: ksh.core.[scriptPid] dumped

# dbx ksh ksh.core.[scriptPid]     # get the full message
(/usr/sparcworks2.6/bin/dbx) print ((char*)0x000532C0)
(char *) 340672 = 0x532c0 "update -t kos rrd/SV/SVxxxxxxx.xxxx-xxxxxxx-ACCES/HTscnDispCnx-xxxx-xxxxxxx-R1.rrd 1169449740:8\nmum one second step) (update -t kos rrd/SV/xxxxx.xxxx-xxx-ACCES/HTscnDispReq-xxxx-xxx-R1.rrd 1169449740:18)\n)\n49740:63)\n\n"


state of rrdtool:

# truss -p [rrdtoolPid]
write(1, " E R R O R :   i l l e g".., 143) (sleeping...)

# pfiles [rrdtoolPid]
[rrdtoolPid]:   rrdtool -
  Current rlimit: 1024 file descriptors
   0: S_IFIFO mode:0000 dev:254,0 ino:803 uid:1139 gid:200 size:9269
      O_RDWR
   1: S_IFIFO mode:0000 dev:254,0 ino:802 uid:1139 gid:200 size:0
      O_RDWR
   2: S_IFIFO mode:0000 dev:254,0 ino:790 uid:1139 gid:200 size:0
      O_RDWR

# truss -x write -p [rrdtoolPid]        # get the message pointer
write(1, 0x0010ABE4, 143)       (sleeping...)

# gcore -o rrd.core [rrdtoolPid]        # dump the process
gcore: rrd.core.[rrdtoolPid] dumped

# dbx rrdtool rrd.core.[rrdtoolPid]     # get the full message
...
Current function is main
  258                   fflush(stdout); /* this is important for pipes to work */
(/usr/sparcworks2.6/bin/dbx) print ((char*)0x0010ABE4)
(char *) 1092580 = 0x10abe4 "ERROR: illegal attempt to update using time 1169449501 when last update time is 1169450951 (minimum one second step)\nOK u:0.76 s:1.87 r:213.49\n"

Here rrdtool is sending two messages (two lines) in a single write:
the error message followed by the OK message.
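
Given that, one possible workaround is to keep reading until the OK line
instead of reading exactly one line. This is only a sketch under two
assumptions: the co-process has been attached to numbered descriptors (the
numbers 3 and 4 are an arbitrary choice), and every reply ends with a line
starting with OK, as in the dump above. If some version sends ERROR instead of
OK, this loop would wait forever and would have to stop at the ERROR line too.

```shell
# Send one command to rrdtool and consume the whole reply.
# Assumes fd 3 is rrdtool's stdin and fd 4 its stdout, e.g. in ksh:
#       $rrdtool - |&
#       exec 3>&p
#       exec 4<&p
# Assumes every reply is terminated by a line starting with "OK".
rrdt()
{
        rrdt_rc=0
        printf '%s\n' "$*" >&3
        while IFS= read -r msg <&4
        do
                case $msg in
                OK*)    return $rrdt_rc ;;      # end of this reply
                ERROR*) echo "$msg ($*)"
                        rrdt_rc=1 ;;            # keep reading up to OK
                *)      echo "unhandled rrdtool message: $msg" ;;
                esac
        done
        return 2        # EOF: rrdtool has exited, the caller may respawn it
}
```

With this, a failed update consumes both reply lines, so nothing is left
behind in the pipe for the next call.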




