[rrd-developers] rrdcached shutdown

Florian Forster rrdtool at nospam.verplant.org
Fri Sep 26 09:24:05 CEST 2008


Hi Tobi,

On Thu, Sep 25, 2008 at 11:43:48PM +0200, Tobias Oetiker wrote:
> the point of using TERM is that as a system goes down, normally
> processes that are still hanging around are sent TERM and shortly
> after KILL, so it is a good thing for a process to quickly get ready
> to die when he gets TERM.

yes, but before any of that SIGTERM/SIGKILL business, the init scripts
are run. And in the init script you can do something like:
 . /etc/default/rrdtool
 # FLUSH_ON_EXIT=0 selects the fast exit; if unset, default to flushing.
 if test "${FLUSH_ON_EXIT:-1}" -eq 0
 then
   kill -USR1 `pidof rrdcached`
 else
   kill -TERM `pidof rrdcached`
 fi

Additionally an init script could provide two stop actions:
 # /etc/init.d/rrdcached stop
 # /etc/init.d/rrdcached stop-noflush
(And appropriate restart actions, of course.)
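
A minimal sketch of how such an init script could pick the signal for each
action. The helper names `signal_for_action' and `stop_daemon' are made up
for illustration, and the mapping (TERM = flush, USR1 = fast exit) is only
the proposal from this thread, not something rrdcached is known to implement:

```shell
#!/bin/sh
# Hypothetical helper: map an init-script action to the signal to send.
# TERM = flush and exit, USR1 = exit without flushing (as proposed here).
signal_for_action() {
    case "$1" in
        stop|restart)                 echo TERM ;;
        stop-noflush|restart-noflush) echo USR1 ;;
        *)                            return 1 ;;
    esac
}

# Hypothetical helper: send the chosen signal to every running rrdcached.
stop_daemon() {
    sig=$(signal_for_action "$1") || {
        echo "unknown action: $1" >&2
        return 2
    }
    # pidof exits non-zero when no process matches; nothing to do then.
    pids=$(pidof rrdcached) || return 0
    # $pids is left unquoted on purpose: it may hold several PIDs.
    kill -"$sig" $pids
}
```

The script's action dispatch would then call `stop_daemon "$1"' for the stop
actions, and follow it with the usual start code for the restart actions.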

> also when a user does kill PID the process should die and not suddenly
> start using the disk like mad for 20 minutes ... if it does that the
> user will send it a kill -KILL and this may not be what we want at all
> ...

No, when a user does a `kill <pid>', a *daemon* should catch the signal
and *shut down gracefully*. Don't let the name of the tool mislead you;
think `sendsignal' instead ;)

If the user wants to use the ``fast exit'' feature, he can use
`SIGUSR1'. If he made a mistake and sent `SIGTERM' first, it should be
possible to send a `SIGUSR1' afterwards and have the daemon exit
quickly.
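
The intended behaviour can be modelled in a few lines of shell. This is only
a toy sketch of the semantics described above, not rrdcached's code: the
names `install_handlers' and `flush_queue' and the two flags are invented
for illustration.

```shell
#!/bin/sh
# Toy model of the proposed semantics; not rrdcached's actual code.
shutting_down=0   # set once any stop signal has arrived
fast_exit=0       # set by USR1: skip whatever has not been flushed yet

# Hypothetical helper: install the two signal handlers.
install_handlers() {
    trap 'shutting_down=1' TERM
    trap 'shutting_down=1; fast_exit=1' USR1
}

# Flush the given queue entries one by one and print how many were written.
# The flag is checked before every entry, so a USR1 that arrives after a
# TERM cuts the flush short instead of being ignored.
flush_queue() {
    flushed=0
    for entry in "$@"; do
        if [ "$fast_exit" -eq 1 ]; then
            break
        fi
        # A real daemon would write the cached updates for $entry here.
        flushed=$((flushed + 1))
    done
    echo "$flushed"
}
```

The point of the design is that the fast-exit flag is consulted inside the
flush loop, so the order in which the two signals arrive does not matter.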

If the user really wants to *kill* the process, potentially losing data
and destroying sensitive data structures, he can use the `KILL' signal,
but should take into account that this can screw things up. (By the way,
does anyone know if killing a process during the update phase can render
RRD files unusable?)

So the user has the choice and an IO-intensive mistake is easily
corrected. I hope nobody is ever going to ask questions along the lines
of ``I'm collecting data for 500k RRD files and suddenly my disks are
busy, what's up?''. (And if this happens, it merely shows we did a good
job and the user didn't read the documentation ;)

If you'd rather not use `SIGUSR1' or `SIGUSR2' for the ``exit right
now'' feature, what about one of the other numerous unused signals, for
example `SIGQUIT'? I think most people associate this signal with
something between the normal, graceful shutdown of `SIGTERM' and the
absolute, brutal and immediate termination of `SIGKILL'.

As I said before: I dislike using `SIGINT' for this purpose, because
there is not much difference between `SIGINT' and `SIGTERM'. For me, INT
is TERM sent from the keyboard ;)

Regards,
-octo
-- 
Florian octo Forster
Hacker in training
GnuPG: 0x91523C3D
http://verplant.org/