[rrd-developers] [rrd] Re: rrdcached dies when journal reaches 2G

kevin brintnall kbrint at rufus.net
Fri Jul 17 22:18:14 CEST 2009


>> For large file support, we need to consider the things that link against
>> librrd.  We can either require all dependents (transitively) to enable
>> large file support (bad) or we could remove any functions that use off_t
>> from the public librrd interface.  It looks like all such functions are
>> already deprecated.

This doesn't look like a problem.  Also, perhaps 1.4 is a good time to
remove these functions from the public-facing API altogether?

> > In the end, it's probably better to just rotate the journals before 2^31
> > bytes.  That way, we can support systems that do not have large file
> > support.  There are some implicit assumptions during journal replay that
> > I'll have to take a look at.
> 
> rotation sounds good ... the current rrd code can not deal with
> files larger 2gb on 32 bit system reliably anyway ...

I've started to implement both options.  Large file support is dead
simple, whereas split journal support is much more complicated.  There
are three cases:

 (1) go with large file support
     (1a) OS supports it
     (1b) OS doesn't support it

 (2) split journal

In cases (1b) and (2), we need to watch the journal size as we write to
it, and roll over to a new file once the current file gets too large.  For
(1b), a full tree flush is also required.
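
To make the roll-over idea concrete, here is a minimal sketch in Python
(not the rrdcached C code).  The class name, the journal.NNNNNN file
naming, and the on_rotate hook are all made up for illustration; the hook
only stands in for where a full tree flush could be triggered in case (1b).

```python
import os

# Largest file offset representable in a signed 32-bit off_t.
JOURNAL_LIMIT = 2**31 - 1


class RotatingJournal:
    """Append-only journal that rolls over before exceeding max_bytes.

    Hypothetical illustration only; names and file layout are assumptions.
    """

    def __init__(self, directory, max_bytes=JOURNAL_LIMIT, on_rotate=None):
        self.directory = directory
        self.max_bytes = max_bytes
        self.on_rotate = on_rotate  # assumed hook, e.g. a full flush for (1b)
        self.index = 0
        self._open()

    def _path(self, index):
        return os.path.join(self.directory, "journal.%06d" % index)

    def _open(self):
        self.fh = open(self._path(self.index), "ab")
        # Start from the real on-disk size in case the file already exists.
        self.size = os.fstat(self.fh.fileno()).st_size

    def _rotate(self):
        self.fh.close()
        if self.on_rotate is not None:
            self.on_rotate()
        self.index += 1
        self._open()

    def write(self, record):
        # Check *before* writing so the current file never passes the limit.
        # An empty file always accepts one record, even an oversized one.
        if self.size and self.size + len(record) > self.max_bytes:
            self._rotate()
        self.fh.write(record)
        self.size += len(record)

    def close(self):
        self.fh.close()
```

The size check happens before each write, so a file never grows past the
limit.  A real implementation would also have to replay several journal
files in order, which is where most of the extra complexity of (2) would
come from.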

The forced flush requirement for (1b) doesn't seem too onerous when we
consider that a journal limited to 2^31 bytes will nearly always be on a
system whose per-process memory space is limited to 2^32 bytes (or
smaller).  Under these conditions, there are only a couple of cases where
rrdcached will not run out of memory anyway.  With a small -w, the
forced flush won't have many values to write.  With a small -f, the
flush would happen at most twice as often as normal.

I'm looking for some old OSes without large file support to test with, but
I'm having a hard time finding one.  Perhaps that's a good sign.  Do you
have any demographic info on the RRD install base?

I'm leaning strongly towards (1) after seeing the implications on the
code...

-- 
 kevin brintnall =~ /kbrint at rufus.net/


