[rrd-developers] rrdcached + collectd issues

Sat Oct 10 03:04:15 CEST 2009

On Fri, Oct 09, 2009 at 04:41:55PM -0700, Thorsten von Eicken wrote:
> > It looks like your RRD files must have a very large number of DS?  Almost
> > 400?
> 
> No, in fact most have 1 (that's how collectd likes it). You may be 
> looking at the received/written ratio which is skewed by journal replay 
> and the very long cache period I configure (1 hour).

You're right: that's an indicator of how many values are cached, not how
many DSs there are.  It's been a while since I looked at the stats code.

> Yes, the question is whether it's collectd's fault or rrdcached's fault..

The protocol is such that the client should wait for the server's response
before continuing.  There is no notion of "pipelined" operation, except in
BATCH mode.
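
To make the lock-step behaviour concrete, here is a minimal sketch of what
a client has to do with each response before it may send the next command.
It assumes the documented response format: a status line that starts with
an integer code, where a negative code signals an error and a positive code
gives the number of lines that follow.  The helper names are hypothetical
and are not part of rrdcached or librrd.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper, not part of rrdcached or librrd.
 * Parses the leading status code of a response line.
 * Returns 0 and stores the code in *status on success,
 * -1 if the line does not start with a number. */
static int parse_status_line(const char *line, long *status)
{
    char *end = NULL;
    long code;

    errno = 0;
    code = strtol(line, &end, 10);
    if (end == line || errno != 0)
        return -1;

    *status = code;
    return 0;
}

/* Number of additional lines the client must read before it is
 * allowed to send its next command.  Negative codes are errors and
 * carry no extra lines. */
static long lines_to_read(long status)
{
    return (status > 0) ? status : 0;
}
```

A client that sends its next command before consuming the status line and
the lines it announces is relying on behaviour the protocol does not
promise.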

> >> The journal replay is too slow. When I terminate the daemon it leaves 
> >> several GB of journal files behind. Reading those in takes the better 
> >> part of an hour, during which the daemon is unresponsive.
> > 
> > That seems awfully long.  Mine is able to replay 83M entries (about 8GB)
> > in 7 minutes.
> 
> One thing I noticed is that it starts out quite fast and gradually slows 
> down. I may be able to run a more controlled experiment. The more I 
> think about it, I have the feeling some rrdcached data structure is 
> getting slower and slower as it grows. How big is your process? As I 
> mentioned, mine is 0.8GB.

Mine is steady at 1.2GB.  rrdcached will realloc() the cache_item_t.values
on every update.  If your realloc() implementation is slow, that could
cause the decaying performance.
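
If the allocator does turn out to be the problem, one common way to take
pressure off it is to grow the array geometrically instead of by one slot
per update.  The following is only a sketch of that technique: the struct
and function names are hypothetical stand-ins, not the actual
cache_item_t code, and the initial capacity of 8 is an arbitrary choice.

```c
#include <stdlib.h>

/* Hypothetical stand-in for the per-file list of pending values. */
typedef struct {
    char  **values;
    size_t  values_num;   /* slots in use */
    size_t  values_cap;   /* slots allocated */
} value_list_t;

/* Appends one value.  realloc() is only called when the array is
 * full, and the capacity doubles each time, so N appends cause
 * O(log N) reallocations rather than N. */
static int value_list_append(value_list_t *vl, char *value)
{
    if (vl->values_num == vl->values_cap) {
        size_t new_cap = (vl->values_cap == 0) ? 8 : vl->values_cap * 2;
        char **tmp = realloc(vl->values, new_cap * sizeof(*tmp));
        if (tmp == NULL)
            return -1;    /* old array is still valid */
        vl->values = tmp;
        vl->values_cap = new_cap;
    }

    vl->values[vl->values_num++] = value;
    return 0;
}
```

The trade-off is memory: with doubling, up to half of each array can be
unused, which matters when there are many cached files.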

> >> Most of time is in buffer_get_field. (Note: in the most common cases
> >> buffer_get_field copies each field in-place, character by
> >> character. Seems to me that a simple if statement could avoid the
> >> writes.)
> > 
> > Agreed that buffer_get_field implementation is not optimal.  From what I
> > can tell, it copies this way for three reasons:
> > 
> > (1) desire to provide null terminated string to caller
> > (2) do not want to modify original string (in case need to write it to journal)
> > (3) allow escaped characters (presumably space)
> 
> Mhh, I don't think my C has gotten that rusty. Please look at the code. 
> Here are significant snippets:
>    buffer = *buffer_ret;
>    field = *buffer_ret;
>    field[field_size] = buffer[buffer_pos];
> That's in-place modification as far as I can tell. In fact, in most 
> cases (no \ escape chars) it actually reads and writes the same byte in 
> the above assignment, so it doesn't actually copy anything.

Yes, it looks like you're right: it does indeed modify in-place.  In that
case, it should be relatively easy to optimize.  Let me think on it.
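
As a starting point, here is a rough sketch of the "simple if statement"
idea: keep separate read and write positions and only store a byte once
they differ, which happens only after an escape character has been seen.
This is a simplified, hypothetical function, not the real
buffer_get_field.  It assumes the buffer is a NUL-terminated string, that
*len is its strlen(), and that fields are separated by a single space.

```c
#include <stddef.h>

/* Hypothetical, simplified field splitter; NOT the rrdcached function.
 * Splits the first space-delimited field off *buf in place and
 * advances *buf / *len past it.  A backslash escapes the next byte.
 * Returns 0 on success, -1 on empty input or a dangling backslash. */
static int get_field(char **buf, size_t *len, char **field)
{
    char  *s  = *buf;
    size_t n  = *len;
    size_t rd = 0;    /* read position */
    size_t wr = 0;    /* write position */

    if (n == 0)
        return -1;

    while (rd < n && s[rd] != ' ') {
        if (s[rd] == '\\') {
            if (rd + 1 >= n)
                return -1;    /* dangling backslash */
            rd++;             /* take the escaped byte literally */
        }
        if (wr != rd)         /* common case: no escapes, no store */
            s[wr] = s[rd];
        wr++;
        rd++;
    }

    s[wr] = '\0';             /* terminate the field */
    if (rd < n)
        rd++;                 /* skip the delimiter */

    *field = s;
    *buf   = s + rd;
    *len   = n - rd;
    return 0;
}
```

In the no-escape case the only write per field is the terminating NUL.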

-- 
 kevin brintnall =~ /kbrint at rufus.net/