[rrd-developers] rrdcached contention when flushing

kevin brintnall kbrint at rufus.net
Tue Nov 4 23:01:00 CET 2008


On Tue, Nov 04, 2008 at 08:44:22PM -0000, Daniel.Pocock at barclayscapital.com wrote:
> I've shrunk it down to monitor just one host, that successfully avoids
> the mmap() issue with valgrind.  The first thing that stands out is
> this:
> 
> ==9115== 15,872 bytes in 32 blocks are possibly lost in loss record 10 of 13
> ==9115==    at 0x4A04A1D: memalign (vg_replace_malloc.c:332)
> ==9115==    by 0x4A04A76: posix_memalign (vg_replace_malloc.c:425)
> ==9115==    by 0x3314041F50: (within /lib64/libglib-2.0.so.0.1200.3)
> ==9115==    by 0x331404326F: g_slice_alloc (in /lib64/libglib-2.0.so.0.1200.3)
> ==9115==    by 0x331404BFFD: (within /lib64/libglib-2.0.so.0.1200.3)
> ==9115==    by 0x331404C221: (within /lib64/libglib-2.0.so.0.1200.3)
> ==9115==    by 0x404945: handle_request_update (rrd_daemon.c:1456)
> ==9115==    by 0x404F2E: handle_request (rrd_daemon.c:1633)
> ==9115==    by 0x405A3A: connection_thread_main (rrd_daemon.c:1987)
> ==9115==    by 0x33134062F6: start_thread (in /lib64/libpthread-2.5.so)
> ==9115==    by 0x33120CE85C: clone (in /lib64/libc-2.5.so)

Dan,

When files are flushed out to disk, their tree nodes remain.  That way, the
structure of the tree doesn't have to be re-balanced over and over for a
static working set.

The nodes are removed from the tree only when old, un-modified nodes are
evicted.  This happens on every flush (each time the -f timer fires).
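
The eviction rule above can be sketched roughly as follows.  This is an
illustration only: the struct and function names are hypothetical and are
not the ones used in rrd_daemon.c.  It assumes "old" means "not flushed
since some cutoff time" and "un-modified" means "no values waiting to be
written".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical stand-in for a cache node; not the real cache_item_t. */
typedef struct {
    time_t last_flush_time; /* when this node was last written to disk */
    size_t values_pending;  /* updates queued since the last flush */
} node_sketch_t;

/* A node is evicted only if it is both old and un-modified.  A node
 * flushed after the cutoff stays in the tree, so a static working set
 * never forces the tree to be re-balanced. */
bool should_evict(const node_sketch_t *n, time_t cutoff)
{
    assert(n != NULL);
    return n->values_pending == 0 && n->last_flush_time <= cutoff;
}
```

A node with pending values is never dropped under this rule, however old
it is; it would be flushed first and become a candidate on a later pass.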

> In rrd_daemon.c, I found the tree initialised like this:
> 
>   cache_tree = g_tree_new((GCompareFunc)strcmp);
> 
> There is an alternate form of g_tree_new() that allows you to specify
> functions for freeing the memory used by keys - do you think this could
> be an issue?  Are the keys meant to be re-used as new values come in, or
> do they need to be completely freed and then re-created when new values
> arrive for a given file?

When the nodes are removed, their data is freed with forget_file().  This
frees the cache_item_t.file pointer, which is also used for the gtree key.
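
A minimal sketch of that ownership model, in plain C without GLib.  Only
the "file" member comes from the description above; item_sketch_t,
item_new(), item_key() and item_forget() are hypothetical names used for
illustration, with item_forget() standing in for forget_file().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down cache item. */
typedef struct {
    char *file; /* heap-allocated path, also handed to the tree as key */
} item_sketch_t;

/* Allocate an item and its path.  Returns NULL on allocation failure. */
item_sketch_t *item_new(const char *path)
{
    assert(path != NULL);
    item_sketch_t *it = malloc(sizeof *it);
    if (it == NULL)
        return NULL;
    size_t len = strlen(path) + 1;
    it->file = malloc(len);
    if (it->file == NULL) {
        free(it);
        return NULL;
    }
    memcpy(it->file, path, len);
    return it;
}

/* The tree key is the same pointer as it->file, not a copy. */
const char *item_key(const item_sketch_t *it)
{
    assert(it != NULL);
    return it->file;
}

/* Stand-in for forget_file(): frees the path exactly once.  Because the
 * key aliases it->file, nothing else may free the key. */
void item_forget(item_sketch_t *it)
{
    if (it == NULL)
        return;
    free(it->file);
    free(it);
}
```

Under this model, registering a key destroy function through
g_tree_new_full() would risk freeing the same pointer twice, which is why
the plain g_tree_new() form fits here.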

The tree is not freed on exit.  That's probably the cause of a few of the
loss records.  I'll work on a patch to destroy the cache data structure
and free a few dangling mallocs on shutdown.

That still should not cause your process to bloat to several gigabytes.
The size of the tree should be O(node_count * PATH_MAX).
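
To put a number on that bound, here is a small helper.  The function
name tree_size_bound() is made up for this example, and the fallback
value of 4096 for PATH_MAX is an assumption (it is the usual value on
Linux).  It counts only one path of at most PATH_MAX bytes per node and
ignores per-node overhead and any values still queued.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

#ifndef PATH_MAX
#define PATH_MAX 4096 /* fallback assumption; the usual value on Linux */
#endif

/* Rough upper bound on the bytes used by tree keys: one path of at most
 * PATH_MAX bytes per node.  Returns SIZE_MAX if the product would
 * overflow size_t. */
size_t tree_size_bound(size_t node_count)
{
    if (node_count > SIZE_MAX / (size_t)PATH_MAX)
        return SIZE_MAX;
    return node_count * (size_t)PATH_MAX;
}
```

With PATH_MAX at 4096, 100,000 nodes come to at most about 410 MB, and
real paths are far shorter than PATH_MAX, so the tree by itself should
stay well below the gigabyte range.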

Try writing a single process that writes directly to many files using
rrd_update() without going through the daemon.  If that leaks, it's not in
rrdcached.

What does "STATS" from your daemon look like?

-- 
 kevin brintnall =~ /kbrint at rufus.net/
