[rrd-developers] rrdcached contention when flushing
kevin brintnall
kbrint at rufus.net
Tue Nov 4 18:12:09 CET 2008
On Tue, Nov 04, 2008 at 04:57:27PM -0000, Daniel.Pocock at barclayscapital.com wrote:
>>> They all become un-stuck at the same time, maybe 20 seconds later, and
>>> then the graphs appear very quickly.
One idea is to trace rrdcached while it runs and see whether it's
blocking on any I/O syscalls when the stalls happen.
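
A minimal sketch of that tracing idea, assuming a Linux host with strace
installed (the tool is an assumption; none was named above). The helper
names trace_rrdcached and slow_syscalls and the default output path are
hypothetical. With -T, strace appends the time spent in each syscall as a
trailing <seconds> field, which the filter uses to pick out slow calls.

```shell
# Attach to a running rrdcached and log every syscall to a file.
#   -f   follow threads and child processes
#   -tt  wall-clock timestamps with microseconds
#   -T   append the time spent inside each syscall as <seconds>
# $1 = pid of rrdcached, $2 = output file (illustrative default)
trace_rrdcached() {
    strace -f -tt -T -o "${2:-/tmp/rrdcached.strace}" -p "$1"
}

# Read strace -T output on stdin and print only the calls that took at
# least $1 seconds (default 1.0).
slow_syscalls() {
    awk -v min="${1:-1.0}" '
        match($0, /<[0-9]+\.[0-9]+>$/) {
            t = substr($0, RSTART + 1, RLENGTH - 2)
            if (t + 0 >= min + 0) print
        }'
}
```

After a stall, something like "slow_syscalls 1.0 < /tmp/rrdcached.strace"
should show which calls were blocked and for how long.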
> I've experimented with sysctl, here are values I'm currently using:
>
> vm.dirty_expire_centisecs = 179971
> vm.dirty_writeback_centisecs = 35993
> vm.dirty_ratio = 90
> vm.dirty_background_ratio = 2
> vm.max_map_count = 4000000
>
> If I understand correctly, then vm.dirty_ratio means nothing should
> block until 90% of the RAM is taken up by dirty pages. Given that
> mmap() is being used with MAP_SHARED, and I have 8GB of RAM, all the
> necessary pages should be staying in RAM. If you can suggest a more
> appropriate strategy for configuring the cache, it would be very
> welcome.
I don't think I can compete with the many tuning resources already out
there. I'm primarily a FreeBSD guy. Here's what I'm using on my one
Linux box.
for disk in sda sdb ; do
    ## give the scheduler something to work with
    echo 512 > /sys/block/$disk/queue/nr_requests
    ## set read-ahead to 2 file system blocks
    blockdev --setra 16 /dev/$disk
done
echo 90000 > /proc/sys/vm/dirty_writeback_centisecs
echo 35 > /proc/sys/vm/dirty_background_ratio
echo 85 > /proc/sys/vm/dirty_ratio
This lets large write bursts be absorbed in RAM... but when it blocks (in
writeback), all I/O blocks. It's not optimal.
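
To check whether the stalls line up with writeback, one option is to
sample the kernel's Dirty and Writeback counters while the graphs hang.
A rough sketch, assuming Linux's /proc/meminfo; dirty_writeback_kb is a
hypothetical helper name, and the optional file argument exists only so
the function can be pointed at a saved copy.

```shell
# Print the Dirty and Writeback counters (in kB) from a meminfo-style
# file. Defaults to the live /proc/meminfo.
dirty_writeback_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { printf "%s %s\n", $1, $2 }' \
        "${1:-/proc/meminfo}"
}

# Example: sample once per second while reproducing the stall.
# while sleep 1; do date +%T; dirty_writeback_kb; done
```

If Writeback jumps at the moment everything hangs, that would point at the
flush behaviour rather than at rrdcached itself.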
> There is also a memory leak somewhere (maybe in my striping code, maybe
> in rrdcached). I've tried to start rrdcached with valgrind, but my
> large mmap() call fails with EINVAL when using valgrind.
I've been running rrdcached for weeks with no leaks. About 30k files in
my test environment, step=300.
> The memory leak could be the cause of the performance issue - it grows
> to several gigabytes and there is swapping, that might be reducing the
> amount of RAM available for caching the mmap() pages. Can you make any
> suggestions for using valgrind or another tool in this scenario?
The leak may be causing other problems. I'd try running a separate
instance with smaller files and see if you can get valgrind to cooperate.
I haven't found a better tool for tracking down leaks.
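
A sketch of what such a separate instance might look like, assuming
valgrind and rrdcached are both on the PATH. The helper name
valgrind_rrdcached, the VALGRIND override, and the socket, pid, and log
file names are all hypothetical; the base directory is expected to hold
the smaller test files.

```shell
# Run a second rrdcached in the foreground under valgrind, isolated in
# its own base directory ($1).
#   rrdcached: -g stay in foreground, -b base directory,
#              -l listen address, -p pid file
#   valgrind:  %p in --log-file expands to the process id
# VALGRIND can be overridden (e.g. VALGRIND=echo) to preview the command.
valgrind_rrdcached() {
    base=$1
    ${VALGRIND:-valgrind} --leak-check=full --num-callers=20 \
        --log-file="$base/valgrind.%p.log" \
        rrdcached -g -b "$base" \
        -l "unix:$base/rrdcached.sock" -p "$base/rrdcached.pid"
}
```

Stopping the daemon cleanly (Ctrl-C or SIGTERM) lets valgrind write its
leak summary to the log file.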
--
kevin brintnall =~ /kbrint at rufus.net/