[rrd-users] rrdtool --lazy option seems to be not working

Nugin, Paavo Paavo.Nugin at krediidiinfo.ee
Wed Nov 27 14:40:44 CET 2013


I've spent 3 days trying to migrate our Munin server from a 32-bit to a 64-bit instance (both are VMs with pretty much the same settings and OS, SLES11).
The problem is that the new instance simply does not perform: it can't update all the PNG files within the 5-minute timeframe (this was perfectly OK on the 32-bit instance).
I've traced it down to rrdtool and its --lazy option.

I've made a simple test.sh, taken from the Munin debug output. It is about 37K and consists of one big "rrdtool graph" command, which generates the PNG of the yearly irqstats graph for one server (the irqstats graphs seem to have the most source data and take the longest to calculate).

When I delete the yearly PNG file on the OLD (32-bit) server and run test.sh, the first run takes a pretty long time, but all subsequent runs take only ~0.06s.
When I delete the yearly PNG file on the NEW (64-bit) server and run test.sh, the first run takes ~1.2s, but all subsequent runs take 0.6-0.8s, which is more than 10 times longer than on the old server! The timestamp of the PNG file remains the same each time; it does not get updated.
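
The measurement described above can be repeated with a small helper. This is only a sketch: the function name time_runs is made up for illustration, and it assumes GNU date (for %N) and GNU stat (for -c), as found on Linux systems such as SLES.

```shell
#!/bin/sh
# Sketch: run a command several times and print, for each run, the wall
# time in milliseconds and the modification time of the output file.
# Assumes GNU date (%N) and GNU stat (-c). time_runs is a hypothetical name.

time_runs() {
    # $1 = output file to watch, $2 = number of runs, rest = command to run
    out=$1; runs=$2; shift 2
    i=1
    while [ "$i" -le "$runs" ]; do
        t0=$(date +%s%N)                       # nanoseconds since the epoch
        "$@" >/dev/null 2>&1 || echo "run $i: command failed" >&2
        t1=$(date +%s%N)
        ms=$(( (t1 - t0) / 1000000 ))
        mtime=$(stat -c %Y "$out" 2>/dev/null || echo missing)
        echo "run $i: ${ms} ms, mtime=$mtime"
        i=$((i + 1))
    done
}
```

For example, `time_runs /srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png 5 sh ./test.sh` would show whether later runs get faster and whether the mtime ever changes.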

I ran strace on both servers to see what's going on; here are excerpts from the point where they start to differ:
Old server:
[pid 13016] mremap(0xb6a7f000, 1236992, 1241088, MREMAP_MAYMOVE) = 0xb6a7f000
[pid 13016] stat64("/srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png", {st_mode=S_IFREG|0644, st_size=82285, ...}) = 0
[pid 13016] time(NULL)                  = 1385553085
[pid 13016] open("/srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png", O_RDONLY) = 3

After that it pretty much finishes (doesn't touch any .rrd file).

The same point on the new server:
[pid 28160] mremap(0x7f2d188ed000, 1302528, 1306624, MREMAP_MAYMOVE) = 0x7f2d188ed000
[pid 28160] open("/usr/lib/locale/\2/LC_NUMERIC", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid 28160] stat("/srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png", {st_mode=S_IFREG|0644, st_size=43415, ...}) = 0
[pid 28160] open("/srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png", O_RDONLY) = 3
[pid 28160] fstat(3, {st_mode=S_IFREG|0644, st_size=43415, ...}) = 0
[pid 28160] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f2d1cab5000
[pid 28160] read(3, "\211PNG\r\n\32\n\0\0\0\rIHDR\0\0\1\361\0\0\4\253\10\6\0\0\0\202\37\201"..., 4096) = 4096
[pid 28160] close(3)                    = 0
[pid 28160] munmap(0x7f2d1cab5000, 4096) = 0
[pid 28160] open("/mnt/data/munin/kisise/myyr.kisise-irqstats-iMIS-d.rrd", O_RDONLY) = 3
[pid 28160] fstat(3, {st_mode=S_IFREG|0644, st_size=2765800, ...}) = 0
[pid 28160] fadvise64(3, 0, 0, POSIX_FADV_RANDOM) = 0

After that it goes on to scan every single irqstats .rrd file of that server (about 100 of them), which is why it takes so much time.
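
To put a number on this, the .rrd files opened during a run can be listed from a saved strace log. The following is a sketch only: rrd_opens is a hypothetical helper name, and it assumes the log uses the open(...) or openat(...) line format shown in the excerpts above and that grep supports -o.

```shell
#!/bin/sh
# Sketch: print the distinct .rrd files that appear in open()/openat()
# calls in a strace log. rrd_opens is a hypothetical helper name.

rrd_opens() {
    # $1 = strace log file
    grep -E 'open(at)?\(' "$1" |      # keep only open/openat calls
        grep -o '"[^"]*\.rrd"' |      # extract the quoted .rrd path
        tr -d '"' |
        sort -u                       # one line per distinct file
}
```

A log could be captured with `strace -f -o lazy.trace sh ./test.sh` (the file name lazy.trace is arbitrary), after which `rrd_opens lazy.trace | wc -l` gives the number of .rrd files touched. Going by the excerpts above, a lazy run on the old server would be expected to give 0.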

It seems that the old server notices (like it should?) that the yearly PNG file is fresh enough and does not try to recreate it, but the new server stupidly tries to recalculate it?
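
What "fresh enough" might mean can be sketched as a check outside rrdtool. ASSUMPTION: the sketch treats a graph as out of date once it is older than the time span covered by one pixel column, (end - start) / width. Nothing in this post confirms that this is the rule rrdtool applies. The function names are hypothetical, and GNU stat is assumed.

```shell
#!/bin/sh
# Sketch of a possible freshness test for an existing graph file.
# ASSUMPTION (not confirmed by this post): a graph counts as stale when
# its age exceeds (end - start) / width seconds. Names are hypothetical.

seconds_per_pixel() {
    # $1 = graph time span in seconds, $2 = graph width in pixels
    echo $(( $1 / $2 ))
}

png_is_stale() {
    # $1 = graph file, $2 = span in seconds, $3 = width in pixels
    # Returns 0 (true) if the file is missing or older than one pixel step.
    [ -e "$1" ] || return 0
    now=$(date +%s)
    mtime=$(stat -c %Y "$1")          # GNU stat: mtime as epoch seconds
    [ $(( now - mtime )) -gt "$(seconds_per_pixel "$2" "$3")" ]
}
```

With the '--start' '-400d' and '--width' '400' arguments from test.sh, one pixel column would cover 86400 seconds. Under the stated assumption, a PNG written a moment ago should count as fresh on both servers, so `png_is_stale irqstats-year.png $((400 * 86400)) 400` would return false.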

test.sh is exactly the same on both servers; here's the head:
rrdtool 'graph' '--font'  'DEFAULT:0:DejaVuSans'   '--font'   'LEGEND:7:DejaVuSansMono'   '-W'   'Munin 1.4.6' '/srv/www/vhosts/munin.sise/kisise/myyr.kisise/irqstats-year.png'  '--title' 'Individual interrupts - by year'   '--start'  '-400d'  '--base'    '1000'  '-l'  '0;'  '--vertical-label' 'interrupts / second'  '--height'  '250'   '--width'   '400'   '--imgformat'    'PNG'    '--lazy' \ yada-yada-yada...

The rrdtool version on both servers is 1.3.7 (it was 1.4.7 on the new server, but I compiled and installed the older one to make sure that's not the cause).


Best regards,