Hi
I was checking out the new features of rrdtool 1.3.0 and ran
into an issue where a runaway rrdtool process ate all the memory on my
machine, to the point of completely stalling it. My platform is
Debian Etch:
$ uname -a
Linux schioetz 2.6.18-5-686 #1 SMP Tue Dec 18 21:24:20 UTC 2007 i686 GNU/Linux
with the default rrdtool package previously installed:
$ dpkg -l rrdtool
ii rrdtool 1.2.15-0.3 Time-series data storage and display system
I use a munin server to collect data from other machines on my LAN.
Yesterday I installed rrdtool 1.3.0 in a custom location and then
tried to run graph on one of the munin files. The rrdtool call looked
like this:
./bin/rrdtool graph router-net-up.png \
--imgformat=PNG \
--vertical-label="router net up" \
--start=end-1day \
--end=now \
--height=300 \
--width=640 \
--base=1000 \
--lower-limit=0 \
--upper-limit=52000 \
--rigid \
--color=BACK#000000 \
--color=SHADEA#000000 \
--color=SHADEB#000000 \
--color=FONT#DDDDDD \
--color=CANVAS#202020 \
--color=GRID#666666 \
--color=MGRID#AAAAAA \
--color=FRAME#202020 \
--color=ARROW#FFFFFF \
DEF:d=/var/lib/munin/quasi.internal/router.quasi.internal-if_ppp0-up-c.rrd:42:AVERAGE \
DEF:a=/var/lib/munin/quasi.internal/router.quasi.internal-if_eth0-up-c.rrd:42:AVERAGE \
AREA:d#606060:ppp0 \
LINE1:a#EE204D:eth0
This exact call has always worked with the default rrdtool install
(1.2.15), but with 1.3.0 it does not. Running strace on it, I see:
open("/var/lib/munin/quasi.internal/router.quasi.internal-if_eth0-up-c.rrd", O_RDONLY) = 4
fstat64(4, {st_mode=S_IFREG|0644, st_size=50604, ...}) = 0
fadvise64(4, 0, 0, POSIX_FADV_RANDOM) = 0
mmap2(NULL, 50604, PROT_READ, MAP_PRIVATE|MAP_NORESERVE, 4, 0) = 0xb78c9000
madvise(0xb78c9000, 50604, 0x1 /* MADV_??? */) = 0
madvise(0xb78c9000, 112, MADV_SEQUENTIAL|0x1) = 0
madvise(0xb78c9000, 120, MADV_SEQUENTIAL|0x1) = 0
madvise(0xb78c9000, 1296, MADV_SEQUENTIAL|0x1) = 0
madvise(0xb78c9000, 4, MADV_SEQUENTIAL|0x1) = 0
msync(0xb78c9000, 50604, MS_ASYNC) = 0
munmap(0xb78c9000, 50604) = 0
close(4) = 0
time(NULL) = 1213703114
mmap2(NULL, 1073152, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb77d0000
brk(0x80b1000) = 0x80b1000
brk(0x80d9000) = 0x80d9000
mmap2(NULL, 266240, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb778f000
brk(0x80b8000) = 0x80b8000
mmap2(NULL, 528384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb770e000
munmap(0xb778f000, 266240) = 0
mmap2(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb760d000
munmap(0xb770e000, 528384) = 0
mmap2(NULL, 2101248, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb740c000
munmap(0xb760d000, 1052672) = 0
mmap2(NULL, 4198400, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb700b000
munmap(0xb740c000, 2101248) = 0
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb680a000
munmap(0xb700b000, 4198400) = 0
mmap2(NULL, 16781312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb5809000
munmap(0xb680a000, 8392704) = 0
mmap2(NULL, 33558528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb3808000
munmap(0xb5809000, 16781312) = 0
mmap2(NULL, 67112960, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xaf807000
munmap(0xb3808000, 33558528) = 0
mmap2(NULL, 134221824, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xa7806000
munmap(0xaf807000, 67112960) = 0
mmap2(NULL, 268439552, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x97805000
munmap(0xa7806000, 134221824) = 0
mmap2(NULL, 536875008, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x77804000
That is, the process keeps calling mmap for anonymous regions whose
size doubles each time (a power of two plus one 4096-byte page),
reaching 512 MiB in the last call shown.
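
As a quick sanity check on that pattern, here is a small shell sketch. The
lengths are copied from the mmap2 calls in the trace above; the variable
names are my own, and the reading that this is a buffer being doubled on
every reallocation is only a guess from the trace, not something I have
confirmed in the rrdtool source.

```sh
#!/bin/sh
# mmap2 lengths copied from the strace output above (starting at 266240)
sizes="266240 528384 1052672 2101248 4198400 8392704 16781312 33558528 67112960 134221824 268439552 536875008"
page=4096        # each length is a power of two plus one 4 KiB page
doubling_ok=1    # stays 1 as long as every step is exactly a doubling
prev=0
for s in $sizes; do
    payload=$((s - page))
    # compare against twice the previous payload (skip the first entry)
    if [ "$prev" -ne 0 ] && [ "$payload" -ne $((prev * 2)) ]; then
        doubling_ok=0
    fi
    prev=$payload
done
echo "doubling_ok=$doubling_ok last_mib=$((prev / 1048576))"
```

If doubling_ok comes out as 1, every request in the trace is exactly twice
the previous one once the extra page is subtracted.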
Finally, an interesting observation: the same call works all right on
older files. It also works on the same file if I change only the
start time in the above call to 9am-1day, whereas a start time of
just 9am triggers the memory problem. I thought this might be of
interest.
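
For anyone who wants to reproduce this without stalling their machine, a
memory cap on the process should help. The following is a minimal sketch,
assuming a shell whose ulimit builtin supports -v (bash and dash on Linux
do; it is not required by POSIX). The helper name run_capped and the
512 MiB limit are my own choices for illustration.

```sh
#!/bin/sh
# Hypothetical helper: run a command in a subshell with a cap on its
# virtual memory, so that a runaway allocation fails instead of swapping.
# usage: run_capped LIMIT_KIB COMMAND [ARGS...]
run_capped() {
    limit_kib=$1
    shift
    (
        # the limit applies only inside this subshell and its children
        ulimit -v "$limit_kib" || exit 1
        "$@"
    )
}

# Harmless demonstration: the child shell reports the limit it runs under.
run_capped 524288 sh -c 'ulimit -v'
```

With this in place, the failing call can be run as run_capped 524288
followed by the full ./bin/rrdtool graph command line from above. I would
expect the large mmap to fail under the cap rather than exhaust memory,
though I have not checked how rrdtool reports that failure.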
Let me know if you'd like me to provide more input.
R.