[rrd-developers] [rrd] Re: rrdcached performance with >200k nodes

Mirek Lauš mirek.laus at gmail.com
Wed Jan 13 11:08:44 CET 2010


$ callgrind_annotate callgrind.out.72526
--------------------------------------------------------------------------------
Profile data file 'callgrind.out.72526' (creator: callgrind-3.5.0)
--------------------------------------------------------------------------------
I1 cache:
D1 cache:
L2 cache:
Timerange: Basic block 0 - 15555346483
Trigger: Program termination
Profiled target:  /usr/local/bin/rrdcached -l unix:/tmp/rrdcached.sock -w 3600 -z 3600 -t 2 -f 7200 -m 64 -j /var/rrd/journal -p /tmp/rrdcached.pid (PID 72526, part 1)
Events recorded:  Ir
Events shown:     Ir
Event sort order: Ir
Thresholds:       99
Include dirs:
User annotated:
Auto-annotation:  off

--------------------------------------------------------------------------------
            Ir
--------------------------------------------------------------------------------
56,987,676,075  PROGRAM TOTALS

--------------------------------------------------------------------------------
            Ir  file:function
--------------------------------------------------------------------------------
31,901,245,144  ???:strcmp'2 [/lib/libc.so.7]
 7,011,268,017  ???:0x0000000000402ac0 [/usr/local/bin/rrdcached]
 2,860,147,538  ???:memchr [/lib/libc.so.7]
 1,536,256,939  ???:strtol [/lib/libc.so.7]
 1,347,413,568  ???:strcmp [/lib/libc.so.7]
 1,189,752,119  ???:0x000000000005e6c0 [/usr/local/lib/libglib-2.0.so.0]
   916,274,439  ???:0x00000000004054b0 [/usr/local/bin/rrdcached]
   838,626,751  ???:0x00000000000d6570 [/lib/libc.so.7]
   837,387,282  ???:strcasecmp [/lib/libc.so.7]
   750,265,924  ???:0x0000000000012d90 [/usr/local/lib/librrd_th.so.5]
   650,488,788  ???:0x00000000004044d0 [/usr/local/bin/rrdcached]
   617,364,221  ???:malloc [/lib/libc.so.7]
   604,145,186  ???:pthread_mutex_unlock [/lib/libthr.so.3]
   592,636,144  ???:strlen [/lib/libc.so.7]
   477,901,748  ???:write [/lib/libthr.so.3]
   475,805,061  ???:strncpy [/lib/libc.so.7]
   457,089,329  ???:pthread_mutex_lock [/lib/libthr.so.3]
   423,052,821  ???:0x00000000000e1c00 [/lib/libc.so.7]
   343,900,593  ???:memcpy [/lib/libc.so.7]
   287,810,433  ???:fgets [/lib/libc.so.7]
   266,462,145  ???:0x0000000000404a10 [/usr/local/bin/rrdcached]
   209,580,024  ???:0x0000000000010ad0 [/lib/libthr.so.3]
   208,957,160  ???:rrd_add_strdup_chunk [/usr/local/lib/librrd_th.so.5]
   182,150,140  ???:0x0000000000402cb0 [/usr/local/bin/rrdcached]
   169,961,388  ???:rrd_add_ptr_chunk [/usr/local/lib/librrd_th.so.5]
   152,881,757  ???:__error [/lib/libthr.so.3]
   152,119,140  ???:rrd_write [/usr/local/lib/librrd_th.so.5]
   125,430,431  ???:strdup [/lib/libc.so.7]
   120,542,093  ???:0x00000000004032d0 [/usr/local/bin/rrdcached]
    99,270,079  ???:0x0000000000403730 [/usr/local/bin/rrdcached]
    94,045,338  ???:0x0000000000403160 [/usr/local/bin/rrdcached]
    91,676,351  ???:memset [/lib/libc.so.7]
    87,324,745  ???:0x00000000000105c0 [/lib/libthr.so.3]
    81,093,490  ???:g_tree_replace [/usr/local/lib/libglib-2.0.so.0]
    67,444,844  ???:0x0000000000405d50 [/usr/local/bin/rrdcached]
    59,562,102  ???:g_tree_lookup [/usr/local/lib/libglib-2.0.so.0]
    50,205,190  ???:0x00000000000d6540 [/lib/libc.so.7]
    37,053,112  ???:pthread_mutex_trylock [/lib/libthr.so.3]
    34,912,560  ???:vsnprintf [/lib/libc.so.7]
    34,700,436  ???:__tls_get_addr [/libexec/ld-elf.so.1]
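
To put the numbers above in proportion, here is a minimal sketch that turns
a few rows of the annotate output into percentages of the program total.
The row excerpt and the total are copied from the listing; the helper names
(parse_profile, shares) are made up for illustration and are not part of
callgrind or rrdtool.

```python
import re

# Top rows copied from the callgrind_annotate listing above.
PROFILE = """\
31,901,245,144  ???:strcmp'2 [/lib/libc.so.7]
 7,011,268,017  ???:0x0000000000402ac0 [/usr/local/bin/rrdcached]
 2,860,147,538  ???:memchr [/lib/libc.so.7]
 1,536,256,939  ???:strtol [/lib/libc.so.7]
 1,347,413,568  ???:strcmp [/lib/libc.so.7]
"""
TOTAL_IR = 56_987_676_075  # PROGRAM TOTALS line

# "<Ir count>  <file>:<function> [<object>]"
LINE_RE = re.compile(r"^\s*([\d,]+)\s+\S*?:(\S+)\s+\[(.+)\]$")


def parse_profile(text):
    """Return a list of (instruction_count, function, object) tuples."""
    rows = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            count = int(m.group(1).replace(",", ""))
            rows.append((count, m.group(2), m.group(3)))
    return rows


def shares(rows, total):
    """Return (function, percent of total Ir) pairs in input order."""
    return [(func, 100.0 * ir / total) for ir, func, _ in rows]


if __name__ == "__main__":
    for func, pct in shares(parse_profile(PROFILE), TOTAL_IR):
        print(f"{pct:5.1f}%  {func}")
```

Read this way, the recursive strcmp entry alone accounts for a bit more than
half of all instructions executed.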


On Wed, Jan 13, 2010 at 10:35 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
> Hi Mirek,
>
> Today Mirek Lauš wrote:
>
>> Tobi,
>>
>> what do you recommend for profiling on FreeBSD? I'm not very familiar with that.
>
> Unfortunately I am not familiar with the appropriate tools on FreeBSD. I
> have been pretty successful using callgrind on Linux, but I don't know
> how well the FreeBSD version works ...
> http://www.freebsd.org/cgi/ports.cgi?query=valgrind
>
> cheers
> tobi
>>
>> Kind regards,
>> Mirek
>>
>> On Wed, Jan 13, 2010 at 9:48 AM, Tobias Oetiker <tobi at oetiker.ch> wrote:
>> > Hi Mirek,
>> >
>> > Today Mirek Lauš wrote:
>> >
>> >> Hello list,
>> >>
>> >> we've probably reached rrdcached's limits in our monitoring system.
>> >>
>> >> We had a very nicely running rrdcached while collecting from about 400 hosts,
>> >> about 100k nodes (RRD files).
>> >>
>> >> We've bumped the number of hosts to about 2000, collecting interface
>> >> traffic, errors, and unicast and multicast packets with a collector of
>> >> our own. It batches the RRD updates using rrdcached's BATCH command
>> >> over a unix socket. This collector is able to walk all the hosts in
>> >> less than 5 minutes. The number of nodes is about 200k.
>> >>
>> >> rrdcached is configured with -w 3600 -z 3600 -f 7200 -t 8. Everything runs
>> >> smoothly until the first timeout. Then the Queue value rises to the
>> >> number of nodes and stays that high. The write rate is very low and disk
>> >> I/O is almost zero, while rrdcached's CPU load gets very high (100-200%).
>> >>
>> >> The system is FreeBSD 7.2-p4, amd64 with 16GB RAM, RAID10 disk array.
>> >> rrdtool 1.4.2.
>> >>
>> >> Could it be we've reached rrdcached's limits? What can be done about it?
>> >
>> > I am not running a huge rrdcached setup myself, but what I gather
>> > from other posts is that this is NOT the limit; there must be other
>> > issues at play. Can you do a profiling run to identify the hotspot
>> > in rrdcached? The limit is reached when the system becomes disk-I/O
>> > bound; if it becomes CPU bound, then there is a bug somewhere.
>> >
>> > cheers
>> > tobi
>> >
>> > ps. I am moving this to rrd-developers
>> >
>> > --
>> > Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
>> > http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900
>>
>>
>
> --
> Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
> http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900
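
For readers unfamiliar with the BATCH mode mentioned in the quoted message,
here is a rough sketch of what a collector might send over the socket: a
BATCH line, one UPDATE line per RRD file, and a terminating dot on its own
line. The helper name build_batch, the file paths and the values are
invented for illustration; the poster's actual collector is not shown in
this thread, and a real client would also read the daemon's replies.

```python
def build_batch(updates):
    """Build the text of one rrdcached BATCH request.

    updates: iterable of (rrd_path, timestamp, values) tuples, where
    values is a sequence with one entry per data source.
    """
    lines = ["BATCH"]
    for path, timestamp, values in updates:
        joined = ":".join(str(v) for v in values)
        lines.append(f"UPDATE {path} {timestamp}:{joined}")
    lines.append(".")  # a single dot ends the batch
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical file names and counter values.
    payload = build_batch([
        ("/var/rrd/host1/if0.rrd", 1263377324, (1200, 3400)),
        ("/var/rrd/host1/if1.rrd", 1263377324, (10, 0)),
    ])
    print(payload, end="")
    # To actually send it, one would connect a socket.AF_UNIX stream
    # socket to the daemon's socket path and write payload.encode().
```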


