[rrd-developers] rrdcached performance with >200k nodes

Wed Jan 13 17:21:57 CET 2010

Kevin,

On Wed, Jan 13, 2010 at 4:52 PM, kevin brintnall <kbrint at rufus.net> wrote:
>> >> Hello list,
>> >>
>> >> we've probably reached rrdcached limits in our monitoring system
>> >>
>> >> We had a very nicely running rrdcached while collecting from about 400 hosts,
>> >> about 100k nodes (RRD files).
>> >>
>> >> We've bumped the number of hosts to about 2000 for interface
>> >> traffic, errors, unicast and multicast packets, using a collector of
>> >> our own. It batches the RRD updates using rrdcached's BATCH via unix
>> >> socket. This collector is able to walk
>> >> all the hosts in less than 5 minutes. The number of nodes is about 200k.
>> >>
>> >> The rrdcached is configured with -w 3600 -z 3600 -f 7200 -t 8. Everything
>> >> runs smoothly until the first timeout. Then the Queue value rises to the
>> >> number of nodes and stays that high. The write rate is very low and disk
>> >> IO is almost zero.
>> >> CPU load caused by rrdcached gets very high (100-200%).
>> >>
>> >> The system is FreeBSD 7.2-p4, amd64 with 16GB RAM, RAID10 disk array.
>> >> rrdtool 1.4.2.
>> >>
>> >> Could it be we've reached rrdcached's limits? What can be done about it?
>
> Hi Mirek,
>
> I'm running a very similar setup to yours: FreeBSD 7/amd64, ~270k nodes, 5
> minute interval.  I am using '-w 21600 -z 21600 -f 86400', and my
> rrdcached is steady at ~1.5G RSS.
>
> Ideally you would cache at least one full page of writes per RRD file.
> So, your ideal "-w" timer would be at least:
>
>        (RRD step interval)*(page size)/(RRD row size).
>
> I'm guessing at least part of your problem is IO limitations.  As Florian
> said, this workload will see most of the disk's time used up seeking,
> rather than writing. (try watching "gstat").
>
> As for the CPU, it's possible we have some problem that only exhibits
> itself when there is a large queue.  However, I've never run into this.
> We'll have to narrow the problem down a little more.
>
> When it's exhibiting this high CPU problem, does it continue to write to
> the journal?  Are there an abnormal number of "FLUSH" or "WROTE" entries
> at that time?
>
> What do you mean by "until the first timeout"?
>
> P.S. I also use these sysctl values, FWIW, YMMV:
>
> vfs.ufs.dirhash_maxmem=16777216 # from 2097152
> vfs.hirunningspace=4194304      # from 1048576
>
> --
>  kevin brintnall =~ /kbrint at rufus.net/
>
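
To put numbers on the "-w" rule of thumb quoted above, here is a rough worked example. It is only a sketch under assumptions that are not from this thread: a 4096-byte page, a 300-second step, and a row size of 8 bytes (one double) per data source. The function name is made up for illustration.

```python
def ideal_write_timeout(step_seconds, page_size_bytes, row_size_bytes):
    """Seconds of updates needed to fill one page of rows for one RRD file.

    Implements (RRD step interval) * (page size) / (RRD row size).
    """
    return step_seconds * page_size_bytes / row_size_bytes


# Assumed values for illustration; check your own filesystem and RRD layout.
STEP_SECONDS = 300       # 5 minute interval
PAGE_SIZE_BYTES = 4096   # assumed page size
BYTES_PER_VALUE = 8      # assumed: one double per data source per row

if __name__ == "__main__":
    for ds_count in (1, 2, 4, 8):
        row_size = BYTES_PER_VALUE * ds_count
        timeout = ideal_write_timeout(STEP_SECONDS, PAGE_SIZE_BYTES, row_size)
        print(f"{ds_count} DS per file: -w of at least {timeout:.0f} s "
              f"({timeout / 3600:.1f} h)")
```

Under these assumptions, files with fewer data sources need a longer "-w" to fill a page, so the value would be chosen with the most common file layout in mind.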

There are about 224k nodes in the tree. After issuing FLUSHALL
it takes about 20 minutes (with -t 8) to write almost all of them.

We're now stuck at approx. 10k nodes in the queue. The journal
continues to be written, and updates are received at the normal rate.
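
For anyone who wants to watch the queue and tree sizes over time, the STATS command can be polled over the daemon's unix socket. The following is a minimal sketch, not the way these numbers were collected here; the socket path is a hypothetical placeholder and the helper names are made up.

```python
import socket


def parse_stats(response):
    """Parse a STATS reply such as '9 Statistics follow' plus 'Key: value' lines."""
    lines = response.strip().splitlines()
    # The status line starts with the number of lines that follow;
    # a negative number signals an error.
    count = int(lines[0].split(" ", 1)[0])
    if count < 0:
        raise RuntimeError(lines[0])
    stats = {}
    for line in lines[1:1 + count]:
        key, _, value = line.partition(":")
        stats[key.strip()] = int(value)
    return stats


def fetch_stats(socket_path):
    """Send STATS to rrdcached on a unix socket and return the parsed counters."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(b"STATS\n")
        with sock.makefile("r", encoding="ascii") as reader:
            status_line = reader.readline()
            count = int(status_line.split(" ", 1)[0])
            body = [reader.readline() for _ in range(max(count, 0))]
    return parse_stats(status_line + "".join(body))


if __name__ == "__main__":
    # Hypothetical path; use whatever was passed to rrdcached with -l.
    stats = fetch_stats("/var/run/rrdcached.sock")
    print(stats.get("QueueLength"), stats.get("TreeNodesNumber"))
```

Logging QueueLength and UpdatesWritten once a minute should show whether the queue is draining at all while the CPU is pegged.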

gstat:

# gstat -b
dT: 1.000s  w: 1.000s
 L(q)  ops/s    r/s   kBps   ms/r    w/s   kBps   ms/w   %busy Name
    0      0      0      0    0.0      0      0    0.0    0.0  acd0
    0      3      0      0    0.0      3     48    5.0    1.5  aacd0
    0      3      0      0    0.0      3     48    5.1    1.5  aacd0s1
    0      3      0      0    0.0      3     48    5.1    1.5  aacd0s1a
    0      0      0      0    0.0      0      0    0.0    0.0  aacd0s1b
    0      0      0      0    0.0      0      0    0.0    0.0  aacd0s1c

top:

last pid: 28506;  load averages:  9.93,  6.60,  4.37    up 50+19:00:19  17:20:43
271 processes: 9 running, 243 sleeping, 19 zombie
CPU 0:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 1:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 2:     % user,     % nice,     % system,     % interrupt,     % idle
CPU 3:     % user,     % nice,     % system,     % interrupt,     % idle
Mem: 1121M Active, 14G Inact, 805M Wired, 320M Cache, 399M Buf, 490M Free
Swap: 8192M Total, 100K Used, 8192M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME    CPU COMMAND
91510 portax     60  44    0   133M 90052K select 3   0:00 102.10% rrdcached

Regards,
Mirek