[rrd-developers] Re: How to get the most performance when using lots of RRD files

Hans Jørgen Jakobsen hjj at wheel.dk
Wed Aug 16 21:29:54 MEST 2006



On Wed, 16 Aug 2006, Henrik Stoerner wrote:

> I am using a network/systems monitoring tool - Hobbit - which uses
> lots of RRD files for tracking all sorts of data. This works really
> well - kudos to Tobi.
>
> However, my main system for this currently has about 20.000 RRD files,
> all of which are updated every 5 minutes. So that's about 70 updates
> per second, and I can see that the amount of disk I/O happening on
> this server will become a performance problem soon, as more systems are
> added and hence more RRD files need updating.
>
> This is really not a problem with RRDtool itself, but more a question
> of how to run a system that uses RRDtool extensively. So I am
> considering a couple of  possibilities and would like some comments
> on them.
>

At the company where I work we have a system that grew out of
MRTG in 1997, from 1K interfaces then to more than 500K today.
The interfaces are sampled at 1, 3, 5, 10 or 20 minute intervals.
At the moment we are collecting 3.4M sets of data each hour.

There have been a number of capacity problems that had to
be solved.

We observed that one update of an RRD file took 300-400 ms, but a
second update to the same file took close to zero time.
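
A back-of-envelope sketch of why that observation matters. It assumes the
first update to a file costs about 350 ms (the middle of the observed range)
and that each further update to the same file costs a hypothetical 1 ms.
Both constants are illustrative assumptions, not measurements.

```python
def avg_cost_ms(batch_size, first_ms=350.0, next_ms=1.0):
    """Average cost per update when batch_size updates hit one RRD file in a row.

    first_ms and next_ms are assumed values for illustration only.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return (first_ms + (batch_size - 1) * next_ms) / batch_size


if __name__ == "__main__":
    # The average cost per update falls as more updates share the expensive first one.
    for n in (1, 2, 4, 12):
        print(n, avg_cost_ms(n))
```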

That led to a design where the poller process(es) spool
the results to files. Every 5 minutes the results are written out
to one file per router and put in a directory. There is a pool of
rrd-update processes. An rrd-update process takes the oldest
data file plus all other files for the same router and sorts
the input so that all updates to one interface are done at once.
This means efficiency grows with queue size.
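
The grouping step could be sketched roughly as below. This is not our actual
code: the spool line format ("<interface> <unix_ts> <in> <out>"), the
rrd_dir path and the one-RRD-per-interface file naming are all assumptions
made for the example. The only rrdtool feature relied on is that
"rrdtool update" accepts several timestamp:value arguments in one call.
The sketch only builds the command lines and does not run them.

```python
from collections import defaultdict


def parse_spool_line(line):
    """Parse one hypothetical spool line: '<interface> <unix_ts> <in> <out>'."""
    iface, ts, octets_in, octets_out = line.split()
    return iface, int(ts), octets_in, octets_out


def batch_updates(lines):
    """Group samples by interface and sort each group by timestamp."""
    by_iface = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the spool file
        iface, ts, octets_in, octets_out = parse_spool_line(line)
        by_iface[iface].append((ts, octets_in, octets_out))
    for samples in by_iface.values():
        samples.sort()  # rrdtool wants timestamps in increasing order
    return dict(by_iface)


def rrdtool_commands(batches, rrd_dir="/var/rrd"):
    """Build one 'rrdtool update' argv per interface, carrying all its samples."""
    cmds = []
    for iface in sorted(batches):
        args = [f"{ts}:{i}:{o}" for ts, i, o in batches[iface]]
        cmds.append(["rrdtool", "update", f"{rrd_dir}/{iface}.rrd", *args])
    return cmds


if __name__ == "__main__":
    # Lines as they might arrive from several spool files for one router.
    spooled = [
        "r1_if1 1155756300 100 200",
        "r1_if2 1155756300 5 6",
        "r1_if1 1155756000 90 180",
    ]
    for cmd in rrdtool_commands(batch_updates(spooled)):
        print(" ".join(cmd))
```

The longer the queue, the more samples each interface collects before its
file is touched, so more updates share the one expensive first access.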

Actually there are 3 separate queues depending on sample
interval. This is to ensure that the few fast-sampled interfaces
get updated faster than the many slow-sampled interfaces.
Typical update times are 10 min, 15 min and over 1 h.
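
A minimal sketch of the queue split, under stated assumptions. Which sample
intervals go to which queue is a guess made for illustration (the bounds in
QUEUE_BOUNDS are hypothetical), as are the queue names and the SpoolQueues
class. The only points taken from the description are that there are three
queues keyed on sample interval and that the oldest data file is taken first.

```python
import heapq

# Hypothetical mapping of sample interval (minutes) to queue name.
# Anything longer than the last bound goes to "slow".
QUEUE_BOUNDS = [(3, "fast"), (5, "medium")]


def queue_for_interval(minutes):
    """Pick the queue for a given sample interval in minutes."""
    for upper, name in QUEUE_BOUNDS:
        if minutes <= upper:
            return name
    return "slow"


class SpoolQueues:
    """Three independent oldest-first queues of spooled data files."""

    def __init__(self):
        self._heaps = {"fast": [], "medium": [], "slow": []}

    def add(self, interval_minutes, spooled_at, path):
        """Queue a spool file; spooled_at is its creation time (Unix seconds)."""
        name = queue_for_interval(interval_minutes)
        heapq.heappush(self._heaps[name], (spooled_at, path))

    def pop_oldest(self, queue_name):
        """Return the oldest file path in the named queue, or None if empty."""
        heap = self._heaps[queue_name]
        return heapq.heappop(heap)[1] if heap else None
```

Because each queue is drained on its own, a backlog of slow-sampled files
cannot delay the fast-sampled ones.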

The number of rrd-update processes is larger
than the number of disks, to be sure that all disks are
always busy. There are 5*6*2 disks (72G, 15K rpm).
To get the backup done in a reasonable time we have to
throttle updates for some hours, but that is recovered quickly.

/hjj
