[rrd-developers] Re: How to get the most performance when using lots of RRD files

Fri Aug 18 00:44:46 MEST 2006

On Thursday, August 17, Richard A Steenbergen wrote:
> On Wed, Aug 16, 2006 at 04:30:03PM +0200, Tobias Oetiker wrote:
> > 
> > a balanced performance (read AND write) without the need to
> > periodically clean up the mess, was the prime idea behind the
> > rrdtool datastructure. so if you know the holy grail on how to do
> > this 'better' I would be most interested ...
> 
> Understood, but the tradeoff is that while the current design is very 
> simple and effective for lightweight use, it is very inefficient for large 
> scale use.

Yo all,

I'll try to wade in a bit here.  Not sure if I have a solution of sorts,
but I'll make some suggestions.  Please correct me if I'm wrong, it's
been some time since I've looked at the RRD datamodel/layout.  The
layout as I remember it is basically a header with pointers to the
start/end of the round-robin data, which lives in a fixed buffer.

The problem I see is two-fold:

  - You are opening/closing each file, which is not that "fast" an operation
	when you're talking 50k files.
  - You are locking each file to make sure things are done in an "atomic"
	fashion of sorts.

The first of these can be mitigated (and has been it seems) by keeping
more RRDs inside one file.  Larger files, less opening/closing of files.
It also mitigates the second to a degree, if the locking is done once,
and all updates are done in a smart fashion.
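
A minimal sketch of the "lock once, update smart" idea, in Python for
brevity.  This is not rrdtool code: apply_batch() is a hypothetical
helper, the "slots" are plain 8-byte doubles at byte offsets I made up
rather than the real RRD layout, and fcntl.flock() assumes a Unix-like
system.  It only shows one open and one lock covering many updates,
written in ascending offset order.

```python
import fcntl
import struct

# Assumption: each data point is a single little-endian double.
SLOT = struct.Struct("<d")


def apply_batch(path, updates):
    """Apply many (byte_offset, value) updates under one open and one lock."""
    with open(path, "r+b") as fh:
        fcntl.flock(fh, fcntl.LOCK_EX)      # one lock for the whole batch
        try:
            # Ascending offsets keep the writes moving forward in the file.
            for offset, value in sorted(updates):
                fh.seek(offset)
                fh.write(SLOT.pack(value))
            fh.flush()
        finally:
            fcntl.flock(fh, fcntl.LOCK_UN)
```

With 50k data sources in a handful of files, that is a handful of
open/lock/close cycles per polling round instead of 50k.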

However, none of this stuff will stream the data.  The disk is basically
at the mercy of the operating system disk cache to give it data in a
fashion that it can write in a sequential manner.  Of course, once
you solve this portion of the puzzle (always/mostly writing in a
sequential/streaming manner), along comes someone else that wants to
generate something else (graphics) from the data, and causes your disks
to seek again (this time read/write contention).

In order to "solve" this problem, I would look at a couple of
possibilities.  I'd explore how much of the data could be kept in
memory, so that the output-generating (graphics/etc) process reads the
data from memory and does not have to go to disk and interrupt the
streaming of data to disk (fewer seek operations).  I'd also see about
implementing some sort of write-behind operation for the
logging/updating portion.  The tradeoff is that batching the updates to
disk means losing some portion of them if the machine crashes.  These
two may entail an RRD db server process, something that you can access
over the network/pipe/etc.  Once you have removed the client's direct
access to the disk, more powerful RRD data formats can be implemented.
(Some of these could be done with client stuff as well).
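
To make the write-behind idea a bit more concrete, here is a rough
sketch in Python.  Everything in it is an assumption for illustration:
the WriteBehindBuffer class, the flush_threshold parameter and the
one-line-per-sample ".log" files are hypothetical and have nothing to
do with the real RRD format.  Updates pile up in memory, readers can be
served from memory, and the disk only sees one batched write per series
per flush.

```python
import os
from collections import defaultdict


class WriteBehindBuffer:
    """Hypothetical in-memory buffer that batches updates before writing."""

    def __init__(self, directory, flush_threshold=100):
        self.directory = directory
        self.flush_threshold = flush_threshold
        self.pending = defaultdict(list)    # series -> [(timestamp, value)]
        self.pending_count = 0

    def update(self, series, timestamp, value):
        """Record an update in memory; touch the disk only when full."""
        self.pending[series].append((timestamp, value))
        self.pending_count += 1
        if self.pending_count >= self.flush_threshold:
            self.flush()

    def read_pending(self, series):
        """Serve not-yet-flushed samples from memory, without disk access."""
        return list(self.pending.get(series, []))

    def flush(self):
        """Write each series' pending samples with a single append."""
        for series, samples in self.pending.items():
            path = os.path.join(self.directory, series + ".log")
            with open(path, "a") as fh:
                fh.writelines(f"{ts} {val}\n" for ts, val in samples)
        self.pending.clear()
        self.pending_count = 0
```

Anything still sitting in pending when the machine crashes is gone,
which is exactly the tradeoff mentioned above; a smaller
flush_threshold loses less but seeks more.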

In some ways, I'd explore the use of a directory as an RRD database.  At
this point you may be able to remove the locking requirement by using
unique filenames and atomic filesystem operations (rename, link, etc)
to keep operations atomic, while allowing the filesystem to cache
things better overall.  The cleanup operation could well be
implemented in the rrd_open() function (or within the RRD db server
process).  I'd make it optional for people that are really pushing the
envelope, so that they could call rrd_cleanup() at a later date if they
wish/choose to do so.
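
Roughly what I have in mind for the directory approach, sketched in
Python.  The function names, the "<series>.<timestamp>.<pid>.upd"
naming scheme and the one-sample-per-file format are all assumptions I
made up for the example; cleanup() only stands in for what a
hypothetical rrd_cleanup() might do.  It also assumes integer
timestamps and at most one update per series per timestamp from a given
process.

```python
import os
import tempfile


def publish_update(db_dir, series, timestamp, value):
    """Write one update as its own file, made visible by an atomic rename."""
    # Write under a temporary name first so readers never see a partial file.
    fd, tmp_path = tempfile.mkstemp(dir=db_dir, prefix=".tmp-")
    with os.fdopen(fd, "w") as fh:
        fh.write(f"{timestamp} {value}\n")
    final_name = f"{series}.{timestamp}.{os.getpid()}.upd"
    final_path = os.path.join(db_dir, final_name)
    os.replace(tmp_path, final_path)    # atomic within one filesystem
    return final_path


def cleanup(db_dir, series):
    """Collect and remove the pending update files for one series."""
    samples = []
    for name in os.listdir(db_dir):
        parts = name.rsplit(".", 3)     # [series, timestamp, pid, "upd"]
        if len(parts) != 4 or parts[0] != series or parts[3] != "upd":
            continue
        path = os.path.join(db_dir, name)
        with open(path) as fh:
            ts, val = fh.read().split()
        samples.append((int(ts), float(val)))
        os.remove(path)
    return sorted(samples)
```

No writer ever locks anything here; the rename is the commit.  The cost
is a directory that fills up with small files until somebody runs the
cleanup, which is why I'd leave the timing of it up to the caller.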

Not sure if this makes any sense... lack of sleep has been somewhat
chronic of late.

--Toby.  (no, not that toby)
