[rrd-developers] Re: Improving RRD tool scalability

Mon Mar 3 21:15:46 MET 2003

> - RRDtool uses buffered stdio functions. However, there is
> absolutely no need for the buffering since the I/O is random.
> Also, Solaris does not support more than 255 open files using
> stdio functions.

There is a call to setvbuf to disable stdio buffering. I notice that in
the development branch it is commented out (perhaps in the 1.0.x branch
as well). I can only guess the reason is that there was some difficulty
with particular platforms. Does anyone have insight here?
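
For reference, a minimal sketch of what disabling stdio buffering with
setvbuf looks like in standard C. The helper name open_unbuffered is
hypothetical and this is not the code from the RRDtool source; it only
illustrates the call under discussion.

```c
#include <stdio.h>

/* Hypothetical helper (not from the RRDtool source): open a file and
 * turn off stdio buffering for it. */
FILE *open_unbuffered(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return NULL;

    /* setvbuf must be called before any other operation on the stream.
     * With _IONBF the buffer and size arguments are ignored. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

With buffering off, each read or write on the stream is passed to the
OS more or less directly, which is the behaviour one would want for
random I/O. Note that this does not lift any per-process limit on the
number of stdio streams.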

> The bottom line is that RRDtool produces a lot of random I/O
> and the collection time is bound by the disk's average seek time
> multiplied by the number of interfaces. Our modification reduced
> the number of seeks severalfold but it did not overcome the
> fundamental problem.
> 
> In my opinion, further advances in speed will require
> modification of the RRD data structure.

For the most part, RRDTool acts atomically on individual files. Unlike
database software, it doesn't manage its own cache for fast access to
metadata or recently used data. In theory, it relies on the OS to manage
those things. As any DBA will tell you, I/O is always the bottleneck.
Given 100 GB of data, some tuning would likely be required for
satisfactory performance even if your solution utilized database
software.
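
To put the seek-bound estimate quoted above in concrete terms, here is
a rough sketch of the arithmetic. The function name and all input
values are assumptions chosen for illustration, not measurements.

```c
/* Rough estimate of a seek-bound collection pass, in seconds.
 * All three parameters are assumptions supplied by the caller:
 *   avg_seek_ms      - average disk seek time in milliseconds
 *   n_files          - number of RRD files (one per interface)
 *   seeks_per_update - seeks needed to update one file */
static double example_collection_seconds(double avg_seek_ms,
                                         unsigned long n_files,
                                         double seeks_per_update)
{
    return avg_seek_ms * (double)n_files * seeks_per_update / 1000.0;
}
```

For instance, with an assumed 10 ms average seek, 10,000 files and four
seeks per update, this gives 400 seconds; reducing that to one seek per
update gives 100 seconds. The pass gets shorter, but it stays
proportional to the number of files.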

I am in favor of approaches to making RRDTool scale better. One project
I have proposed before is to rewrite the code to use accessor functions
to retrieve data from the header. Not only would this allow flexibility
in the header structure, but the header could also be managed
separately.
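
As a sketch of what I mean by accessor functions: the struct layout and
every name below are hypothetical, made up for illustration, and do not
reflect the real RRDtool header.

```c
/* Hypothetical header layout, for illustration only; this is not the
 * real RRDtool on-disk header. */
typedef struct {
    unsigned long ds_cnt;   /* number of data sources */
    unsigned long rra_cnt;  /* number of round-robin archives */
    unsigned long pdp_step; /* seconds between primary data points */
} example_header_t;

/* Callers hold this handle and never touch the layout directly, so
 * the header could later come from a cache or a separate file. */
typedef struct {
    const example_header_t *hdr;
} example_rrd_t;

/* Accessors: the only place that knows where each field lives. */
static unsigned long example_ds_count(const example_rrd_t *rrd)
{
    return rrd->hdr->ds_cnt;
}

static unsigned long example_rra_count(const example_rrd_t *rrd)
{
    return rrd->hdr->rra_cnt;
}

static unsigned long example_step(const example_rrd_t *rrd)
{
    return rrd->hdr->pdp_step;
}
```

If the rest of the code only went through calls like
example_ds_count(), the header layout or its storage could change
without touching the callers.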

Jake Brutlag
Network Analyst
TV Services -- Network Operations
Microsoft MSN 
