[rrd-developers] Re: Improving RRD tool scalability
Jake Brutlag
jakeb at microsoft.com
Mon Mar 3 21:15:46 MET 2003
> - rrdtool using buffered stdio functions. However there is
> absolutely no need for the buffering since the io is random.
> Also solaris does not support more than 255 open files using
> stdio functions.
There is a call to setvbuf to disable stdio buffering. I notice that in the
development branch it is commented out (perhaps in the 1.0.x branch as
well). I can only guess that the reason was some difficulty
with particular platforms. Does anyone have insight here?
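
For reference, here is a minimal sketch of what disabling stdio buffering
with setvbuf looks like. The helper name open_unbuffered is hypothetical and
is not taken from the RRDtool source; only fopen, setvbuf and fclose are
standard C. It assumes the call is made right after fopen, before any other
operation on the stream, as the C standard requires.

```c
#include <stdio.h>

/* Hypothetical helper (not from the RRDtool source): open a file and
 * switch the stream to unbuffered mode. Returns NULL on any failure. */
FILE *open_unbuffered(const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp == NULL)
        return NULL;

    /* setvbuf must be the first operation on the stream after fopen.
     * _IONBF requests no buffering; the buffer and size are then ignored. */
    if (setvbuf(fp, NULL, _IONBF, 0) != 0) {
        fclose(fp);
        return NULL;
    }
    return fp;
}
```

Note that this only removes the stdio layer's buffer. It does not lift any
per-process limit on the number of open stdio streams.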
> The bottom line is that RRDtool produces a lot of random io
> and the collection time is bound by disk average seek time
> multiplied by number of interfaces. Our modification reduced
> number of seeks by several times but it did not overcome
> fundamental problem.
>
> In my opinion further advance in speed will require
> modification of RRD datastructure.
For the most part, RRDTool acts atomically on individual files. Unlike
database software, it doesn't manage its own cache for fast access to
metadata or recently used data. In theory, it relies on the OS to manage
those things. As any DBA will tell you, I/O is always the bottleneck.
Given 100 Gb of data, likely some tuning would be required for
satisfactory performance even if your solution utilized database
software.
I am in favor of approaches to making RRDTool scale better. One project
I have proposed before is to rewrite the code to use accessor functions
to retrieve data from the header. Not only would this allow flexibility
in the header structure, but the header could also be managed separately.
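
To make the accessor idea concrete, here is a rough sketch. All type and
function names (example_header_t, example_rrd_t, example_get_*) are
hypothetical and do not reflect RRDtool's actual header layout. The point is
only that callers never touch the header fields directly, so the layout, or
where the header lives, can change behind the accessors.

```c
#include <stddef.h>

/* Hypothetical header layout, for illustration only. */
typedef struct {
    unsigned long ds_count;   /* number of data sources */
    unsigned long rra_count;  /* number of archives */
    unsigned long step;       /* update interval in seconds */
} example_header_t;

/* Hypothetical handle. The header is reached through a pointer, so it
 * could live in the file mapping, in a shared cache, or elsewhere. */
typedef struct {
    const example_header_t *header;
} example_rrd_t;

/* Accessors: the only code that knows how the header is laid out. */
unsigned long example_get_ds_count(const example_rrd_t *rrd)
{
    return rrd->header->ds_count;
}

unsigned long example_get_rra_count(const example_rrd_t *rrd)
{
    return rrd->header->rra_count;
}

unsigned long example_get_step(const example_rrd_t *rrd)
{
    return rrd->header->step;
}
```

With this arrangement, two handles could share one cached header, and a
change to the header structure would only require updating the accessors.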
Jake Brutlag
Network Analyst
TV Services -- Network Operations
Microsoft MSN