[rrd-developers] Re: How to get the most performance when using lots of RRD files
Richard A Steenbergen
ras at e-gerbil.net
Wed Aug 16 13:19:01 MEST 2006
On Wed, Aug 16, 2006 at 08:10:09AM +0200, Henrik Stoerner wrote:
> I am using a network/systems monitoring tool - Hobbit - which uses
> lots of RRD files for tracking all sorts of data. This works really
> well - kudos to Tobi.
>
> However, my main system for this currently has about 20,000 RRD files,
> all of which are updated every 5 minutes. So that's about 70 updates
> per second, and I can see that the amount of disk I/O happening on
> this server will become a performance problem soon, as more systems are
> added and hence more RRD files need updating.
I've been in a similar situation myself, doing 20-30 sec updates on 50k+
RRD files. The bottom line is that rrdtool is just not designed to do
that, and it will go kicking and screaming into the night when you try to
make it. The "typical user" is calling the rrdtool binary from a perl
script, graphing a few dozen or at worst a few hundred items, and doesn't
have a care in the world about the internal architecture.
The situation I was trying to solve involved a constant stream of high
resolution data across a large set of records, and relatively infrequent
viewing of that data. It sounds like you're trying to do something
similar. Honestly, if all you care about is the databasing, it would
probably be easier to ditch RRD and use something else, or write your own,
more efficient db; but at the end of the day (for me anyways :P) rrdtool
does the best job of producing pretty pictures that don't look like they
came off of gnuplot or my EKG, and I'm in no mood to become a graphics
person and reinvent the wheel.
So, probably your biggest issue is indeed thrashing the hell out of the
disk if you just naively fire off a pile of forks and hope it all works
out for the best. In my application I implemented a data write queue and a
single thread per disk for dispatching rrd updates, which helps quite a
bit. How easy this is really depends on your polling application, though.
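
For what it's worth, the skeleton of that queue looks something like this
(a rough, untested sketch: the job struct and names are mine, and the
printf is just standing in for whatever update call your poller actually
makes; build with -pthread):

#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

struct job {
    char file[256];
    char args[128];             /* e.g. "N:1234:5678" */
    struct job *next;
};

struct disk_queue {
    pthread_mutex_t lock;
    pthread_cond_t  ready;
    struct job *head, *tail;
};

/* called by the poller threads; just appends to the per-disk FIFO */
static void enqueue(struct disk_queue *q, const char *file, const char *args)
{
    struct job *j = calloc(1, sizeof(*j));

    snprintf(j->file, sizeof(j->file), "%s", file);
    snprintf(j->args, sizeof(j->args), "%s", args);
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = j;
    else
        q->head = j;
    q->tail = j;
    pthread_cond_signal(&q->ready);
    pthread_mutex_unlock(&q->lock);
}

/* one of these threads per spindle, so writes to a disk are serialized */
static void *dispatch(void *arg)
{
    struct disk_queue *q = arg;
    struct job *j;

    for (;;) {
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL)
            pthread_cond_wait(&q->ready, &q->lock);
        j = q->head;
        q->head = j->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock);

        /* rrd_update() (or an exec of the binary) would go here */
        printf("update %s %s\n", j->file, j->args);
        free(j);
    }
    return NULL;
}

int main(void)
{
    struct disk_queue q = { PTHREAD_MUTEX_INITIALIZER,
                            PTHREAD_COND_INITIALIZER, NULL, NULL };
    pthread_t t;

    pthread_create(&t, NULL, dispatch, &q);
    enqueue(&q, "router1.rrd", "N:1234:5678");
    pthread_join(t, NULL);      /* never returns in this sketch */
    return 0;
}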
Obviously, forking a shell to run the rrdtool binary for every update
scales to about nothing, and the API (if you can even call it that, I
don't think (argc, argv) counts :P) to the rrdtool functions in C really
and truly bites. If your application is in C and you can link directly to
librrd, that's a quick and dirty fix for at least some of the evils. What
really should happen is for that entire section of code to be gutted with
a vengeance: split the text parsing code out of it, send it in the
direction of the CLI frontend, and develop an actual API for passing in
data in a sensible format for other users who want to link to a C lib.
This really isn't that difficult to do either.
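
For reference, that "API" really is just the command line argument vector,
so linking against librrd ends up looking something like this (minimal
sketch, the filename and value are made up; build with -lrrd):

#include <stdio.h>
#include <rrd.h>

int main(void)
{
    /* same argument vector you would have handed to "rrdtool update" */
    char *argv[] = { "update", "test.rrd", "N:42", NULL };
    int argc = 3;

    rrd_clear_error();
    if (rrd_update(argc, argv) != 0) {
        fprintf(stderr, "rrd_update failed: %s\n", rrd_get_error());
        return 1;
    }
    return 0;
}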
The big daddy of performance suck is then going to be opening, closing,
and seeking to the right spot in the files every time. Again, perfectly
straightforward for very light scripty use, but using .rrd files as an
indexing method for large datasets scales horribly. One thing you could do
if you really wanted to scale this db format (since the updates are
relatively simple compared to the graphing) is to write your own code to
keep open handles on the files and do your own direct db access. This
would be fairly effective up to a point; obviously there is a limit to the
number of files your OS will let you keep open, but by the time you reach
it you've probably crossed the threshold where looking at a different
solution to replace rrd completely is worth your time again. Of course,
also make sure that your polling app isn't completely braindead, because
you can do plenty of intelligent aggregation of datasources inside a
single .rrd file.
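
For example, something like this packs four related counters into one
file, so a single open/seek/write per poll covers all of them (the DS
names and RRA layout are just made-up examples):

#include <stdio.h>
#include <rrd.h>

int main(void)
{
    /* four related counters in one file instead of four tiny files */
    char *argv[] = {
        "create", "router1_if1.rrd", "--step", "300",
        "DS:in_octets:COUNTER:600:0:U",
        "DS:out_octets:COUNTER:600:0:U",
        "DS:in_errors:COUNTER:600:0:U",
        "DS:out_errors:COUNTER:600:0:U",
        "RRA:AVERAGE:0.5:1:8640",     /* ~30 days of 5 minute samples */
        "RRA:AVERAGE:0.5:12:8760",    /* ~1 year of hourly averages */
        NULL
    };
    int argc = sizeof(argv) / sizeof(argv[0]) - 1;

    rrd_clear_error();
    if (rrd_create(argc, argv) != 0) {
        fprintf(stderr, "rrd_create failed: %s\n", rrd_get_error());
        return 1;
    }
    return 0;
}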
One option I explored for doing 10 sec updates was to keep my .rrd files
in a ram disk, and periodically sync to disk at intervals where you want
to save long term data (say for example 5 minutes, so you only lose 5 mins
of data in the event of a failure). Of course the problem I ran into is
that in addition to doing very high resolution short term data collection
(it makes for really nice graphs of realtime data, honest :P), I'm storing
a fair amount of long term data too. This means that it is perfectly
reasonable for a .rrd file to be large (say 500KB-1MB), but for only a few
KB of the data per file to actually be touched on any given update
interval. What you'd really be looking for out of a ram disk there is
file/disk-backed storage and a really slow periodic flush of dirty blocks
to disk, which is again probably more work than you should put into a hack
around rrdtool. Of course if you can afford the ram in the first place to
make all your data fit, you can just dd a raw image at the block level and
get much less disk thrashing than accessing tens of thousands of small
files.
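
If you did want to try the ram disk route, the periodic flush itself can
be as dumb as this (sketch only: the paths and interval are made up, and
it just shells out to rsync):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define SYNC_INTERVAL (5 * 60)    /* lose at most ~5 minutes on a crash */

int main(void)
{
    for (;;) {
        sleep(SYNC_INTERVAL);
        /* copy the tmpfs working set back to real disk */
        if (system("rsync -a /mnt/rrd-ram/ /var/lib/rrd/") != 0)
            fprintf(stderr, "sync pass failed\n");
    }
    return 0;    /* not reached */
}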
Or hell, you could always just throw more spindles at it, or a few more
$500 Linux PCs. What do I care. :)
--
Richard A Steenbergen <ras at e-gerbil.net> http://www.e-gerbil.net/ras
GPG Key ID: 0xF8B12CBC (7535 7F59 8204 ED1F CC1C 53AF 4C41 5ECA F8B1 2CBC)