[rrd-users] Scaling rrd tables for best performance
Eduardo M. Bragatto
eduardo at bragatto.com
Thu Dec 6 16:52:30 CET 2007
Alex van den Bogaerdt wrote:
> when you are going to do your benchmarking, please consider to keep
> an rrdtool process running in 'remote control' mode: "rrdtool -"
> Perhaps it makes a difference.
Wow, I missed that. It seems it will improve performance greatly.
I noticed the remote control feature can be used over a TCP stream to
create a server, which sounds like an excellent idea, but how would it
behave if I had two different clients trying to access two different
working directories at the same time?
It doesn't seem that rrdtool would be aware of multiple clients;
instead, it would expect all commands in sequential order, as if they
came from a single client.
I will have parallel processes polling SNMP data, and it would be great
to have a centralized service where all processes could connect and feed
the data. But if rrdtool can't handle multiple clients that way, I could
just run one instance for each of the pollers (so the number of servers
would be exactly the same as the number of clients).
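As a rough sketch of the one-instance-per-poller idea, each poller could
keep its own "rrdtool -" child process open and feed it commands over a
pipe. The names RrdPipe, read_reply and update_command are hypothetical
helpers, and router1.rrd is only an illustrative file name. The sketch
assumes that in remote control mode rrdtool ends each reply with a line
starting with "OK" on success or "ERROR" on failure, written to stdout.

```python
import subprocess


def update_command(rrd_path, timestamp, values):
    """Build one 'update' command line, e.g. 'update f.rrd 1196956350:10:20'."""
    return "update {} {}:{}".format(
        rrd_path, timestamp, ":".join(str(v) for v in values)
    )


def read_reply(lines):
    """Collect output lines until the status line that ends a reply.

    Assumption: the reply ends with a line starting with 'OK' or 'ERROR'.
    """
    output = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("OK"):
            return output
        if line.startswith("ERROR"):
            raise RuntimeError(line)
        output.append(line)
    raise RuntimeError("rrdtool closed the pipe before sending a status line")


class RrdPipe:
    """One long-lived 'rrdtool -' child process, owned by a single poller."""

    def __init__(self, argv=("rrdtool", "-")):
        self.proc = subprocess.Popen(
            list(argv),
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            text=True,
            bufsize=1,  # line buffered, so each command is sent promptly
        )

    def command(self, line):
        """Send one command and wait for its reply (strictly sequential)."""
        self.proc.stdin.write(line + "\n")
        self.proc.stdin.flush()
        return read_reply(self.proc.stdout)

    def close(self):
        """Closing stdin lets the child see end-of-input and exit."""
        self.proc.stdin.close()
        self.proc.stdout.close()
        self.proc.wait()


# Usage inside one poller (requires rrdtool to be installed):
#   pipe = RrdPipe()
#   pipe.command(update_command("router1.rrd", 1196956350, [10, 20]))
#   pipe.close()
```

Because each poller owns its own pipe, commands from different pollers
never interleave on the same stream, which sidesteps the multiple-client
question at the cost of one rrdtool process per poller.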
> Also know that some significant changes were made with respect to
> caching. You really ought to keep an eye on development if memory
> consumption and caching is important to you.
Memory consumption is not a concern as a bottleneck: nowadays it's
relatively cheap to have a server with a few gigabytes of good RAM. I'd
prefer as much caching as possible to speed things up, as long as I can
control how often data is synced to disk, in order to keep data loss
minimal in case of failure.
So, the question remains: how does rrdtool handle memory and disk I/O
when running in remote control mode? Would it write to the disk after
every update, or would it cache updates and flush them to the disk after
a certain period of time (or after accumulating a certain amount of
data)? Would it keep the DSes in memory so graphing would be faster (no
disk reading)?
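To make the "flush after a certain period or amount of data" idea
concrete, here is a minimal sketch of doing that buffering on the poller
side. It says nothing about what rrdtool does internally. UpdateBuffer
and its max_pending and max_age parameters are hypothetical names. The
sketch relies only on "rrdtool update" accepting several
timestamp:value arguments in one command.

```python
import time


class UpdateBuffer:
    """Accumulate samples and emit batched 'update' commands.

    A flush is triggered when max_pending samples are waiting or when the
    oldest waiting sample is at least max_age seconds old. Anything still
    buffered is lost if the poller dies, so these two limits bound the loss.
    """

    def __init__(self, max_pending=100, max_age=60.0, clock=time.monotonic):
        self.max_pending = max_pending
        self.max_age = max_age
        self.clock = clock
        self.pending = {}   # rrd_path -> list of "timestamp:v1:v2" strings
        self.count = 0      # total samples waiting across all files
        self.oldest = None  # clock reading when the first sample arrived

    def add(self, rrd_path, timestamp, values):
        """Queue one sample; return the commands to run if a flush is due."""
        sample = "{}:{}".format(timestamp, ":".join(str(v) for v in values))
        self.pending.setdefault(rrd_path, []).append(sample)
        self.count += 1
        if self.oldest is None:
            self.oldest = self.clock()
        too_many = self.count >= self.max_pending
        too_old = self.clock() - self.oldest >= self.max_age
        return self.flush() if (too_many or too_old) else []

    def flush(self):
        """Return one 'update' command per file and empty the buffer."""
        commands = [
            "update {} {}".format(path, " ".join(samples))
            for path, samples in self.pending.items()
        ]
        self.pending = {}
        self.count = 0
        self.oldest = None
        return commands
```

Note that the age check only runs when a new sample arrives, so a real
poller would also need to call flush() from a timer and on shutdown to
honour the time limit.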
I just want to understand how the current stable version works (I don't
care for 1.3 at the moment) so I can scale my setup as best as possible.
For now I'm sure the hardware I have will be enough even for a bad
implementation, but I'm concerned about the future. I don't wanna see
myself a year from now having to change everything because it was poorly
planned.
I understand that some of the new features in the 1.3 branch would make
things faster, but I want to work only with what's stable now, even if
it means having less caching than I'd like, for example.
Regards,
Eduardo M. Bragatto.