[rrd-users] Effects of RRAs on CPU, disk I/O, and memory resources

Fri Sep 24 20:44:03 CEST 2010

My organization wants to record more detailed information for longer 
periods of time without consuming excessive resources on the server 
performing RRD operations and storing RRD files.  I know RRD files 
maintain the same size on disk after creation.  However, I'm unsure how 
the number of RRAs and the RRA step/row values affect CPU, disk I/O, and 
memory resources, particularly during update operations and periodic 
internal consolidations.  Further complicating the situation, each 
graphing application manages RRD files differently.  For example, Zenoss 
creates one RRD file with a single DS and associated RRAs for each OID 
retrieved from a device.  Retrieving just a few OIDs from a large number 
of devices results in Zenoss operating on many small RRD files per 
update cycle.  I haven't found a way to configure it otherwise.  On the 
other hand, Cacti creates one RRD file with a variable number of DSs 
based on templates.  I can configure these templates to take roughly the 
opposite approach from Zenoss, increasing the size and complexity of 
each RRD file but reducing the overall number of them.  I know a Zenoss 
vs. Cacti discussion is off-topic here, but the way each application 
manages RRD files may greatly affect the answers to my primary question.

I currently use Zenoss to retrieve about 11000 OIDs every 5 minutes. 
Each of the 11000 RRD files contains 4 RRAs providing data for about 2 
days at 5 minute intervals (essentially no consolidation), 2 weeks at 30 
minute intervals, 50 days at 2 hour intervals, and 600 days at 1 day 
intervals.
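
To make the current layout concrete, here is a rough sketch that converts the retention periods above into RRA step and row counts.  The AVERAGE consolidation function and the 0.5 xff value are assumptions for illustration only, and since the retention periods are approximate, the row counts are too.

```python
# Sketch: turn "retention at resolution" pairs into RRA step/row counts.
# Assumptions (not stated in the post): CF is AVERAGE and xff is 0.5.
BASE_STEP = 300  # seconds; the 5-minute polling interval

MINUTE, HOUR, DAY = 60, 3600, 86400

# (resolution in seconds, retention in seconds) for the four current RRAs
CURRENT_RRAS = [
    (5 * MINUTE, 2 * DAY),
    (30 * MINUTE, 14 * DAY),
    (2 * HOUR, 50 * DAY),
    (1 * DAY, 600 * DAY),
]


def rra_spec(resolution, retention, base_step=BASE_STEP, cf="AVERAGE", xff=0.5):
    """Return (steps, rows, definition string) for one RRA."""
    steps = resolution // base_step   # base intervals consolidated per row
    rows = retention // resolution    # rows needed to cover the retention
    return steps, rows, f"RRA:{cf}:{xff}:{steps}:{rows}"


specs = [rra_spec(res, ret) for res, ret in CURRENT_RRAS]
for steps, rows, text in specs:
    print(text)

total_rows = sum(rows for _, rows, _ in specs)
print("rows per file:", total_rows)
```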

What sort of effects on CPU, disk I/O, and memory resources can I 
anticipate during update and periodic internal consolidations if, for 
example, I change these RRAs to 1 year at 5 minute intervals 
(essentially no consolidation), 2 years at 1 hour intervals, and 3 years 
at 1 day intervals?  Can anyone provide a formula or general way to 
estimate resource consumption for various RRA configurations?  When 
confronting resource issues, which forms of RRD tuning (e.g., increasing 
RRA steps, reducing RRA rows, or even eliminating RRAs) have the greatest 
effect?  I'm also somewhat interested in determining how various RRA 
configurations might affect resources needed for graph operations.

Thanks,
Matt