[rrd-users] to many "data agregation" transactions at the same time

Alex van den Bogaerdt alex at vandenbogaerdt.nl
Thu Dec 15 21:42:43 CET 2011


> However, as this runs on crontab, we have now ended up with a situation
> where a lot of .rrd files (thousands) were created at the same time of
> the day. The result is that the "data aggregation" process of all these
> .rrd files starts at roughly the same time, and this causes disk IO
> congestion.

Others have already explained this is not the cause of the problem.

Each RRA needs to be updated whenever the time equals "n" (an integer) times the duration of that RRA's interval. The creation time of the file does not matter.
So if you have RRAs which store 30 seconds, 1 minute, 5 minutes, 15 minutes, 1 hour, 2 hours, 6 hours and 1 day, then all eight will need to be updated at midnight UTC.
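
To illustrate, here is a small Python sketch that lists which of those RRA intervals end on a given Unix timestamp. The interval list is the one from the example above; the function name is made up for illustration, and the sketch assumes every interval boundary is a whole multiple of the interval length counted from the Unix epoch (UTC).

```python
# RRA interval lengths in seconds, taken from the example above:
# 30 s, 1 min, 5 min, 15 min, 1 h, 2 h, 6 h, 1 day.
RRA_INTERVALS = [30, 60, 300, 900, 3600, 7200, 21600, 86400]


def rras_due(timestamp, intervals=RRA_INTERVALS):
    """Return the intervals whose boundary falls exactly on `timestamp`.

    Hypothetical helper: assumes boundaries are multiples of the
    interval length counted from the Unix epoch.
    """
    return [i for i in intervals if timestamp % i == 0]


midnight = 1323993600  # 2011-12-16 00:00:00 UTC

print(rras_due(midnight))        # every interval ends at midnight UTC
print(rras_due(midnight + 300))  # 00:05:00 UTC: only 30 s, 1 min, 5 min
```

At midnight UTC all eight intervals end at once, no matter when the file was created.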

> What would be the best way to deal with this?

Collect your input, save each value together with its timestamp, and then gradually update your RRDs with that data. This will spread out the peaks in your IO. The only drawback is that the consolidated data for a particular RRD is not available until the update for that RRD has been done.

Instead of updating all your thousands of RRDs as quickly as possible, you pause a few seconds in between them. Because you update with explicit timestamps (e.g. rrdtool update file.rrd 1323993600:12345 instead of rrdtool update file.rrd N:12345) you don't lose any accuracy.
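
A minimal Python sketch of this "collect first, update later" idea follows. It is only an illustration: the function names, the tuple layout of the spool and the default pause are assumptions, not part of RRDtool. The only RRDtool feature it relies on is the update syntax shown above, with an explicit timestamp instead of N.

```python
import subprocess
import time


def build_update_command(rrd_path, timestamp, value):
    """Build the argv for `rrdtool update` with an explicit timestamp."""
    return ["rrdtool", "update", rrd_path, f"{int(timestamp)}:{value}"]


def replay_spool(spool, pause_seconds=2.0,
                 run=subprocess.check_call, sleep=time.sleep):
    """Apply spooled samples one at a time, pausing between updates.

    `spool` is an iterable of (rrd_path, timestamp, value) tuples that
    were recorded at collection time (hypothetical layout).
    `run` and `sleep` can be replaced, e.g. for a dry run.
    Returns the number of updates issued.
    """
    count = 0
    for rrd_path, timestamp, value in spool:
        if count:
            # Pause between files so the disk IO is spread out.
            sleep(pause_seconds)
        run(build_update_command(rrd_path, timestamp, value))
        count += 1
    return count
```

For a dry run without touching any RRD, call it as replay_spool(samples, run=print, sleep=lambda s: None); it then only prints the commands it would execute.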

HTH
Alex


