[rrd-users] Reading thousands of rrds
mike-rrd at bettyscout.org
Sun Jul 11 09:57:23 CEST 2010
On 7/10/10 9:09 AM, Shem Valentine wrote:
> Hello list,
> I have a few thousand rrd's that I need to run a report against. I'll need
> to sum the upload and download over a given time window. I'll be using
> Python to write this report.
> My biggest concern is the performance hit it may take to run this report.
> I was wondering if anyone has any suggestions as to how they would go about this.
> Right now I'm considering running the report as a cron job during off peak
> hours and storing the results in a format that would be less intensive to read.
> Any ideas/suggestions are appreciated,
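The summing job described above could be sketched roughly as below. This is only a sketch under assumptions: it shells out to the `rrdtool fetch` CLI (assumed to be on PATH), and the 300-second step and the idea of two rate data sources (upload/download octets) are illustrative, not taken from the poster's setup.

```python
import subprocess


def parse_fetch_output(text):
    """Parse `rrdtool fetch` CLI output into (timestamp, values) rows.

    The output shape is a header line of DS names, a blank line, then
    lines like "1278800000: 1.2e+03 4.5e+02"; rows containing NaN
    (unknown data) are skipped.
    """
    rows = []
    for line in text.splitlines():
        if ':' not in line:
            continue  # DS-name header or blank line
        ts, _, rest = line.partition(':')
        try:
            stamp = int(ts.strip())
        except ValueError:
            continue  # not a data line
        try:
            vals = [float(tok) for tok in rest.split()]
        except ValueError:
            continue
        if vals and not any(v != v for v in vals):  # v != v detects NaN
            rows.append((stamp, vals))
    return rows


def sum_bytes(rows, step):
    """Integrate per-second rates into totals: rate * step, summed per DS."""
    totals = [0.0] * (len(rows[0][1]) if rows else 0)
    for _, vals in rows:
        for i, v in enumerate(vals):
            totals[i] += v * step
    return totals


def report_one(rrd_path, start, end, step=300):
    """Fetch AVERAGE data for one RRD and return summed bytes per DS.

    Assumes the rrdtool binary is installed and on PATH.
    """
    out = subprocess.run(
        ['rrdtool', 'fetch', rrd_path, 'AVERAGE',
         '--start', str(start), '--end', str(end),
         '--resolution', str(step)],
        capture_output=True, text=True, check=True).stdout
    return sum_bytes(parse_fetch_output(out), step)
```

Running `report_one` over a few thousand files in a loop and writing the totals to a CSV or a small database would match the cron-job approach the poster is considering.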
I've got a Python script that runs a weekly report flagging overloaded
servers by pulling in all load, CPU, memory, swap, disk IO,
and network usage for ~1200 servers. Probably around 25k data sources.
Now, I run the script from a different machine, reading the RRDs over
NFS, so the only potential impact would be disk IO, and a little CPU
from serving NFS, but I haven't noticed any performance impact.
So... my advice is just do it. If it's a production-style environment,
maybe run it first with nice, or with a few time.sleep(0.01) calls in
the main loop. You might not see any impact worth worrying about, and
if you do, just work on figuring out how to make the impact acceptable.
If the impact is CPU, export the RRDs over NFS and run the script
elsewhere. If the impact is disk IO, consider faster disks or reading
the files more slowly.
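The throttling idea above (a small sleep between files so sustained disk IO stays gentle) can be sketched with nothing but the standard library. The function and parameter names here are illustrative, not an rrdtool API:

```python
import os
import time


def throttled_walk(root, per_file, pause=0.01):
    """Apply per_file to every .rrd file under root, pausing between
    files so the reporting job yields IO and CPU to production load.

    per_file is any callable taking a path (e.g. a function that fetches
    and sums one RRD); pause is the sleep between files, in seconds.
    Returns a dict mapping each path to per_file's result.
    """
    results = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith('.rrd'):
                continue
            path = os.path.join(dirpath, name)
            results[path] = per_file(path)
            time.sleep(pause)  # the "read the files slower" knob
    return results
```

Raising `pause` (or running the whole thing under `nice`) trades wall-clock time for a smaller IO footprint, which is usually an easy trade for a weekly or nightly cron job.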