[rrd-developers] interleaving RRD files, RRDfs perhaps?

Scott Brumbaugh scottb at prolexic.com
Fri Oct 3 00:17:25 CEST 2008


Hi Daniel,

I think the suggestion to put more data sources into each file will
help.  There is another technique that you might experiment with if
you are using Linux.  We find it helpful on our installation.

Try this tip on VM tuning from the rrdtool wiki:

 http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD

 dirty_expire_centisecs
  The pdflush process will start writing data to disk that has been
  dirty for more than the given amount of 1/100 of a second.

Set it to some multiple of your polling interval.  If you poll every
minute, set it to 60000 for a 10 minute dirty page expire time.  You
would need to experiment for your system.  This allows the rrd file
blocks being updated to stay in memory through 10 updates.  The
default setting of this value is 3000, or 30 sec.  You will probably
still see a lot of 4k writes, but now each write will be updating 10
rra cells.
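
To make the arithmetic concrete, here is a minimal Python sketch of how
the value could be derived and applied.  The function names are made up
for illustration; the /proc path is the usual Linux location for this
sysctl.  Writing it needs root and does not survive a reboot, so treat
this as a starting point rather than a recipe.

```python
from pathlib import Path

# Usual Linux location of the setting; the helper names are hypothetical.
DIRTY_EXPIRE = Path("/proc/sys/vm/dirty_expire_centisecs")


def expire_centisecs(poll_interval_s: int, updates_to_batch: int) -> int:
    """Expire time in centiseconds (1/100 s) spanning the given number of polls."""
    return poll_interval_s * updates_to_batch * 100


def read_setting(path: Path = DIRTY_EXPIRE) -> int:
    """Read the current value from the given file."""
    return int(path.read_text().strip())


def write_setting(value: int, path: Path = DIRTY_EXPIRE) -> None:
    """Write a new value (requires root for the real /proc file)."""
    path.write_text(f"{value}\n")


if __name__ == "__main__":
    # 1 minute polling, keep pages dirty across 10 updates.
    target = expire_centisecs(60, 10)
    print(f"target:  {target} centisecs")
    if DIRTY_EXPIRE.exists():
        print(f"current: {read_setting()} centisecs")
```

With 60 second polling and 10 updates this gives 60000; the default of
3000 corresponds to 30 seconds.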

As a side effect you will still have bursty io writes when the
pdflush threads start writing to disk after the 10 minutes.  To avoid
this we run an external process that walks the rrd filesystem tree
continually, fsyncing all the rrd files at a controlled rate.  We find
that timing the walk to complete 2 times in the 10 minutes seems to
work ok.  This smooths out the io load on the disks, so that we
aren't writing all the data at the same time but slowly over
an extended period.  Since the updates are in the filesystem cache, you
can still read them out efficiently.
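
A rough Python sketch of such a walker follows.  It is not the tool we
run, just an illustration of the idea.  The function names, the ".rrd"
suffix filter and the 300 second walk time (2 walks per 10 minutes) are
assumptions, and it relies on Linux accepting fsync on a descriptor
opened read-only.

```python
import os
import sys
import time


def find_rrd_files(root):
    """Collect every *.rrd path below root (suffix filter is an assumption)."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(".rrd"):
                found.append(os.path.join(dirpath, name))
    return sorted(found)


def paced_fsync_walk(root, walk_seconds, sleep=time.sleep):
    """fsync each rrd file once, spacing the calls evenly over walk_seconds.

    Returns the number of files that were synced.
    """
    files = find_rrd_files(root)
    if not files:
        return 0
    delay = walk_seconds / len(files)
    synced = 0
    for path in files:
        try:
            fd = os.open(path, os.O_RDONLY)
        except OSError:
            continue  # file vanished or is unreadable; skip it
        try:
            os.fsync(fd)  # flush this file's dirty pages to disk
            synced += 1
        except OSError:
            pass  # some platforms reject fsync on a read-only descriptor
        finally:
            os.close(fd)
        sleep(delay)  # pace the walk so the io is spread out
    return synced


if __name__ == "__main__" and len(sys.argv) == 2:
    # Walk forever; 300 s per walk gives 2 walks per 10 minute expire time.
    while True:
        if paced_fsync_walk(sys.argv[1], 300.0) == 0:
            time.sleep(5)  # nothing found; avoid spinning
```

The per-file delay is simply the walk time divided by the number of
files, so the time spent in fsync itself makes each walk run somewhat
longer than the target.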


Scott B



On Thu, Oct 02, 2008 at 09:06:35PM +0200, Tobias Oetiker wrote:
> Hi Daniel,
> 
> this has not been discussed yet. I guess the problem is that you
> get into a region where there are not all that many people who have
> installations so large that it matters.
> 
> Normally the filesystem should be able to take care of quite a lot
> of optimization duties as long as you present it with a suitable
> workload. So maybe there is some low hanging fruit to be harvested
> by looking at when and in what order rrdcached does its updates and
> also maybe how they are performed by the rrd update code, without
> going to a raw device ...
> 
> I am pretty sure though that you can easily devise a much more
> performant system if you start limiting the way data can be
> delivered (eg all rrds look the same)
> 
> Looking at the existing code, did you investigate creating rrd
> files with a LOT of DSes ?
> 
> cheers
> tobi
> 
> 
> 
> Today Daniel.Pocock at barclayscapital.com wrote:
> 
> >
> >
> >
> > With a system such as Ganglia, there are many RRD files all created with
> > exactly the same parameters.
> >
> > While rrdcached provides benefits by caching updates to each file, we
> > still see lots of small 4kb random access writes to the SAN.
> >
> > Has anyone considered the possibility of aggregating multiple RRD files
> > with identical RRAs into a single file, or even some kind of
> > pseudo-filesystem?  This would mean that each disk block would contain
> > data from many RRDs common to a specific update time.  There would be a
> > dramatic improvement in write speed, as the OS and SAN would be caching
> > the blocks relating to the current time period, rather than racing back
> > and forth all over the disk.
> >
> > It would be particularly desirable to be able to tune the block size
> > used for interleaving, so that it could maximise filesystem and SAN
> > performance.
> >
> > Regards,
> >
> > Daniel
> >
> > _______________________________________________
> > rrd-developers mailing list
> > rrd-developers at lists.oetiker.ch
> > https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers
> >
> >
> 
> -- 
> Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
> http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900
> 


