[rrd-developers] rrd cache work

Thu Sep 4 19:56:13 CEST 2008

Hi,

Great to see this work on rrdtool; it looks like quite a bit is getting
done.

On Tue, Sep 02, 2008 at 11:15:14AM -0500, kevin brintnall wrote:
> On Tue, Sep 02, 2008 at 11:33:45AM +0200, Florian Forster wrote:
> 
> > > When creating new cache_item_t in handle_request_update,
> > > we should skew the time as follows:
> > > 
> > >   ci->last_flush_time = now + random() % config_flush_interval;
> > 
> > I see your point, but I think a much more effective way of avoiding IO
> > problems is by throttling the speed at which RRD files are written. If
> > you set this to, say, 20 updates per second, your system will stay
> > responsive and all data will end up on permanent storage eventually.
> > `Flush'ed values ignore this speed-limit, of course. I've implemented
> > this for the `rrdtool' plugin in `collectd' and it works like a charm :)
> 
> I see your point.  There are problems with the throttling approach on
> either extreme.
> 
>  * if the rate is too high, it becomes the same problem that we have now
> 
>  * if the rate is too low, then client applications may block for a long
>    time waiting for flushes, while the hardware is idle (think graph with
>    a lot of RRDs).
> 
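
To make the two proposals above concrete, here is a rough sketch of
both: the randomized flush time and a simple per-second write throttle
whose limit explicit flushes bypass. It is only an illustration, not
code from rrdtool or collectd. The struct layouts, the function names
and the fixed one-second window are all assumptions, and standard C
rand() is used where the original suggestion used random().

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical stand-in for the daemon's cache entry; only the field
 * discussed in the thread is modelled here. */
typedef struct {
    time_t last_flush_time;
} cache_item_t;

/* Skew a new entry's flush time by a random offset in
 * [0, flush_interval) so that entries created together do not all
 * come due in the same second. rand() replaces random() from the
 * original proposal purely to keep the sketch standard C. */
static void cache_item_init(cache_item_t *ci, time_t now, int flush_interval)
{
    ci->last_flush_time = now;
    if (flush_interval > 0)
        ci->last_flush_time += rand() % flush_interval;
}

/* Hypothetical fixed-window limiter: at most max_per_sec throttled
 * writes are allowed within any one wall-clock second. */
typedef struct {
    time_t window;      /* second currently being counted */
    int    written;     /* throttled writes allowed in that second */
    int    max_per_sec; /* e.g. 20, as suggested in the thread */
} write_throttle_t;

/* Returns 1 if a write may proceed now, 0 if the caller should defer
 * it. A forced write (an explicit flush) always proceeds and is not
 * counted against the limit. */
static int throttle_allow(write_throttle_t *t, time_t now, int forced)
{
    if (forced)
        return 1;
    if (now != t->window) {
        /* A new second has started: reset the counter. */
        t->window = now;
        t->written = 0;
    }
    if (t->written >= t->max_per_sec)
        return 0;
    t->written++;
    return 1;
}
```

With max_per_sec set to 20, the 21st ordinary write inside one second
is deferred, while a flush requested by a client still goes through
immediately. That is the trade-off described above: too high a limit
changes nothing, too low a limit makes flushes the only fast path.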

In our experience on Linux (CentOS 4 and 5), rrdtool performance
problems seem to be related primarily to the rate at which the OS
flushes dirty RRD pages from the filesystem cache to disk, rather than
to the rate at which the rrd_update function is called.

There is a tip on the rrdtool web site under VM optimizations:

  Setting dirty_expire_centisecs to a high value (several steps),
  while all rrd data fits into the cache, will cause your system to
  bundle up several rounds of updates before writing the dirty buffers
  back to disk.

  http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD
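
As a rough illustration of what "several steps" means in practice,
the sketch below converts an RRD step and a step count into the
centisecond value the kernel expects, and reads the current setting
for comparison. The function names are made up for this example, and
the 300-second step used in the note below is an assumption, not
something taken from the tuning page. On Linux the setting lives in
/proc/sys/vm/dirty_expire_centisecs and is expressed in hundredths of
a second.

```c
#include <stdio.h>

/* Convert "n_steps RRD steps of step_seconds each" into centiseconds,
 * the unit used by the dirty_expire_centisecs setting. */
static long expire_centisecs_for_steps(long step_seconds, long n_steps)
{
    return step_seconds * n_steps * 100L;
}

/* Read a single integer from a procfs-style file.
 * Returns -1 if the file cannot be opened or parsed. */
static long read_long_from_file(const char *path)
{
    FILE *fp = fopen(path, "r");
    long value = -1;

    if (fp == NULL)
        return -1;
    if (fscanf(fp, "%ld", &value) != 1)
        value = -1;
    fclose(fp);
    return value;
}
```

For a 300-second step and three steps, expire_centisecs_for_steps(300, 3)
gives 90000. Calling
read_long_from_file("/proc/sys/vm/dirty_expire_centisecs") shows what
the system is currently using. Changing the value requires root, and
the benefit only holds while all RRD data fits in the page cache, as
the tip says.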

Has any thought been given to addressing the update bottleneck at a
lower level?

Thanks

Scott B