[rrd-developers] rrd cache work
Scott Brumbaugh
scottb at prolexic.com
Thu Sep 4 23:46:03 CEST 2008
Hi Tobi,
On Thu, Sep 04, 2008 at 09:57:02PM +0200, Tobias Oetiker wrote:
> Hi Scott,
>
> > In our experience on Linux (CentOS 4 and 5), rrdtool performance
> > problems seem primarily related to the rate at which the OS flushes
> > the rrd updated pages in the filesystem cache to disk, not so much to
> > the actual rate at which the rrd_update function is called.
> >
> > There is a tip on the rrdtool web site under VM optimizations,
> >
> > Setting dirty_expire_centisecs to a high value (several steps),
> > while all rrd data fits into the cache, will cause your system to
> > bundle up several rounds of updates before writing the dirty buffers
> > back to disk.
> >
> > http://oss.oetiker.ch/rrdtool-trac/wiki/TuningRRD
> >
> > Wondering if any thought has been given to addressing the update
> > bottleneck problem at a lower level?
>
> In 1.3 we did address the problem of cache pollution by giving
> advice about the planned usage patterns to the OS. We also moved file
> access over to mmapped I/O. Apart from this I think the OS is the
> right place to optimize disk access.
>
> So as long as the rrdtool database format remains column oriented I
> don't see what optimization is left on that front ...

Maybe it would be worthwhile to consider an approach where rrd_update
calculates the rra values but caches them in application memory
instead of immediately dirtying pages in the OS filesystem cache. If
rrd_fetch were able to read rra values out of this same application
cache, it would go far in decoupling rrdtool from filesystem
performance.

There is a difference between this architecture and the current
rrdcache work, which relies on delaying calls to rrd_update. The latter
is definitely simpler to implement, but the former is more along the
lines of the application-level caching that is used successfully by
other database engines.
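
To make the idea concrete, here is a minimal sketch of such a write-back
cache in Python. It is not rrdtool code: the class WriteBackCache, its
update/fetch/flush methods, and the dict standing in for the on-disk rrd
files are all hypothetical, and it skips the rra consolidation step
entirely. It only shows updates being held in application memory, fetches
reading through that memory, and the backing store being touched once
per flush.

```python
from collections import defaultdict


class WriteBackCache:
    """Toy write-back cache (hypothetical, not an rrdtool API)."""

    def __init__(self, store):
        # `store` stands in for the on-disk files: name -> {timestamp: value}.
        self.store = store
        self.pending = defaultdict(dict)
        self.store_writes = 0  # number of batches written to the store

    def update(self, name, timestamp, value):
        # Keep the value in application memory only; the backing store
        # is not touched here.
        self.pending[name][timestamp] = value

    def fetch(self, name):
        # Merge stored and pending values so readers see fresh data
        # before any flush. Pending values win on conflicts.
        merged = dict(self.store.get(name, {}))
        merged.update(self.pending.get(name, {}))
        return sorted(merged.items())

    def flush(self, name=None):
        # Write pending values back, one batch per file.
        names = [name] if name is not None else list(self.pending)
        for n in names:
            values = self.pending.pop(n, None)
            if values:
                self.store.setdefault(n, {}).update(values)
                self.store_writes += 1


store = {}
cache = WriteBackCache(store)
for ts in (300, 600, 900):
    cache.update("router1.rrd", ts, ts / 100)

store_empty_before = (store == {})   # nothing written to the store yet
before_flush = cache.fetch("router1.rrd")
writes_before = cache.store_writes

cache.flush()
after_flush = cache.fetch("router1.rrd")
```

In this sketch three updates reach the store as a single batch, and a
fetch returns the same data before and after the flush. A real
implementation would also have to handle crash safety and memory limits,
which are left out here.
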
Thanks,
Scott B