[rrd-developers] rrdcached + collectd issues

Mon Oct 12 22:35:44 CEST 2009

On 12.10.2009 20:33, Thorsten von Eicken wrote:
> Thorsten von Eicken wrote:
>> - I'm wondering how we could overcome the RRD working set issue. Even
>> with rrdcached and long cache periods (e.g. I use 1 hour) it seems
>> that the system comes to a crawl if the RRD working set exceeds
>> memory. One idea that came to mind is to use the caching in rrdcached
>> to convert the random small writes that are typical for RRDs to more
>> of a sequential access pattern. If we could tweak the RRD creation
>> and the cache write-back algorithm such that RRDs are always accessed
>> in the same order, and we manage to get the RRDs allocated on disk in
>> that order, then we could use the cache to essentially do one sweep
>> through the disk per cache flush period (e.g. per hour in my case).
>> Of course on-demand flushes and other things would interrupt this
>> sweep, but the bulk of accesses could end up being more or less
>> sequential. I believe that doing the cache write-back in a specific
>> order is not too difficult, what I'm not sure of is how to make it
>> such that the RRD files get allocated on disk in that order too.
>> Any thoughts?
>>
> One further thought, instead of trying to allocate RRDs sequentially,
> if there is a way to query/detect where each RRD file is allocated on
> disk, then rrdcached could sort the dirty tree nodes by disk location
> and write them in that order. I don't know whether Linux (or FreeBSD)
> has a way to query disk location or at least to infer it.
>
> TvE
Even though Linux and Windows (and I guess most other OSes) let you
query the "logical" disk position, the physical location may be completely
unrelated to it, because modern hard drives may reallocate sectors if
they find that a particular sector cannot be read or written properly.
So any attempt to infer the physical location of data will be inaccurate.
Taking this further, the volume you write to does not even have to
exist physically: with RAID or LVM, one logical sector can correspond
to more than one physical location, and with a RAM disk it corresponds
to none at all (at least no permanent one).
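
To illustrate what querying the "logical" position can look like, here is
a minimal Python sketch, assuming Linux and a filesystem that supports the
FIEMAP ioctl. The helper names (first_extent_offset, sort_by_logical_offset)
are made up for this example and are not part of rrdcached. The offset it
returns is only what the filesystem reports relative to its block device,
so with RAID, LVM or remapped sectors it need not say anything about the
physical position.

```python
import fcntl
import os
import struct

# Values from <linux/fs.h> and <linux/fiemap.h>; this sketch is Linux-only.
FS_IOC_FIEMAP = 0xC020660B                    # _IOWR('f', 11, struct fiemap)
FIEMAP_FLAG_SYNC = 0x1                        # flush the file before mapping
FIEMAP_HEADER = struct.Struct("=QQIIII")      # struct fiemap without extents
FIEMAP_EXTENT = struct.Struct("=QQQQQIIII")   # struct fiemap_extent


def first_extent_offset(path):
    """Return the reported byte offset of the file's first extent, or None."""
    buf = bytearray(FIEMAP_HEADER.size + FIEMAP_EXTENT.size)
    # Ask for the whole file, but leave room for only one extent record.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF,
                            FIEMAP_FLAG_SYNC, 0, 1, 0)
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)
    except OSError:
        return None  # e.g. the filesystem does not support FIEMAP
    finally:
        os.close(fd)
    mapped_extents = FIEMAP_HEADER.unpack_from(buf, 0)[3]
    if mapped_extents == 0:
        return None  # nothing allocated (empty or fully sparse file)
    # Field 1 is fe_physical: an offset on the block device, not on a platter.
    return FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HEADER.size)[1]


def sort_by_logical_offset(paths, locate=first_extent_offset):
    """Order paths by reported offset; files without an offset go last."""
    def key(path):
        offset = locate(path)
        return (offset is None, offset or 0, path)
    return sorted(paths, key=key)
```

The locate parameter is there so the ordering can be tried with any other
position estimate; nothing here makes the result more "physical" than the
filesystem's own view.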

To complete the picture, there is another factor that makes this
something you do not want to do in software: modern hard drives usually
"plan" their read and write requests themselves. So when write
accesses occur right after each other, the drive will already
figure out the best way to write them, unless you enforce synchronous
request completion with e.g. O_DIRECT or the like.
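
The difference between the two modes can be sketched as follows. This is
only an illustration on a POSIX system, not a benchmark and not rrdcached
code; write_updates is a hypothetical helper. O_SYNC is used as the
stand-in for synchronous completion, since O_DIRECT additionally requires
aligned buffers.

```python
import os


def write_updates(path, updates, synchronous=False):
    """Write (offset, data) pairs to path and return the bytes written."""
    flags = os.O_WRONLY | os.O_CREAT
    if synchronous:
        # Every write has to reach the device before the call returns,
        # which leaves the kernel and the drive little room to reorder.
        flags |= os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        total = 0
        for offset, data in updates:
            total += os.pwrite(fd, data, offset)
        if not synchronous:
            # Buffered case: the writes are queued and may be reordered
            # below us; a single fsync waits for the whole batch.
            os.fsync(fd)
        return total
    finally:
        os.close(fd)
```

In the buffered case the order in which the updates are issued matters
much less, because the kernel's I/O scheduler and the drive see the whole
batch before anything has to be completed.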

Regards,
BenBE.
