[rrd-developers] [PATCH,RFC] optional mmap based file I/O

Bernhard Fischer rep.dot.nop at gmail.com
Thu May 31 01:08:52 CEST 2007

On Wed, May 30, 2007 at 11:37:02PM +0200, Tobias Oetiker wrote:
>> > A further optimization for bulk updates would be to always write a
>> > full block of data in one go, this would prevent the os from
>> > reading the current block back in ...
>> We don't really need to do anything to get this optimization since
>> it is essentially what happens already with asynchronous writes by

No. We still want to batch as many ops as possible, at as high a level
as possible. Otherwise we risk evicting hot areas too early, since we do
not know that we are about to write to this page or an adjacent one soon.

>> update/bdflush/pdflush, as long as you do a bunch of updates in a
>> small number of seconds on a page that is in cache, the writes get
>> coalesced (in each dirty page), and written out together.  This is

This is not accurate. Think about calling 'update' on 1<<n rrds,
setting 10 values for each. The earlier you coalesce those so that rrd[n]
is hit with as many updates at once as possible, the better. As tobi
mentions, if we can write a full block and then spot-set individual
values, we should try to do this as much as possible.
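
To illustrate the coalescing I mean, here is a minimal sketch, not rrdtool
code. The struct pending_update, flush_grouped() and the count_updates()
callback are all hypothetical names; a real flush callback would issue the
actual updates for one file. The idea is only: queue the updates, sort them
by file (keeping arrival order within a file), then hand each per-file run
to the callback in one go.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record: one queued update for one RRD file. */
struct pending_update {
    const char *rrd_path;  /* target file */
    const char *args;      /* update argument string, e.g. "N:42" */
    size_t      seq;       /* arrival order, preserved within a file */
};

/* Order by file name first, then by arrival order. */
static int cmp_pending(const void *a, const void *b)
{
    const struct pending_update *x = a, *y = b;
    int c = strcmp(x->rrd_path, y->rrd_path);
    if (c != 0)
        return c;
    return (x->seq > y->seq) - (x->seq < y->seq);
}

/* Callback: receives `count` updates that all target `rrd_path`. */
typedef void (*flush_fn)(const char *rrd_path,
                         const struct pending_update *batch,
                         size_t count, void *ctx);

/* Sort the queue and call `flush` once per file.
 * Returns the number of batches handed out. */
static size_t flush_grouped(struct pending_update *q, size_t n,
                            flush_fn flush, void *ctx)
{
    size_t batches = 0, start = 0;

    assert(q != NULL || n == 0);
    if (n == 0)
        return 0;

    qsort(q, n, sizeof *q, cmp_pending);
    for (size_t i = 1; i <= n; i++) {
        /* A run ends at the end of the queue or when the file changes. */
        if (i == n || strcmp(q[i].rrd_path, q[start].rrd_path) != 0) {
            flush(q[start].rrd_path, &q[start], i - start, ctx);
            batches++;
            start = i;
        }
    }
    return batches;
}

/* Stand-in callback: only counts updates; ctx points to a size_t. */
static void count_updates(const char *rrd_path,
                          const struct pending_update *batch,
                          size_t count, void *ctx)
{
    (void)rrd_path;
    (void)batch;
    *(size_t *)ctx += count;
}
```

With 1<<n files and 10 values each, every file is then touched once per
flush instead of 10 times at scattered moments.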

>> what we've observed with Dale Carder's RRDCache application level
>> cache strategy.  The CSV "journal" method that Kevin mentioned recently
>> sounds essentially the same: i.e just save up the update arguments
>> in a journal, then group them by RRD file and call update many times
>> in quick succession, periodically.
>yes, 'if the block is in cache' ... if the block is not in cache,
>it will have to be read first ... where as if we write a whole block
>in one go, there is no need to read the previous state back from
>disk first, which should result in a 50% speedup ... (not tested)

50% sounds like a bit much, but it should be noticeable, agreed.
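
For the write()-style path (not the mmap one, where touching a non-resident
page faults it in anyway), a rough sketch of the "whole block in one go"
idea follows. covers_whole_blocks() and write_block() are hypothetical
helpers, and whether the kernel really skips the read for an aligned
full-block write depends on the OS and filesystem, so treat this as an
assumption to measure, not a guarantee.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/types.h>
#include <unistd.h>

/* True if [off, off+len) starts and ends on a block boundary, i.e. the
 * write replaces whole blocks and no old contents need to be preserved. */
static bool covers_whole_blocks(off_t off, size_t len, size_t block)
{
    assert(block > 0);
    return off >= 0 && len > 0 &&
           off % (off_t)block == 0 && len % block == 0;
}

/* Write exactly one block at block index `idx`, retrying short writes.
 * Returns 0 on success, -1 on error. */
static int write_block(int fd, size_t idx, const void *buf, size_t block)
{
    const char *p = buf;
    off_t off = (off_t)idx * (off_t)block;
    size_t left = block;

    while (left > 0) {
        ssize_t n = pwrite(fd, p, left, off);
        if (n < 0) {
            if (errno == EINTR)
                continue;   /* interrupted, try again */
            return -1;
        }
        if (n == 0)
            return -1;      /* no progress, give up */
        p += n;
        off += n;
        left -= (size_t)n;
    }
    return 0;
}
```

A caller would assemble the block in memory first and only fall back to
spot-setting individual values when covers_whole_blocks() says the range is
partial.
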
>> Oh, I misunderstood.  You're trying to decide whether to test page
>> size at run-time (or compile-time)?  I'd do run-time: sysconf()
>> or getpagesize() must be nominal operations.
>yep it seems so ...

The 8k on alpha I don't know about. ia64 has a pagesize of 16k by
default, and I'm not currently worrying about folks who use 4M pages (or
whatever PAE/PGE stuff) and are concerned about rrd. Feel free to care
about those ;)

That said, sysconf at runtime should do what it is supposed to do, imho.
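
Something along these lines is what I have in mind for the run-time
query. This is a sketch only: runtime_page_size(), page_floor() and
page_ceil() are made-up names, and the 4096 fallback is my assumption for
the case where sysconf() fails.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Ask the OS for the page size once and cache the answer. */
static long runtime_page_size(void)
{
    static long cached = 0;

    if (cached == 0) {
        long ps = sysconf(_SC_PAGESIZE);
        cached = (ps > 0) ? ps : 4096;  /* assumed fallback on failure */
    }
    return cached;
}

/* Round an offset down to the start of its page. */
static size_t page_floor(size_t off)
{
    size_t ps = (size_t)runtime_page_size();

    assert((ps & (ps - 1)) == 0);       /* relies on a power-of-two size */
    return off & ~(ps - 1);
}

/* Round an offset up to the next page boundary. */
static size_t page_ceil(size_t off)
{
    size_t ps = (size_t)runtime_page_size();

    assert((ps & (ps - 1)) == 0);
    return (off + ps - 1) & ~(ps - 1);
}
```

The call is cheap and cached, so 4k, 8k or 16k pages all get handled
without a compile-time constant.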

