[rrd-developers] rrd file access code

Daniel.Pocock at barclayscapital.com Daniel.Pocock at barclayscapital.com
Mon Oct 13 22:01:34 CEST 2008


> -----Original Message-----
> From: Tobias Oetiker [mailto:tobi at oetiker.ch] 
> Sent: 13 October 2008 19:44
> To: Pocock, Daniel: IT (LDN)
> Cc: rrd-developers at lists.oetiker.ch
> Subject: RE: [rrd-developers] rrd file access code
> Today Daniel.Pocock at barclayscapital.com wrote:
> > To maximise the benefit of the striping algorithm, there is some
> > additional code (which I'm yet to complete) which ensures that a newly
> > created RRD will have its first CDPs stored at the same offset within
> > the RRD as the other stripes.  I'll try to generalise this concept to
> > work within your new framework.  Do you see any problem with bumping
> > the offset of a new RRD file in this way?
> not sure what you mean by this? you want to synchronize the RRA
> pointers? this is certainly possible. At the moment they actually get
> randomized, so that not all RRDs flip to the next diskblock at the same
> time ... :-) this reduces disk impact since whenever a new block gets
> touched it has to be read into cache first ...

Yes, the RRA pointers.

I think I've found your randomisation code in rrd_create.c (the call to

Here is what I would like to do:

- rrd_create will not call rra_random_row() directly - there will need
to be a method in rrd_open.c, perhaps call it rra_select_initial_row()

- There will be a default implementation of rra_select_initial_row(),
based on your code

- Each time rrd_update() is invoked, it will call a method in
rrd_open.c, perhaps call it rra_notify_current_row()

- The filesystem backend may choose to store the most recent value from
rra_notify_current_row(), and use that to help decide what value to
return for calls to rra_select_initial_row()

- For tighter synchronisation, it would be nice to have the pointer left
un-initialised in rrd_create().  The pointer should be initialised on
the first call to rrd_update().
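
To make the proposal above concrete, here is a rough sketch of how the two hooks might fit together. Only the names rra_select_initial_row() and rra_notify_current_row() come from the list above; the signatures, the rra_row_backend_t struct, rra_backend_init() and the RRA_NO_ROW sentinel are all assumptions made for illustration, and rand() merely stands in for whatever randomisation rrd_create.c really uses.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sentinel: no update has reported a row yet. */
#define RRA_NO_ROW ((unsigned long) -1)

/* Hypothetical per-backend state (not an existing rrdtool type). */
typedef struct rra_row_backend {
    unsigned long last_row; /* last row seen by rra_notify_current_row() */
} rra_row_backend_t;

void rra_backend_init(rra_row_backend_t *be)
{
    assert(be != NULL);
    be->last_row = RRA_NO_ROW;
}

/* Would be called from the update path with the RRA's current row. */
void rra_notify_current_row(rra_row_backend_t *be, unsigned long cur_row)
{
    assert(be != NULL);
    be->last_row = cur_row;
}

/* Would be called when a new RRA needs its starting row.
 * If the backend has seen an update, reuse that row so the new file
 * lines up with the existing ones; otherwise fall back to a random
 * row (the assumed default behaviour). */
unsigned long rra_select_initial_row(const rra_row_backend_t *be,
                                     unsigned long row_cnt)
{
    assert(be != NULL && row_cnt > 0);
    if (be->last_row != RRA_NO_ROW)
        return be->last_row % row_cnt;
    return (unsigned long) rand() % row_cnt;
}
```

A striping backend would keep one such state object per stripe set; the default backend would never record a row and so would always take the random branch.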

You are quite right about the disk block issue - however, in my striping
situation, all the RRAs (potentially thousands of them) will be striped
over the same disk blocks, so the disk blocks representing the current
row will be served by cache hits on read and write.

I notice that the mmap code is quite tightly integrated.  Do you see
that becoming a run-time option?  It would be useful to be able to
select between regular file IO, mmap or some other custom implementation
at run-time.
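
As an illustration of what run-time selection could look like, here is a small sketch using a table of function pointers. None of this is existing rrdtool code: io_backend_t, io_backend_by_name() and the backend names "file" and "mmap" are hypothetical, and the only operation shown is a read at a given offset, using plain stdio in one case and POSIX mmap() in the other.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical operations table; a real one would also need write,
 * sync, etc. */
typedef struct io_backend {
    const char *name;
    /* Copy len bytes at offset off of path into buf.
     * Returns 0 on success, -1 on error. */
    int (*read_at)(const char *path, long off, void *buf, size_t len);
} io_backend_t;

/* Regular file IO via stdio. */
static int stdio_read_at(const char *path, long off, void *buf, size_t len)
{
    FILE *fp = fopen(path, "rb");
    int rc = -1;

    if (fp == NULL)
        return -1;
    if (fseek(fp, off, SEEK_SET) == 0 && fread(buf, 1, len, fp) == len)
        rc = 0;
    fclose(fp);
    return rc;
}

/* Same operation via a read-only mapping of the whole file. */
static int mmap_read_at(const char *path, long off, void *buf, size_t len)
{
    struct stat st;
    int rc = -1;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (fstat(fd, &st) == 0 && st.st_size > 0 && off >= 0 &&
        (off_t) off <= st.st_size && (off_t) len <= st.st_size - off) {
        void *map = mmap(NULL, (size_t) st.st_size, PROT_READ,
                         MAP_PRIVATE, fd, 0);
        if (map != MAP_FAILED) {
            memcpy(buf, (const char *) map + off, len);
            munmap(map, (size_t) st.st_size);
            rc = 0;
        }
    }
    close(fd);
    return rc;
}

static const io_backend_t io_backends[] = {
    { "file", stdio_read_at },
    { "mmap", mmap_read_at },
};

/* Look a backend up by name, e.g. from an option or environment
 * variable; returns NULL if the name is unknown. */
const io_backend_t *io_backend_by_name(const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < sizeof io_backends / sizeof io_backends[0]; i++)
        if (strcmp(io_backends[i].name, name) == 0)
            return &io_backends[i];
    return NULL;
}
```

With something along these lines, a custom implementation (such as the striping backend) would just be one more entry in the table, and callers would go through the selected io_backend_t rather than calling mmap or read directly.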
