[rrd-developers] implementing portable format

Thu Oct 30 20:07:01 CET 2008

kevin brintnall wrote:
> On Thu, Oct 30, 2008 at 01:42:27PM -0500, Sfiligoi Igor wrote:
>>> PORTABLE VS. NATIVE FORMAT:
>>>
>>> * Since we pass rrd_t around so many places, it's better if we have to
>>>   handle only a single type of struct in code.  When we rrd_open() an
>>>   older file, create a V0005 struct.  Keep the previous stat_head.version
>>>   so we can tell how to handle the file (i.e. whether we have to convert
>>>   values to/from native).
>>>
>>> * If we keep the in-memory rrd_t.* in portable format, we have to convert
>>>   it in many places; some conversions are likely to get missed.  Instead,
>>>   we should convert it to native format in rrd_open.  Other code remains
>>>   largely unchanged.
>>>
>>>   * the IN-FILE rrd_t header will be in portable format
>>>   * the IN-MEMORY rrd_t header will be in machine-native format
>>>   * therefore, we can't use the mmap()'ed version directly; we'll have
>>>     to copy+convert it
>>>   * in the reverse direction, we'll have to convert it back to portable
>>>     format and memcpy() on top of the mmap version.
> 
> Igor,
> 
>> I can see performance issues with this.
>> Often only a small fraction of a RRD file is read.
> 
> The RRD header is the only part that is read every time (this happens
> already).  That is absolutely necessary to determine the RRD geometry
> (ds_def, rra_def, rra_ptr, etc.).  The header is usually VERY small
> compared to the rest of the file.
Hi Kevin.

I misunderstood your previous mail.
This indeed makes a lot of sense ;)
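
To make sure I now read the plan correctly, here is a minimal sketch of
the copy+convert step for the header. It is not rrdtool code: the
demo_* names, the three fields and the choice of big-endian 32-bit
integers as the "portable" layout are all assumptions of mine, picked
only to illustrate the idea.

```c
#include <stdint.h>

/* Hypothetical, heavily reduced header; the real stat_head_t differs. */
typedef struct {
    uint32_t ds_cnt;
    uint32_t rra_cnt;
    uint32_t pdp_step;
} demo_head_t;

/* Assumed on-disk size: three 32-bit big-endian integers. */
#define DEMO_HEAD_BYTES 12

/* Read a big-endian 32-bit value regardless of host byte order. */
static uint32_t demo_load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Write a 32-bit value in big-endian order regardless of host byte order. */
static void demo_store_be32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v >> 24);
    p[1] = (unsigned char)(v >> 16);
    p[2] = (unsigned char)(v >> 8);
    p[3] = (unsigned char)v;
}

/* Open path: copy the portable (e.g. mmap()'ed) bytes into a native struct. */
void demo_head_from_portable(const unsigned char *file_bytes, demo_head_t *out)
{
    out->ds_cnt   = demo_load_be32(file_bytes + 0);
    out->rra_cnt  = demo_load_be32(file_bytes + 4);
    out->pdp_step = demo_load_be32(file_bytes + 8);
}

/* Write-back path: convert the native struct onto the portable bytes. */
void demo_head_to_portable(const demo_head_t *in, unsigned char *file_bytes)
{
    demo_store_be32(file_bytes + 0, in->ds_cnt);
    demo_store_be32(file_bytes + 4, in->rra_cnt);
    demo_store_be32(file_bytes + 8, in->pdp_step);
}
```

The rest of the code would then only ever see the native struct, and the
portable bytes are touched in exactly two places.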

> 
>> With your proposal, the whole file needs to be loaded (and converted)
>> every time.
> 
> I am only advocating that we read+convert the header in rrd_open; not the
> entire file.  Then, we can deal with the header in native format.
> Ultimately, the header is converted before being written back to the RRD
> file.
> 
> As far as the values, RRDTool already reads/writes only the relevant parts
> of the file.  So, that's where we'd need to do the data conversion.  We
> don't have to touch other parts of the file.
OK.
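
For the values, I imagine something along these lines: only the row
that is actually accessed gets converted. Again a sketch under my own
assumptions, not rrdtool code: the demo_* names are made up, the
portable layout is assumed to be big-endian IEEE 754 binary64, and the
host double is assumed to be binary64 with the same byte order as its
64-bit integers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed size of one stored value in the portable layout. */
#define DEMO_VALUE_BYTES 8

/* Decode one big-endian binary64 value into a native double. */
double demo_load_value(const unsigned char *p)
{
    uint64_t bits = 0;
    double v;
    int i;

    for (i = 0; i < DEMO_VALUE_BYTES; i++)
        bits = (bits << 8) | p[i];
    memcpy(&v, &bits, sizeof v);   /* reinterpret the bit pattern */
    return v;
}

/* Encode a native double as big-endian binary64. */
void demo_store_value(unsigned char *p, double v)
{
    uint64_t bits;
    int i;

    memcpy(&bits, &v, sizeof bits);
    for (i = DEMO_VALUE_BYTES - 1; i >= 0; i--) {
        p[i] = (unsigned char)(bits & 0xff);
        bits >>= 8;
    }
}

/* Convert a single row of ds_cnt values; other rows are never touched. */
void demo_read_row(const unsigned char *rra_bytes, size_t row,
                   size_t ds_cnt, double *out)
{
    const unsigned char *p = rra_bytes + row * ds_cnt * DEMO_VALUE_BYTES;
    size_t i;

    for (i = 0; i < ds_cnt; i++)
        out[i] = demo_load_value(p + i * DEMO_VALUE_BYTES);
}
```

The conversion cost then scales with the number of values read or
written, not with the size of the file.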

> 
> As Tobi already noted, the bit-conversions run at a rate that greatly
> exceeds the overall processing rate of the RRD file.
Just don't bank on it.

At least in my case, RRD reads/writes are just one of the processes
running on the node.
Even if the RRD operations were no slower on an idle system, on a
heavily loaded one every CPU cycle counts.

Cheers,
  Igor