[rrd-developers] implementing portable format
kevin brintnall
kbrint at rufus.net
Thu Oct 30 19:03:06 CET 2008
I have some more ideas on the implementation... I tried to list the
categories in increasing order of difficulty. I'm sure I'm missing a few
gotchas, but these strike me as the major categories that need work.
Looking for feedback on these... Let me know if I'm off-base.
-----------------------------------------------------------------
CHOICE OF ON-DISK ENCODING:
* estimate user base, choose the native encoding of the most common
architecture as the on-disk format
- probably i386?
* choose a specific byte-string for RRD portable NAN, INF. Conversion
routines will have to test specifically for this and convert between
"RRD NAN" and "native NAN".
-----------------------------------------------------------------
ALIGNMENT:
* Use multiples of 64 bits for all header values. The wasted space won't
amount to much, and it will work on platforms that align to either 32
or 64 bits.
* To avoid changing the front of stat_head, we can start with this...
    struct stat_head {
        char cookie[4];
        char version[12];   // was version[5]
        ...
    };
Then, subsequent values can start at a 64-bit-aligned offset (cookie plus
version take up 16 bytes). strcmp(version) will work either way - the new
files will just have more '\0' at the end.
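
A quick sketch to sanity-check the layout idea. The struct name
stat_head_v5 and the ds_cnt field are placeholders I made up for
illustration; only cookie[4] and version[12] come from the proposal above.
It assumes a C99 compiler with <stdint.h>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical V0005 header. ds_cnt merely stands in for whatever
 * 64-bit value would follow version[] in the real struct. */
struct stat_head_v5 {
    char     cookie[4];
    char     version[12];   /* was version[5] */
    uint64_t ds_cnt;        /* placeholder field */
};

/* Offset of the first value after version[]. With 4 + 12 bytes in
 * front of it, no padding is needed whether the platform aligns
 * 64-bit values to 4 or to 8 bytes. */
size_t stat_head_first_offset(void)
{
    size_t off = offsetof(struct stat_head_v5, ds_cnt);
    assert(off % 8 == 0);
    return off;
}

/* Version check; works the same for a 5-byte or a 12-byte version
 * field as long as the string is NUL-terminated. */
int stat_head_version_is(const struct stat_head_v5 *h, const char *want)
{
    return strcmp(h->version, want) == 0;
}
```

The point of the 12-byte version field is that the offset comes out the
same everywhere, so the compiler never has to insert padding that differs
between platforms.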
-----------------------------------------------------------------
SUPPORT FOR OLD VERSIONS:
* Create a new stat_head.version = "0005" for the new portable RRD
* How fully should we support older RRD files? Should we handle full
read/write on V<0005 files? Doing so complicates the code path, and may
introduce new bugs. Compelling users to upgrade is not pleasant, but
perhaps acceptable across major rev?
-----------------------------------------------------------------
PORTABLE VS. NATIVE FORMAT:
* Since we pass rrd_t around so many places, it's better if we have to
handle only a single type of struct in code. When we rrd_open() an
older file, create a V0005 struct. Keep the previous stat_head.version
so we can tell how to handle the file (i.e. whether we have to convert
values to/from native).
* If we keep the in-memory rrd_t.* in portable format, we have to convert
it in many places; some conversions are likely to get missed. Instead,
we should convert it to native format in rrd_open. Other code remains
largely unchanged.
* the IN-FILE rrd_t header will be in portable format
* the IN-MEMORY rrd_t header will be in machine-native format
* therefore, we can't use the mmap()'ed version directly; we'll have
to copy+convert it
* in the reverse direction, we'll have to convert it back to portable
format and memcpy() on top of the mmap version.
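
To make the copy+convert flow concrete, here is a minimal sketch. It
assumes the portable format is little-endian (the i386 choice above) and
uses a made-up two-field header; mini_head, mini_head_load and
mini_head_store are hypothetical names, not existing rrdtool code. The
"mapped" pointer stands for the start of the mmap()'ed header.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, much-reduced in-memory (native) header. */
struct mini_head {
    uint64_t ds_cnt;
    uint64_t rra_cnt;
};

#define MINI_HEAD_DISK_SIZE 16   /* two 64-bit values on disk */

/* Decode 8 little-endian bytes; independent of host byte order. */
static uint64_t get_le64(const unsigned char *p)
{
    uint64_t v = 0;
    int i;
    for (i = 7; i >= 0; i--)
        v = (v << 8) | p[i];
    return v;
}

/* Encode a value as 8 little-endian bytes. */
static void put_le64(unsigned char *p, uint64_t v)
{
    int i;
    for (i = 0; i < 8; i++) {
        p[i] = (unsigned char)(v & 0xff);
        v >>= 8;
    }
}

/* Open path: portable bytes in the mapping -> private native copy. */
void mini_head_load(struct mini_head *dst, const unsigned char *mapped)
{
    assert(dst != NULL && mapped != NULL);
    dst->ds_cnt  = get_le64(mapped);
    dst->rra_cnt = get_le64(mapped + 8);
}

/* Flush path: native copy -> portable bytes, then memcpy() on top
 * of the mapping. */
void mini_head_store(unsigned char *mapped, const struct mini_head *src)
{
    unsigned char tmp[MINI_HEAD_DISK_SIZE];
    assert(mapped != NULL && src != NULL);
    put_le64(tmp, src->ds_cnt);
    put_le64(tmp + 8, src->rra_cnt);
    memcpy(mapped, tmp, sizeof tmp);
}
```

Everything between load and store only ever sees the native struct, which
is what keeps the rest of the code largely unchanged.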
-----------------------------------------------------------------
FORMAT CONVERSION:
* as shown on the Wiki, we can determine the native encoding at build time
with a "union".
- we can check the byte values for a value whose encoding is well-known
- i.e. "5.44760372201160503468005645008891e-270" for doubles
- This encodes to bytes[8] = {1,2,3,4,5,6,7,8} on i386.
- This encodes to bytes[8] = {8,7,6,5,4,3,2,1} on SPARC.
==> We can use this to generate a .c/.h file that contains the
native-to-portable and portable-to-native conversion macros.
==> We can determine byte encoding at build time. We won't need a
catalog of architectures and associated conversion macros.
* I'm thinking something analogous to the ntohs()/htons() functions that
can be used to convert each data type from native-format to
portable-format, e.g. htorrd_d(double), htorrd_i(int64_t) or something
similar.
* create utility functions to simplify... optionally they can determine
whether conversion is necessary based on stat_head.version.
- read and convert-to-native (double/int64_t)
- convert-to-portable and write (double/int64_t)
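
A rough sketch of the pieces above, for doubles only. Assumptions, all of
them mine: the probe runs at run time here (the proposal is to do it at
build time and generate macros); the portable format is little-endian;
sizeof(double) is 8; the "RRD NAN" byte string is the quiet NaN
0x7FF8000000000000, picked purely for illustration. htorrd_d is the name
suggested above, but here it writes into a byte buffer instead of
returning a double; rrdtoh_d is a hypothetical counterpart.

```c
#include <assert.h>
#include <math.h>
#include <string.h>

enum rrd_order { RRD_ORDER_LITTLE, RRD_ORDER_BIG, RRD_ORDER_UNKNOWN };

/* Hypothetical on-disk byte string for "RRD NAN" (little-endian). */
static const unsigned char RRD_NAN_BYTES[8] = {0, 0, 0, 0, 0, 0, 0xF8, 0x7F};

/* Probe the native double encoding with the well-known value. */
enum rrd_order rrd_double_order(void)
{
    union { double d; unsigned char b[8]; } probe;
    static const unsigned char le[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    static const unsigned char be[8] = {8, 7, 6, 5, 4, 3, 2, 1};

    assert(sizeof(double) == 8);
    probe.d = 5.44760372201160503468005645008891e-270;
    if (memcmp(probe.b, le, 8) == 0) return RRD_ORDER_LITTLE;
    if (memcmp(probe.b, be, 8) == 0) return RRD_ORDER_BIG;
    return RRD_ORDER_UNKNOWN;
}

/* Native double -> 8 portable bytes. Any native NaN is mapped to the
 * single "RRD NAN". Returns 0 on success, -1 if the native encoding
 * was not recognized. */
int htorrd_d(double v, unsigned char out[8])
{
    unsigned char tmp[8];
    enum rrd_order o = rrd_double_order();
    int i;

    if (o == RRD_ORDER_UNKNOWN) return -1;
    if (isnan(v)) {
        memcpy(out, RRD_NAN_BYTES, 8);
        return 0;
    }
    memcpy(tmp, &v, 8);
    for (i = 0; i < 8; i++)
        out[i] = (o == RRD_ORDER_LITTLE) ? tmp[i] : tmp[7 - i];
    return 0;
}

/* 8 portable bytes -> native double. "RRD NAN" becomes native NAN. */
int rrdtoh_d(const unsigned char in[8], double *v)
{
    unsigned char tmp[8];
    enum rrd_order o = rrd_double_order();
    int i;

    if (o == RRD_ORDER_UNKNOWN) return -1;
    if (memcmp(in, RRD_NAN_BYTES, 8) == 0) {
        *v = NAN;
        return 0;
    }
    for (i = 0; i < 8; i++)
        tmp[i] = (o == RRD_ORDER_LITTLE) ? in[i] : in[7 - i];
    memcpy(v, tmp, 8);
    return 0;
}
```

INF and the int64_t variants would follow the same pattern. A wrapper
that looks at stat_head.version could skip the conversion entirely for
older, native-format files.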
-----------------------------------------------------------------
Am I missing anything?
--
kevin brintnall =~ /kbrint at rufus.net/