[rrd-users] Odd mismatch in fetch results (via Perl/CLI)

Marco Marongiu brontolinux at gmail.com
Wed Jan 5 18:28:40 CET 2011


Hello *

I am fetching data from RRD files via Perl modules. For all files but
one I get the results I expect. For that one file, the data looks as if
I had shifted the start/end interval one hour back. If I fetch from the
command line using "rrdtool fetch", the data comes out as expected.

I am using:
Ubuntu Linux 10.04.1 LTS
rrdtool 1.3.8 (ubuntu package)
RRDs version 1.3008 (perl -MRRDs -le 'print $RRDs::VERSION')
RRDTool::OO 0.28 (perl -MRRDTool::OO -le 'print $RRDTool::OO::VERSION')

If you think you can help, please keep reading. Detailed explanation
follows.

------------------------------------------------------------------------

I am creating a report using data from ~300 RRD files. All but one come
from Cacti, while the last one is created and updated by a Perl script
of mine. All the files come from the same machine, and the system clock
is set to UTC.

My script uses RRDTool::OO which, in turn, uses the RRDs module. It
parses an XML file, fetches some numbers, and then calls:

$rrd->update(values => $hostmetric{$metric}) ;

which results in update being called with a timestamp of "N".

When the report is due, I dump all the RRD files to XML, copy them over
to a "reporting machine", where I rebuild the RRD files from XML,
aggregate some of them in fewer RRD files, and then make the report. The
"special" file mentioned above is also rebuilt and used unchanged.

To aggregate data, for each RRD file I call:

      my ($start,$step,$names,$data) =
        RRDs::fetch($sourcefile,$cf,
                    '--start',$tsstart,
                    '--end',  $tsend) ;

$tsstart and $tsend are always the same for all RRDs.
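
Since the same window is applied to files with different steps, a small sketch may help show where the first returned row can land for each step. The alignment rule used here (round the requested start down to a multiple of the step, then report the first row one step later) is an assumption about how fetch behaves, and first_row_epoch is a hypothetical helper, not part of the script:

```perl
use strict;
use warnings;

# Assumed model of fetch alignment (not taken from the original script):
# round the requested start down to a multiple of the step, then place
# the first row one step after that aligned start.
sub first_row_epoch {
    my ($requested_start, $step) = @_;
    my $aligned = $requested_start - ($requested_start % $step);
    return $aligned + $step;
}

my $tsstart = 1291158000;    # the -s value used with "rrdtool fetch" in this post
for my $step (300, 3600) {   # 5-minute and hourly steps, as seen in the outputs
    printf "step %4d -> first row at %d\n",
        $step, first_row_epoch($tsstart, $step);
}
```

Under that assumption, files with different steps do not start on the same first timestamp even though $tsstart is identical.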

For each RRD, I convert timestamps to a readable form using:

        @$timestamps =
          map { scalar gmtime( $start + $_ * $step ) } (0..$#$data) ;

(note gmtime is used here).
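
As a self-contained illustration of the conversion above, here is a minimal sketch that does not need RRDs at all. The helper name row_timestamps and the sample start/step values are made up for the example; only the map/gmtime expression mirrors the script:

```perl
use strict;
use warnings;

# Turn a (start, step, row count) triple into one UTC string per row.
sub row_timestamps {
    my ($start, $step, $nrows) = @_;
    return [ map { scalar gmtime( $start + $_ * $step ) } 0 .. $nrows - 1 ];
}

# Hypothetical values: an hourly series whose first row is 2010-12-01 00:00 UTC.
my $stamps = row_timestamps(1291161600, 3600, 3);
print "$_\n" for @$stamps;
# Wed Dec  1 00:00:00 2010
# Wed Dec  1 01:00:00 2010
# Wed Dec  1 02:00:00 2010
```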

This is where it gets strange. For the aggregated files I get the data
I expect, e.g.:

Wed Dec  1 00:00:00 2010	10,61
Wed Dec  1 01:00:00 2010	10,61
Wed Dec  1 02:00:00 2010	10,34
...

whereas the data from my own RRD file looks as if I had called fetch
with the start/end interval shifted one hour back:

Tue Nov 30 23:05:00 2010
Tue Nov 30 23:10:00 2010
Tue Nov 30 23:15:00 2010
...
Fri Dec 31 22:55:00 2010
Fri Dec 31 23:00:00 2010
Fri Dec 31 23:05:00 2010

This means that I get one hour of NaNs at the top of the report, and one
hour less data at the bottom.

At first, I thought it depended on the way the files are updated (since
the start/end timestamps don't change). While, as I said, my "special"
file is updated with an implicit timestamp, my aggregate RRDs are
updated with explicit timestamps, as in:

    $aggrrd->update( time   => $ts,
                     values => $combined{$ts} ) ;

which will call update with a timestamp of $ts.

But this doesn't make sense: if I use "rrdtool fetch" directly from the
command line and with the same parameters as the script, I get the
values I expect:

$ rrdtool fetch rrd/t*.rrd AVERAGE -s 1291158000 -e 1293836400 |
    awk '/:/ { print $1 }' | head -n 3
1291158300:
1291158600:
1291158900:
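
To compare these epoch values with the gmtime strings printed by the script, one option is to render them in UTC as well. This is only a sketch using core Perl; epoch_to_utc is a hypothetical helper, and the list of values is copied from the command and output above:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Render an epoch value as a UTC string.
sub epoch_to_utc {
    my ($epoch) = @_;
    return strftime('%Y-%m-%d %H:%M:%S UTC', gmtime($epoch));
}

# The -s and -e values, then the first three row timestamps from the output.
for my $epoch (1291158000, 1293836400, 1291158300, 1291158600, 1291158900) {
    printf "%d  %s\n", $epoch, epoch_to_utc($epoch);
}
```

Printing the requested bounds and the returned row timestamps in the same time zone makes the CLI output directly comparable with what the script reports.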

Any ideas?

Thanks in advance

Ciao
--bronto


