[rrd-developers] [PATCH] Re: bug: "phase shift" in holt-winters predictions

Evan Miller emiller at imvu.com
Tue Aug 14 23:43:16 CEST 2007


I'm attaching a patch to rrd_update.c against trunk that fixes the 
phase-shift bug described below.

When one or more primary data points were missed, the SEASONAL and 
DEVSEASONAL archives were marked as up-to-date so that they would not 
be written to. Skipping the write was correct, but the code also failed 
to advance the row pointers within the SEASONAL and DEVSEASONAL 
archives, so subsequent updates went to the wrong location in the 
archives.
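
To make the failure mode concrete, here is a minimal sketch in C of a 
circular seasonal archive whose row pointer is left behind when a step 
is missed. It is a toy model, not the rrd_update.c code: toy_seasonal_t, 
buggy_step, expected_row and phase_lag are hypothetical names made up 
for illustration.

```c
#include <assert.h>

/* Toy model of a seasonal archive: a ring of row_cnt rows, one per step
 * of the season. Hypothetical type, not RRDtool's own structures. */
typedef struct {
    unsigned long row_cnt;  /* rows per season */
    unsigned long cur_row;  /* current position in the ring */
} toy_seasonal_t;

/* Buggy behavior: a step with a missed data point is treated as
 * "up-to-date", so neither the write nor the pointer advance happens. */
static void buggy_step(toy_seasonal_t *a, int missed)
{
    assert(a->row_cnt > 0);
    if (missed)
        return;                       /* pointer is left behind */
    a->cur_row = (a->cur_row + 1) % a->row_cnt;
}

/* Row the pointer should be on after `steps` elapsed steps, counting
 * time regardless of whether observations arrived. */
static unsigned long expected_row(unsigned long start, unsigned long steps,
                                  unsigned long row_cnt)
{
    return (start + steps) % row_cnt;
}

/* Runs `steps` steps with an outage covering [gap_start, gap_end) and
 * returns how many rows the pointer lags behind where it should be. */
static unsigned long phase_lag(unsigned long row_cnt, unsigned long steps,
                               unsigned long gap_start, unsigned long gap_end)
{
    toy_seasonal_t a;
    unsigned long t, want;

    a.row_cnt = row_cnt;
    a.cur_row = 0;
    for (t = 0; t < steps; t++)
        buggy_step(&a, t >= gap_start && t < gap_end);
    want = expected_row(0, steps, row_cnt);
    return (want + row_cnt - a.cur_row) % row_cnt;
}
```

With an 8-row season, 16 steps, and an outage covering steps 9 through 
11, phase_lag(8, 16, 9, 12) returns 3: in this model every later update 
lands three rows behind its proper place in the season.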

Rather than mark these archives as up-to-date (by setting 
rra_step_cnt[rra_idx] = 0), my patch allocates a new "skip_update" array 
whose entries are set to 1 for SEASONAL and DEVSEASONAL archives that 
have missed one or more primary data points. The cur_row pointer is now 
advanced for every archive, and the skip_update array is checked just 
before the changes are actually written out.
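
The idea can be sketched roughly as follows. This is a simplified 
illustration, not the patch itself: toy_rra_t, toy_update, toy_run and 
the in-memory rows array are hypothetical stand-ins, and only the 
cur_row and skip_update names come from the description above.

```c
#include <assert.h>

#define TOY_ROWS 8  /* hypothetical season length */

/* Hypothetical in-memory stand-in for one SEASONAL archive. */
typedef struct {
    double        rows[TOY_ROWS];
    unsigned long cur_row;   /* row the current step belongs to */
    unsigned long writes;    /* number of rows actually written */
} toy_rra_t;

/* Process one step. The pointer always advances; skip_update only
 * suppresses the write, mirroring a check made just before writing. */
static void toy_update(toy_rra_t *rra, int skip_update, double value)
{
    assert(rra != 0);
    rra->cur_row = (rra->cur_row + 1) % TOY_ROWS;
    if (skip_update)
        return;              /* missed data point: leave the row untouched */
    rra->rows[rra->cur_row] = value;
    rra->writes++;
}

/* Feeds `steps` steps with an outage over [gap_start, gap_end) and
 * returns the final row pointer. */
static unsigned long toy_run(toy_rra_t *rra, unsigned long steps,
                             unsigned long gap_start, unsigned long gap_end)
{
    unsigned long t;

    for (t = 0; t < steps; t++)
        toy_update(rra, t >= gap_start && t < gap_end, (double)t);
    return rra->cur_row;
}
```

In this model, running two 8-step seasons with a three-step outage 
leaves cur_row exactly where it would be without the outage, and the 
rows belonging to the missed steps keep their values from the previous 
season.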

The bug demonstration script supplied in my previous email now passes.

The patch contains a few touch-ups as well (spelling fixes and a trivial 
interface change to write_RRA_row).

Please give it a whirl!

Evan

Evan Miller wrote:
> I'd like to report a bug that appears in RRDtool 1.2.15, 
> 1.2.99907052400, and 1.2.99907080300.
> 
> The problem: When a Holt-Winters archive misses several updates, the 
> predictions for those timestamps are postponed rather than ignored. For 
> example, if data collection halts at 1 PM and resumes at 2 PM, then the 
> predictions will be shifted by an hour; the 1:15 PM prediction will 
> appear at 2:15 PM, the 1:30 PM prediction will appear at 2:30 PM, etc. A 
> data outage can corrupt all the predictions in an archive in this way.
> 
> Below is a test script that demonstrates the problem. In the first 
> season, there is a spike at T=T_0 + 4 (where T_0 is the first 
> timestamp). If a season has length S, we'd expect a spike in the 
> prediction for T=T_0 + S + 4; however, if I introduce a data outage for 
> T=T_0 + S + 1 .. T_0 + S + 3, the prediction spike moves to T=T_0 + S + 
> 8. RRDtool apparently treats a season as a period of time for which 
> observations exist; I believe the correct behavior is to treat a season 
> as a period of time regardless of whether observations are available.
> 
> I'll try to find the source of the bug myself, but I'd appreciate any 
> insight, patches, or workarounds people might have concerning it.
> 
> Evan
> 
> 
> 
> 
> #!/usr/bin/perl -w
> 
> use RRDs 1.2015;
> use Data::Dumper;
> use strict;
> 
> # Updates an RRD file, creating it if it doesn't exist
> sub rrd_update($$$$$$@) {
>      my ($path, $type, $time, $step, $readings, $archives, $heartbeat) = @_;
> 
>      $heartbeat ||= 5;
> 
>      if (!-f $path) {
>          my @args = ($path, "--start", $time - $step, "--step", $step,
>                  (map {
>                   "DS:$_:$type:".($step*$heartbeat).":U:U"
>                   } keys %$readings),
>                  @$archives);
>          RRDs::create @args;
>          return "RRDtool create error: ".join(' ', @args).RRDs::error if RRDs::error;
>      }
>      my $info = RRDs::info($path);
>      RRDs::update $path, "--template", join(":", sort keys %$readings),
>              "$time:".join(":", map { $$readings{$_} } sort keys %$readings);
> 
>      return "RRDtool update error on $path: ".RRDs::error if RRDs::error;
> }
> 
> sub rrd_hw_archive_definition($$$$) {
>      my ($period, $alpha, $beta, $gamma) = @_;
>      my $threshold = 3;
>      my $window    = 5;
>      return [
>              "RRA:HWPREDICT:".(2*$period).":$alpha:$beta:$period:2",  # 1
>              "RRA:SEASONAL:$period:$gamma:1",                         # 2
>              "RRA:DEVPREDICT:".(2*$period).":4",                      # 3
>              "RRA:DEVSEASONAL:$period:$gamma:1",                      # 4
>              "RRA:FAILURES:$period:$threshold:$window:4",             # 5
>              ];
> }
> 
> my $period = 8;
> my $rrd_file = "/tmp/hwshift.rrd";
> 
> unlink $rrd_file;
> 
> my $hwarchives = rrd_hw_archive_definition($period, 0.1, 0.2, 0.3);
> 
> my $step = 1;
> 
> my $time = 1000000000;
> 
> my $counter;
> 
> rrd_update $rrd_file, "COUNTER", $time, $step, { reading => $counter = 0 }, $hwarchives, 1;
> 
> # period 1
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> # spike occurs here
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 10 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> 
> # period 2
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> # missed updates
> $time++; $counter += 1;
> $time++; $counter += 1;
> # spike occurs here again
> $time++; $counter += 10;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> rrd_update $rrd_file, "COUNTER", ++$time, $step, { reading => $counter += 1 }, $hwarchives;
> 
> my ($start, $rrd_step, $names, $data) = RRDs::fetch($rrd_file, "HWPREDICT",
>                  "-s", $time - 1, "-e", $time - 1, "-r", $step);
> 
> print "Most recent prediction should be 1, is $$data[0][0]\n";

-------------- next part --------------
A non-text attachment was scrubbed...
Name: rrdtool-phase-shift.patch
Type: text/x-patch
Size: 15736 bytes
Desc: not available
Url : http://lists.oetiker.ch/pipermail/rrd-developers/attachments/20070814/033b1c48/attachment-0001.bin 

