[rrd-developers] Re: Restarting devices

Y. Brett Sauer brett at wamnet.com
Mon Jan 31 17:08:21 MET 2000


Hi.  I implemented Alex's solution below (*way* below) for the
restarting-devices scenario and did a little testing.  I read the data
in from two files, one for cust1 and one for cust2.  The only
difference is that cust2 experienced a counter reset approximately 500
seconds before one of its samples, and the data point just before the
reset is missing (otherwise, rrdtool update complained about an attempt
to update with a timestamp older than that of the last update).
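
To illustrate why that data point had to go, here is a minimal sketch.
It assumes the reset handling writes its first value one second before
the estimated reset time (tstamp - sysuptime/100 - 1), as in the Perl
code further down.  The helper name first_injected_tstamp is made up
for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: earliest timestamp the reset handling will write,
# i.e. one second before the estimated reset time.
sub first_injected_tstamp {
    my ($tstamp, $sysuptime_ticks) = @_;
    return $tstamp - int($sysuptime_ticks / 100) - 1;
}

# cust2 sample taken after the reset: sysUpTime 50000 ticks at 949002007.
my $first = first_injected_tstamp(949002007, 50000);

# rrdtool update only accepts timestamps newer than the last update, so
# compare against the last sample kept (949001407) and the one dropped
# (949001707).
for my $last_update (949001407, 949001707) {
    my $verdict = $first > $last_update ? 'accepted' : 'rejected';
    print "last update $last_update, injecting $first: $verdict\n";
}
```

With the 949001707 sample present, the injected "U" would carry an
older timestamp than the last update, which is the complaint mentioned
above.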

Realizing that my numbers are impossible in the real world, I am
disappointed at the appearance of the 132917.92 and 128092.54 values for
average utilization in octets/second.  The previous calculated value for
utilization was, presumably, changed to a NaN in the archive, because it
exceeded the maximum allowable value, but the corresponding counter
value (or is it the utilization value?) appears to have been taken into
account in the next calculation.  Am I correct about this discrepancy?


Below is the input data for cust1 and cust2, respectively, and the fetch
results for both customers' data.

22 ds :) cat cust1
#sysup    tstamp    octetsin        octetsout
70000 949000207 2848180237 2744742304
70000 949000508 2848192085 2744807516
70000 949000807 2848192085 2744807516
70000 949001107 2848206564 2744818912
70000 949001407 2848207042 2744819185
70000 949001707 2848207648 2744819601
70000 949002007 2848207648 2744819601
70000 949002308 2848208129 2744819897
70000 949002607 2848208129 2744819897
70000 949002907 2848278730 2745047871
70000 949003207 2848294885 2745126310

23 ds :) cat cust2
#sysup    tstamp    octetsin        octetsout
70000 949000207 2848180237 2744742304
70000 949000508 2848192085 2744807516
70000 949000807 2848192085 2744807516
70000 949001107 2848206564 2744818912
70000 949001407 2848207042 2744819185
50000 949002007 2848207648 2744819601
70000 949002308 2848208129 2744819897
70000 949002607 2848208129 2744819897
70000 949002907 2848278730 2745047871
70000 949003207 2848294885 2745126310


18 ds :) head -20 fetch1.cust1.brett
                 octetsin     octetsout
 948999900: nan0x7fffffff nan0x7fffffff
 949000200: nan0x7fffffff nan0x7fffffff
 949000500:         39.36        216.65
 949000800:          1.05          5.78
 949001100:         47.14         37.10
 949001400:          2.68          1.78
 949001700:          2.01          1.38
 949002000:          0.05          0.03
 949002300:          1.56          0.96
 949002600:          0.04          0.03
 949002900:        229.85        742.18
 949003200:         58.08        273.09
 949003500: nan0x7fffffff nan0x7fffffff
 949003800: nan0x7fffffff nan0x7fffffff
 949004100: nan0x7fffffff nan0x7fffffff
 949004400: nan0x7fffffff nan0x7fffffff
 949004700: nan0x7fffffff nan0x7fffffff
 949005000: nan0x7fffffff nan0x7fffffff

43 ds :) head -20 /tmp/fetch1.cust2.brett
                 octetsin     octetsout
 948999900: nan0x7fffffff nan0x7fffffff
 949000200: nan0x7fffffff nan0x7fffffff
 949000500:         39.36        216.65
 949000800:          1.05          5.78
 949001100:         47.14         37.10
 949001400:          2.68          1.78
 949001700: nan0x7fffffff nan0x7fffffff
 949002000: nan0x7fffffff nan0x7fffffff
 949002300:     132917.92     128092.54
 949002600:          0.04          0.03
 949002900:        229.85        742.18
 949003200:         58.08        273.09
 949003500: nan0x7fffffff nan0x7fffffff
 949003800: nan0x7fffffff nan0x7fffffff
 949004100: nan0x7fffffff nan0x7fffffff
 949004400: nan0x7fffffff nan0x7fffffff
 949004700: nan0x7fffffff nan0x7fffffff
 949005000: nan0x7fffffff nan0x7fffffff
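
For reference, the raw rates implied by the reset update can be worked
out by hand.  The sketch below only assumes that a COUNTER rate is the
counter difference divided by the elapsed seconds, and compares it with
the DS maximum of 256000.  It does not try to reproduce how rrdtool
arrives at the 132917.92 and 128092.54 values; raw_rate is a
hypothetical helper.

```perl
use strict;
use warnings;

my $ds_max = 256000;    # upper limit from the DS definitions

# Hypothetical helper: raw rate between two counter readings.
sub raw_rate {
    my ($t0, $v0, $t1, $v1) = @_;
    return ($v1 - $v0) / ($t1 - $t0);
}

# Reset update for cust2: counters assumed 0 at 949001507, then the
# real readings at 949002007.
my %after_reset = (
    octetsin  => raw_rate(949001507, 0, 949002007, 2848207648),
    octetsout => raw_rate(949001507, 0, 949002007, 2744819601),
);

for my $ds (sort keys %after_reset) {
    my $rate = $after_reset{$ds};
    printf "%-9s %.2f octets/s (%s DS max)\n",
        $ds, $rate, $rate > $ds_max ? 'above' : 'within';
}
```

Both raw rates come out in the millions of octets/second, far above the
DS maximum, so I would expect that interval to be stored as unknown.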


Below is the debug output for the updates and some of the Perl code.

[1]:[cust1]
create_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool create /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd --step 300 DS:octetsin:COUNTER:600:0:256000 DS:octetsout:COUNTER:600:0:256000 RRA:AVERAGE:0.5:1:2304
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949000207:2848180237:2744742304
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949000508:2848192085:2744807516
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949000807:2848192085:2744807516
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949001107:2848206564:2744818912
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949001407:2848207042:2744819185
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949001707:2848207648:2744819601
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949002007:2848207648:2744819601
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949002308:2848208129:2744819897
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949002607:2848208129:2744819897
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949002907:2848278730:2745047871
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust1.rrd 949003207:2848294885:2745126310
update_RRD successful

[2]:[cust2]
create_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool create /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd --step 300 DS:octetsin:COUNTER:600:0:256000 DS:octetsout:COUNTER:600:0:256000 RRA:AVERAGE:0.5:1:2304
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949000207:2848180237:2744742304
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949000508:2848192085:2744807516
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949000807:2848192085:2744807516
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949001107:2848206564:2744818912
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949001407:2848207042:2744819185
update_RRD successful
sysuptime (in timeticks): [50000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949001506:U:U 949001507:0:0 949002007:2848207648:2744819601
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949002308:2848208129:2744819897
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949002607:2848208129:2744819897
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949002907:2848278730:2745047871
update_RRD successful
sysuptime (in timeticks): [70000]
update_RRD executing /usr/local/rrdtool-1.0.10/bin/rrdtool update /usr/local/rrdtool-1.0.10/data/testasusage/cust2.rrd 949003207:2848294885:2745126310
update_RRD successful


$main::start_time = 949000000;

%main::customers = (
    1 => 'cust1',
    2 => 'cust2',
);
# minimum value for sysUpTime indicating that device has not been
# recently reset
$main::min_sys_uptime = 60000; # 600 seconds, in timeticks

$main::DS_min = 0;
$main::DS_max = 256000;

foreach (sort { $a <=> $b } keys(%main::customers)) {
    my $customer = $main::customers{$_};

    ## Call sub to create RRD for new customer;
    ## does nothing if customer RRD already exists
    if (! defined create_RRD($customer)) {
        warn "create_RRD failed for [$customer]\n";
    }

    $main::input_data_file = join '/', $main::input_data_dir, $customer;

    my $fh = FileHandle->new($main::input_data_file)
        or die "cannot open $main::input_data_file: $!\n";
    my @data = <$fh>;
    close $fh;

    foreach (@data) {
        my $reset = 0;
        chomp $_;
        my ($sysuptime, $tstamp, $octetsin, $octetsout) = split /\s+/, $_;
        ## Check to see whether device was recently reset
        debug("sysuptime (in timeticks): [$sysuptime]");
        if ($sysuptime < $main::min_sys_uptime) {
            $reset = 1;
        }
        ## Check for device reset flag
        if ($reset > 0) {
            ## device appears to have been reset;
            ## first insert unknown value into RRD for interval prior to reset
            ## then estimate last value (zero), based on sysUpTime,
            ## and update RRD with this and current value

            ## sysUpTime is in timeticks; int() keeps the timestamps whole
            my $reset_time = $tstamp - int($sysuptime / 100);
            my $previous_data = join ':', $reset_time - 1, 'U', 'U';
            my $reset_data = join ':', $reset_time, 0, 0;
            my $current_data = join ':', $tstamp, $octetsin, $octetsout;

            if (! defined update_RRD($customer, $previous_data, $reset_data, $current_data)) {
                warn "update_RRD failed for [$customer]\n";
            } else {
                debug("update_RRD successful");
            }
        } else {
            ## Update RRD with new data
            if (! defined update_RRD($customer, (join ':', $tstamp, $octetsin, $octetsout))) {
                warn "update_RRD failed for [$customer]\n";
            } else {
                debug("update_RRD successful");
            }
        }
    }
}

sub update_RRD {
    my ($host, @data) = @_;
    my $data_string;
    if (scalar(@data) > 1) {
        $data_string = join ' ', @data;
    } else {
        $data_string = $data[0];
    }
    debug("update_RRD executing ${main::RRD_exec_path}/rrdtool update ${main::RRD_data_path}/${host}.rrd $data_string");
    chomp(my $ret = `${main::RRD_exec_path}/rrdtool update ${main::RRD_data_path}/${host}.rrd $data_string 2>&1`);
...
}
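
As a cross-check of the reset branch, here is a self-contained sketch
that builds the three update arguments for one input line.  The
function name reset_updates is hypothetical and not part of my script;
it assumes sysUpTime is in timeticks and that the counters restart at
zero.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the reset branch above: returns the three
# update arguments (unknown marker, estimated zero point, current sample).
sub reset_updates {
    my ($sysuptime, $tstamp, @values) = @_;
    my $reset_time = $tstamp - int($sysuptime / 100);
    return (
        join(':', $reset_time - 1, ('U') x @values),
        join(':', $reset_time,     (0)   x @values),
        join(':', $tstamp,         @values),
    );
}

# The cust2 line that triggered the reset handling.
my @args = reset_updates(split /\s+/, '50000 949002007 2848207648 2744819601');
print join(' ', @args), "\n";
```

For the cust2 line this prints the same three arguments that appear in
the debug output for the [50000] update.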


Thanks much.

Brett
--

Alex van den Bogaerdt wrote:

> Luis F Balbinot wrote:
> >
> > I had the same spike problems that Bert Driehuis had when the equipment
> > is reset, which makes all my graphs unusable. Yes, I can force the min
> > and max with 0 and the ifSpeed value, but it might not work all the
> > time. Some spikes might appear.
> >
> > What should I do? My idea was to check manually if the current value
> > is lower than the last value, and then store a NaN in this case.
> >
>
> This is a workaround that might work.  The proper solution would be to
> check for the device reset.  If sysUptime has a low value, the device
> will have been reset.
>
> Suppose you take samples about 300 seconds apart.  Suppose sysUptime is
> 29000.  In that case, the device must have been reset between the
> current and the previous update.  To be on the safe side, assume the
> device has been reset if sysUptime is lower than 600 seconds (60000).
>
> (note:
>      I assume here that sysUptime is in hundredths of a second.
>      is this always true?)
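
The standard MIB defines sysUpTime as TimeTicks, i.e. hundredths of a
second, so the threshold check can be sketched as below.  The helper
name recently_reset and the sample values are only for illustration;
600 seconds is the safety margin suggested above.

```perl
use strict;
use warnings;

# Hypothetical helper: true if the device rebooted within the threshold.
# sysUpTime is given in timeticks (hundredths of a second).
sub recently_reset {
    my ($sysuptime_ticks, $threshold_seconds) = @_;
    return $sysuptime_ticks / 100 < $threshold_seconds ? 1 : 0;
}

for my $ticks (29000, 50000, 70000) {
    printf "%6d ticks = %3d s: %s\n", $ticks, $ticks / 100,
        recently_reset($ticks, 600) ? 'treat as reset' : 'normal update';
}
```
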
>
> To signal the unknown interval caused by the reset, the front end
> should insert "U" into the database.
>
> If the current time is 948669019, just insert "U" one second before
> and insert the correct value with the current timestamp.  Including
> the last update this would look like:
>
> previous:   rrdtool update my.rrd 948668705:1000
> current:    rrdtool update my.rrd 948669018:U 948669019:500
> future:     rrdtool update my.rrd 948669309:1200
>
> The current update inserts "U" and therefore the interval from
> 948668705 to 948669018 is unknown.  It also inserts 500 at time
> 948669019 which happens to be the current time.
>
> The interval from 948669019 to 948669309 will be known as there are
> both start and end times available.  You only lose the update from
> 948668705 to 948669019 and this is, indeed, unknown due to the reset.
>
> Real brave people use sysUptime and current time to estimate when the
> reset took place.  Assuming sysUptime had a value of 10000, you could
> do:
>
>    previous:   rrdtool update my.rrd 948668705:1000
>    current:    rrdtool update my.rrd 948668918:U 948668919:0 948669019:500
>    future:     rrdtool update my.rrd 948669309:1200
>
> This minimizes the unknown data.
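
To put numbers on "minimizes", here is a small arithmetic sketch for
the example above.  It assumes the current time is 948669019 and that
sysUptime is in hundredths of a second; the variable names are mine.

```perl
use strict;
use warnings;

my $previous  = 948668705;   # last good update
my $current   = 948669019;   # first poll after the reset
my $sysuptime = 10000;       # timeticks reported at $current

# Simple variant: everything since the previous update is unknown.
my $unknown_simple = $current - $previous;

# Estimated variant: unknown only up to the estimated reset time.
my $reset_time        = $current - int($sysuptime / 100);
my $unknown_estimated = $reset_time - $previous;

print "estimated reset time: $reset_time\n";
print "unknown seconds, simple variant:    $unknown_simple\n";
print "unknown seconds, estimated variant: $unknown_estimated\n";
```

The estimated variant recovers the 100 seconds between the reset and
the current poll.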
>
> regards,
> --
>    __________________________________________________________________
>  / alex at slot.hollandcasino.nl                  alex at ergens.op.het.net \
> | work                                                         private |
> | My employer is capable of speaking therefore I speak only for myself |
> +---------------------------------------------------------------------+
>
> --
> Unsubscribe mailto:rrd-developers-request at list.ee.ethz.ch?subject=unsubscribe
> Help        mailto:rrd-developers-request at list.ee.ethz.ch?subject=help
> Archive     http://www.ee.ethz.ch/~slist/rrd-developers

--
Brett Sauer                     Network Management Engineer
cell/pgr:(612)889-6397          Central Operations
phone:(512)252-8590             WAM!NET Inc.
vmail:(651)256-5467             14437 Robert I. Walker Blvd
brett at wamnet.com                Austin, TX 78728



