[rrd-users] Parameter problem - was: LAST always returning zero

Wes wespvp at msg.bt.com
Sun Mar 9 19:16:02 CET 2008


On 3/9/08 7:23 AM, "Alex van den Bogaerdt" <alex at ergens.op.het.net> wrote:

>> I did the above.  To verify functionality (force an undefined value), I set
>> 
>>     --end <rrdtool last> / 10 * 10 + 20
>> 
>> The graph label ($end) shows *250 and the GPRINT LAST shows *230, there is a
>> red bar in the graph, and the GPRINT enabled/disabled is showing 0, all
>> exactly as expected.
> 
> You are displaying the opposite of queue1, meaning you display
> 1 when the line is disabled.
> 
> If you get 0, you successfully prevented NaN from resulting in 1.

Yes, agreed.  The above example (with + 20) acted exactly as expected in all
areas.
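
For reference, the three --end variants compared in this thread differ only
in integer arithmetic.  A minimal sketch, assuming a 10-second step and a
made-up "rrdtool last" value (align_end is a hypothetical helper, not part
of rrdtool):

```shell
# align_end LAST STEP OFFSET
# Round LAST down to a multiple of STEP, then add OFFSET (integer math).
align_end() {
    echo $(( $1 / $2 * $2 + $3 ))
}

last=1205086234              # hypothetical output of 'rrdtool last'
align_end "$last" 10 20      # the /10*10+20 variant
align_end "$last" 10 0       # the /10*10 variant
align_end "$last" 10 -2      # the /10*10-2 variant
```

With a last update ending in 234, these give values ending in 250, 230 and
228, which matches the *250 graph label and *230 LAST seen in the first case.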

>> Both the graph label and the GPRINT LAST show *340.  There is *no* red bar
>> in the graph.  The queue GPRINT enabled/disabled is showing 0, when it
>> should be showing 1.  I assume this is because when --end is 340, I actually
>> get through 350, per my previous email.  But then why don't I have a red
>> bar?
> 
> Why would you expect 1 ?

For example 2 (/10*10), based on rrdfetch, you are correct - I would expect
0 rather than 1, but then I would also expect a red bar. However, based on
your original recommendations of (/10*10), there seems to be a problem with
that case and I should expect a 1.

In this second example (/10*10), based on rrdfetch, if I ask for *340, the
last two rows are *340 and *350 - I don't understand why that is - this
seems like a bug to me, especially based on your original recommendation of
using /10*10.

Based on rrdfetch, for the /10*10 example, I would expect to have it print 0
(the wrong value) because the *350 row is all nan's.  I would also expect a
red bar.  I don't have your original emails with me - I seem to remember you
saying something about rrdgraph ignoring the last data point, which would
explain this?

In example 3 of the previous email (/10*10-2) I would expect 1 because there
is no nan data. 

I added some more debug code.  In addition to your suggestions, for each
queue I am now GPRINTing

    1. The desired inverted value
    2. A straight 'LAST' with no IF
    3. A CDEF with an IF that, instead of normalizing nan to 1 and inverting,
prints -1 for nan, or the actual value.  I then add a VDEF LAST for this
CDEF (GPRINT complains if I use the CDEF directly).
    4. Timestamp of item 2.
    5. Timestamp of item 3.
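
As an illustration of item 3 and its timestamp (a sketch only; the actual
graph code is not shown in this thread): the file and data source names are
borrowed from the example quoted further down, and the queue1D / queue1DL
names are made up here.

```shell
ds=queue1    # data source prefix, as in the quoted example

# Build the extra graph arguments as positional parameters so that the
# spaces inside the GPRINT formats survive quoting.
set -- \
    "DEF:${ds}=queues:${ds}_enabled:AVERAGE" \
    "CDEF:${ds}D=${ds},UN,-1,${ds},IF" \
    "VDEF:${ds}DL=${ds}D,LAST" \
    "GPRINT:${ds}DL:Debug last\: %1.0lf" \
    "GPRINT:${ds}DL:timestamp %s\l:strftime"

# Show the arguments; to draw the graph, pass "$@" to 'rrdtool graph'.
printf '%s\n' "$@"
```

The RPN reads: if the value is unknown (UN), use -1, otherwise pass the value
through unchanged, so a nan row stays distinguishable from a real 0.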

Results are currently as expected, based on the rrdfetch testing, for a
disabled queue.  The desired value is 1 for column 1.  The columns are in
the order listed above.

  Disabled queue:
                   1    2    3    4    5
    /10*10+20:     0    0   -1  *30  *50   red bar
    /10*10:        0    0   -1  *30  *40   no red bar (no red bar expected?)
    /10*10-2:      1    0    0  *30  *30   no red bar

  Queue never referenced (all data is nan, red bar removed from code):
                   1    2    3    4    5
    /10*10+20:     0  nan   -1    0  *50
    /10*10:        0  nan   -1    0  *40
    /10*10-2:      0  nan   -1    0  *40

So I see the following potential problems:

1. /10*10 is returning an extra row of all nan's.

2. On Friday, after I added the "-2", I was still getting random 0's
instead of 1's.  I even tried with -20.  I have not been able to reproduce
that all weekend.  I'm starting to wonder whether something on the browser
side messed up my results after I changed /10*10 to /10*10-2: being pointed
at the wrong system/port, caching (though that wouldn't explain it toggling
back and forth), etc.

3. A CDEF with an IF will change a LAST to be the last row purely based on
--end, not the last row of data stored. This really shows up as a problem in
GPRINT when using "--end rrdtool last" or "--end rrdtool last/10*10".  Based
on your example below, this is not correct behavior?
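
One way to check point 1 without going through rrdgraph would be to scan
rrdfetch output for the last row that holds a real value.  A rough sketch,
assuming the usual "timestamp: value ..." row layout; last_known_row is a
hypothetical helper, and the sample rows are invented to mirror the
*340/*350 case:

```shell
# Print the timestamp of the last fetch row with at least one non-nan value.
# Reads 'rrdtool fetch' style output on stdin.
last_known_row() {
    awk '
        /^[0-9]+:/ {
            for (i = 2; i <= NF; i++)
                if ($i !~ /nan/) { ts = $1; sub(/:$/, "", ts); break }
        }
        END { if (ts != "") print ts }
    '
}

# Invented sample in which the final row is all nan.
last_known_row <<'EOF'
                 queue1_enabled

1205086330: 1.0000000000e+00
1205086340: 1.0000000000e+00
1205086350: nan
EOF
```

Against a real file this would be fed from something like
"rrdtool fetch queues AVERAGE -s $START -e $END | last_known_row", and the
result compared with the timestamps that GPRINT reports.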

I'll keep watching it for anomalies here, and check it out on my other
monitor tomorrow.
 
> This said, in the process I did find some weird behaviour.  I checked
> it against a much older version (1.2.10) which does not have this problem.
> 
> rrdtool graph disabled.png -a PNG -l 0 \
>         -t "Disabled queues - $END" \
>         DEF:queue1=queues:queue1_enabled:AVERAGE \
>         CDEF:queue1N=queue1,0,GT,0,1,IF \
>         VDEF:queue1L1=queue1,LAST \
>         VDEF:queue1L2=queue1N,LAST \
>         AREA:queue1N#00FF00:"Queue 1 down\l" \
>         GPRINT:queue1L1:"Last is_up  \: %1.0lf" \
>         GPRINT:queue1L1:"timestamp %s\l":strftime       \
>         GPRINT:queue1L2:"Last is_down\: %1.0lf" \
>         GPRINT:queue1L2:"timestamp %s\l":strftime       \
>         --upper-limit 4 -s $START -e $END
> 
> All input is 1.
> 
> This displays different timestamps for queue1L1 and queue1L2, which
> really should not happen.  There's something with that CDEF...

Is this because of the extra row that is returned if you ask for a multiple
of 10 - if you ask for *30, you get *40 - which is all nan's?

Wes



