[rrd-users] Calculating AVG, treating UNKNOWNs as zero

Alex van den Bogaerdt alex at vandenbogaerdt.nl
Thu Aug 2 16:13:53 CEST 2012

----- Original Message ----- 
From: "Derek Haynes" <derek.haynes at highgroove.com>
To: "Alex van den Bogaerdt" <alex at vandenbogaerdt.nl>
Cc: <rrd-users at lists.oetiker.ch>
Sent: Wednesday, August 01, 2012 10:54 PM
Subject: Re: [rrd-users] Calculating AVG, treating UNKNOWNs as zero

> Hi Simon/Alex,
>> use a CDEF like b=a,UN,0,a,IF
> I don't believe this works as I'm fetching from a larger step size (30
> minutes) than the RRA (1-minute) and the CDEF performs the RPN logic
> on the DEF at the 30-minute step. At this point, the average already
> is ignoring the unknowns.

That is also what I thought, but I think this is exactly why you want
equal-sized buckets of data, so updating at time t-1 is wrong (for you)
regardless of which value you put in.  When averaging, you want the 12 in
{10,11,12} to carry the same weight as the 12 in {0,0,12}.
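
To make the weighting point concrete, here is a small Python sketch. It only
mimics the arithmetic of averaging over equal-sized slots; it is not how
rrdtool consolidates internally, and the function names are made up for the
illustration. Unknowns are represented as NaN.

```python
import math

UNKN = math.nan  # stand-in for an rrdtool UNKNOWN sample (assumption)


def average_ignoring_unknowns(samples):
    """Average only the known samples, so unknown slots carry no weight."""
    known = [s for s in samples if not math.isnan(s)]
    return sum(known) / len(known) if known else math.nan


def average_unknowns_as_zero(samples):
    """Average over every slot, counting each unknown slot as zero."""
    if not samples:
        return math.nan
    return sum(0.0 if math.isnan(s) else s for s in samples) / len(samples)


full = [10.0, 11.0, 12.0]    # three one-minute slots, all updated
sparse = [UNKN, UNKN, 12.0]  # only the last slot was updated

print(average_ignoring_unknowns(full), average_ignoring_unknowns(sparse))
print(average_unknowns_as_zero(full), average_unknowns_as_zero(sparse))
```

With unknowns ignored, the sparse bucket averages to 12, as if the single
sample had lasted the whole period. With every slot counted, it averages to
4, which is the equal-weight behaviour described above.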

>> Details may vary, depending on the information you left out, but IMHO 
>> this
>> is the path to your solution
> Thanks - based on the high-level summary I outlined, your solution
> works. However, there's a performance reason why it won't for our
> setup:
> * Before updating a file, I don't know when the last update occurred.

But do you at least know it will have happened a whole number of minutes 

> We're updating many files at once - for performance, we want to avoid
> reading the lastupdate of each RRD file before updating to determine
> if we need to precede the update with time-60:0. If a file received an
> update in the previous minute, updating again with zero will result in
> an error.

And is that a big problem?  I never measured it. I would guess there is a
small performance penalty because rrdtool will output an error message for
you to ignore, but maybe I'm wrong and it does take a lot of resources (on
a large-scale setup).

> * At most, we update these files once per minute. Updating with an
> UNKNOWN one second prior is always safe.

But at the same time it is always what you do not want. You do not want to
update at T-1, and you want to avoid unknowns.  I really believe you should
step back and look at your real problem. What you are trying to do now is
solve another problem, which you introduced yourself as a result of a wrong
fix. Forget about the unknowns; inserting them is not the solution to your
problem. Do not insert them, and you do not need to fix this problem.

Will it happen often that there was an update in the previous minute? If
not, you can probably ignore the extra cycles needed to output an error
message.  Just try updating the previous minute with zero. If that fails a
few times, you probably do not need to care. If it does happen a lot, maybe
you need to find a way for your application to remember whether it updated
the database or not.
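
One way the "remember whether it updated" idea could look is sketched below
in Python. Everything here is an assumption made for illustration: the
function name, the in-memory dictionary, and the premise that updates land
on whole minutes. It only builds the timestamp:value arguments that would
be handed to "rrdtool update <file> ..."; it does not call rrdtool itself.

```python
STEP = 60  # seconds between updates (assumed one-minute step)


def plan_updates(last_seen, path, now, value, step=STEP):
    """Return the timestamp:value strings to pass to rrdtool update.

    last_seen is a plain dict {path: timestamp of our last update}.
    If we did not update the previous minute ourselves, a zero is
    inserted for that minute before the real value.
    """
    previous = now - step
    updates = []
    if last_seen.get(path, 0) < previous:
        # No record of an update in the previous minute: fill it with 0.
        updates.append(f"{previous}:0")
    updates.append(f"{now}:{value}")
    last_seen[path] = now  # remember, so lastupdate need not be read
    return updates


last_seen = {}
print(plan_updates(last_seen, "a.rrd", 1200, 12))  # first run: zero fill
print(plan_updates(last_seen, "a.rrd", 1260, 7))   # consecutive minute
print(plan_updates(last_seen, "a.rrd", 1440, 5))   # after a gap
```

After a restart the dictionary is empty, so the first zero fill for a file
may be rejected by rrdtool if that minute was in fact already updated. That
is the occasional error message discussed above, and it can simply be
ignored.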

