[rrd-developers] Update ex post?
Sebastian Harl
sh at tokkee.org
Tue Aug 24 12:44:56 CEST 2010
Hi,
On Mon, Aug 23, 2010 at 05:28:03PM -0700, Thorsten von Eicken wrote:
> On 8/23/2010 2:13 PM, Sebastian Harl wrote:
> > So, I don't see any reasonable way to solve that with the current
> > architecture (and I don't see any way how this should be improved).
> I believe there is disagreement about what the problem is. You define
> the problem as "insert values in the past in a manner that is
> indistinguishable from having inserted them at the correct moment in the
> first place". Your analysis is correct for that problem.
>
> In real life, the problem tends to be a different one, which is "insert
> values in the past in a manner that improves on the current form of the
> data, i.e., that makes the resulting RRD more useful than without these
> insertions." Put differently, I prefer some inaccuracy in the data over
> having data gaps or garbage data in my RRD.
Yep, that's what I realized after writing most of my E-mail, which is
why I then added "well, unless you accept rather vague approximations"
;-)
Anyway, in that case, the problem is how to (automatically) detect *how*
to make the RRD more useful. However, that responsibility could be given
to the user: RRDtool could provide a few different mechanisms for solving
it and the user gets to choose which one to use. Something like the
following comes to mind off the top of my head:
* Use the new value for all CDPs affected by the change. I assume that
  this would be the most commonly used approach, and it should be a
  reasonable way to fix spikes in the data/graph or to remove
  undefined CDPs.
* Use the average (min, max, $some_more_complex_function, ...) of the
  CDPs before and after the affected one and ignore the specified
  new value (or use the new value as a parameter to the function as
  well). This might also be a reasonable approach to fix spikes and
  undefined values but can also be used for more powerful stuff.
  And while we're at it, we could also think about using more than two
  surrounding CDPs to calculate the updated CDP and, e.g., do some kind
  of function fitting. I don't think there's any actual need for that
  but it would be a cool feature and possibly fun to implement ;-)
* Assume that the (possibly preprocessed) value of the CDP is the value
  of all PDPs that were used to create the CDP and then apply something
  like the "replace" function I was talking about in my other E-mail to
  update the CDP. I'm not entirely sure there would be a use-case for
  that, but it shouldn't be too hard to implement ;-)
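To make the three options a bit more concrete, here is a rough Python
sketch. None of this is RRDtool code or API: the function names, the
plain list of floats standing in for one RRA row, and the AVERAGE
consolidation used for the third option are all assumptions made up for
illustration. In particular, "replace" is modelled as "swap a single PDP
and re-consolidate", which is only one possible reading.

```python
import math
from typing import Callable, Iterable, List, Sequence


def mean(values: Sequence[float]) -> float:
    return sum(values) / len(values)


def overwrite_cdps(cdps: Sequence[float], affected: Iterable[int],
                   new_value: float) -> List[float]:
    """Option 1: store new_value in every CDP touched by the update."""
    out = list(cdps)
    for i in affected:
        out[i] = new_value
    return out


def from_neighbours(cdps: Sequence[float], index: int,
                    combine: Callable[[Sequence[float]], float] = mean
                    ) -> List[float]:
    """Option 2: rebuild cdps[index] from the CDPs before and after it.

    Undefined (NaN) or missing neighbours are skipped; if no neighbour
    is usable, the row is returned unchanged.
    """
    neighbours = [cdps[j] for j in (index - 1, index + 1)
                  if 0 <= j < len(cdps) and not math.isnan(cdps[j])]
    out = list(cdps)
    if neighbours:
        out[index] = combine(neighbours)
    return out


def replace_one_pdp(cdp: float, pdps_per_cdp: int, new_pdp: float) -> float:
    """Option 3: pretend every PDP equalled the CDP, swap one PDP for
    new_pdp and consolidate again (AVERAGE is assumed here)."""
    pdps = [cdp] * pdps_per_cdp
    pdps[0] = new_pdp
    return mean(pdps)


# Hypothetical row with one spike (index 2) and one undefined CDP (index 4).
row = [10.0, 11.0, 950.0, 12.0, float("nan")]

fixed_spike = overwrite_cdps(row, [2], 11.5)   # index 2 becomes 11.5
averaged = from_neighbours(row, 2)             # index 2 becomes 11.5
maxed = from_neighbours(row, 2, combine=max)   # index 2 becomes 12.0
filled = from_neighbours(row, 4)               # index 4 becomes 12.0
partial = replace_one_pdp(12.0, 4, 4.0)        # (4 + 3 * 12) / 4 = 10.0
```

Passing the function in as a parameter (combine) is what would let the
user pick average, min, max or something more complex. Using more than
two surrounding CDPs would only mean widening the neighbour window.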
What other real-world use-cases, other than removing spikes or undefined
values, are out there? Those are the only ones I've encountered so far.
> What many of us have built are tools that get at the second problem. It
> would be great if the RRD maintainers could accept some of these tools
> in the spirit of "sometimes the perfect is the enemy of the good".
So, what's the approach those tools are using (I've never looked at any
of those [nor have I looked *for* any ;-)])?
Cheers,
Sebastian
--
Sebastian "tokkee" Harl +++ GnuPG-ID: 0x8501C7FC +++ http://tokkee.org/
Those who would give up Essential Liberty to purchase a little Temporary
Safety, deserve neither Liberty nor Safety. -- Benjamin Franklin