[rrd-users] Disabling Last Update
kubicaryan at yahoo.com
Thu Nov 8 08:52:14 CET 2012
The short answer is No.
RRD will not accept out-of-order data, nor should it. The values stored in each RRA are computed by its consolidation function (CF) from the updates that fall within an interval, and they are immutable once the next interval is written to.
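To make the rule concrete, here is a minimal sketch of a client-side check that mirrors it: an update is only worth sending if its timestamp is strictly newer than the last one accepted for that file. The class and method names are made up for illustration and are not part of rrdtool.

```python
from typing import Dict


class LastUpdateGuard:
    """Client-side mirror of the 'strictly newer than last update' rule.

    Hypothetical helper: it only tracks timestamps in memory and never
    touches an rrd file.
    """

    def __init__(self) -> None:
        self._last: Dict[str, int] = {}  # rrd path -> last accepted timestamp

    def accept(self, rrd_path: str, timestamp: int) -> bool:
        """Return True and record the timestamp if it is newer than the last one."""
        last = self._last.get(rrd_path)
        if last is not None and timestamp <= last:
            return False  # out of order (or duplicate): would be rejected
        self._last[rrd_path] = timestamp
        return True
```

Anything for which accept() returns False is a sample that arrived too late and has to be handled some other way (or dropped).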
Commit order should always be governed by an SLA. If you have 20-second interval batches and wish to commit that data as soon as possible, you can 'back off' in your queue for a specific period of time (the SLA), which permits some latent data to arrive out of order and be sorted before it is committed. Generally, out-of-order issues don't occur even at high commit rates (1s) as long as the producer of the data remains the same (i.e. it submits serially, so it can't itself submit out of order).
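A rough sketch of that back-off idea, assuming samples are (timestamp, series, value) tuples and the SLA is a number of seconds. BackoffBuffer and its methods are hypothetical names; the actual commit to the rrd is left out.

```python
import heapq
from typing import Iterator, List, Tuple

Sample = Tuple[int, str, float]  # (timestamp, series name, value)


class BackoffBuffer:
    """Hold samples for `sla_seconds`, then release them in timestamp order."""

    def __init__(self, sla_seconds: int) -> None:
        self.sla_seconds = sla_seconds
        self._heap: List[Sample] = []  # min-heap keyed on timestamp

    def add(self, timestamp: int, series: str, value: float) -> None:
        """Queue a sample; arrival order does not matter."""
        heapq.heappush(self._heap, (timestamp, series, value))

    def release(self, now: int) -> Iterator[Sample]:
        """Yield every sample at least `sla_seconds` old, oldest first."""
        cutoff = now - self.sla_seconds
        while self._heap and self._heap[0][0] <= cutoff:
            yield heapq.heappop(self._heap)


if __name__ == "__main__":
    buf = BackoffBuffer(sla_seconds=60)
    for ts in (1040, 1000, 1020):  # arrives out of order
        buf.add(ts, "cpu", float(ts))
    for sample in buf.release(now=1085):  # cutoff is 1025
        print(sample)
```

A sample that shows up after its neighbours have already been released is still too late; the SLA only decides how much lateness you are willing to wait for.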
Your backlog during a failure? Well... either:
a) design around that failure scenario with a persistent fault-tolerant queue
b) design the queue so that a given series of data only goes through a single server (i.e. if that server goes down, all data destined for it is backlogged, thus no out-of-order issue)
c) accept the loss; after all you had an outage
d) only store one datasource per rrd file; two or more DSs per file run the risk of coordination problems and severely hinder (a) and (b)
e) all of the above
f) if you really must have a database (datastore) that accepts out-of-order commits, don't use rrdtool
g) you can rewrite the rrd: xport the data, calculate what it should be, then build a new rrd (I don't suggest this, but it's doable)
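For (b), one way to pin every series to a single server is a stable hash of the series name. This is only a sketch under the assumption that the list of workers is fixed; pick_worker and the worker names are invented for the example.

```python
import zlib
from typing import List


def pick_worker(series: str, workers: List[str]) -> str:
    """Map a series name to exactly one worker, stably across processes.

    zlib.crc32 is used instead of the built-in hash() because hash() of a
    str is randomized per interpreter run.
    """
    if not workers:
        raise ValueError("need at least one worker")
    index = zlib.crc32(series.encode("utf-8")) % len(workers)
    return workers[index]


if __name__ == "__main__":
    workers = ["queue-a", "queue-b", "queue-c"]  # hypothetical queue names
    for name in ("router1.rrd", "router2.rrd", "switch9.rrd"):
        print(name, "->", pick_worker(name, workers))
```

Note that adding or removing a worker changes the mapping for many series, so the worker list should only change when the queues are drained.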
From: Wesley Wyche <wesley.wyche at verizon.com>
To: rrd-users at lists.oetiker.ch
Sent: Wednesday, November 7, 2012 9:39 PM
Subject: [rrd-users] Disabling Last Update
Is there a way to disable the last update value (or at least override it
somehow)? I need to insert/update data in a NON-sequential manner based
upon MY time value, not a step from last update.
The scenario is this:
I have data coming into a landing area in batches. Those batches are data
updates for thousands of rrds and they come in several per minute. However,
there may be times where there is a backlog of batches that need to be
processed offline perhaps due to a server outage or processing requirements.
Even if I'm processing data that is "current", there are still multiple
batch files for that single minute of time. We could be processing the data
out of order within that single minute because the sample rate is every 20
seconds and I could update the rrd in the wrong order (simply because that
minute's batches are processed in alphanumeric order by filename).
It would be awesome if there was a way I didn't have to simply chuck hours
of data batches away if I have a processing server go down and I can't keep
up with the volume. At some point the server problem would be fixed,
and then it would start processing the backlog of data and
inserting it into the RRDs in the PAST, where it should have been inserted
and belongs, instead of the NOW.