[rrd-users] Input values normalization
Alex van den Bogaerdt
alex at vandenbogaerdt.nl
Tue Feb 18 18:44:14 CET 2014
> But the accuracy will still only be as good as the collection frequency
> (and its relation to the rate of change of the measured value). If the
> measured value can (for example) rise sharply and drop back again in
> between samples then even the max function won't tell you anything about
> it.
Although your point is valid, it is (IMHO) irrelevant to this particular
question. In fact, the higher the sampling resolution, the smaller the
problem of missing such spikes.
In the case at hand the OP wants to keep his discrete values. AVERAGE won't
do, so MIN, MAX or LAST remain, and I would choose to use both MIN and
MAX. Maybe the OP will choose differently.
With proper heartbeat settings there is no need to take a sample every
second either. It could remain at every 5 minutes or so.
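To make the heartbeat remark concrete, here is a minimal sketch in Python. It
only formats a DS argument in the DS:name:type:heartbeat:min:max form that
"rrdtool create" takes, and checks an update interval against the heartbeat.
The DS name "state", the GAUGE type and the heartbeat of 600 seconds are my
assumptions for illustration, not values from this thread.

```python
def ds_definition(name, heartbeat, ds_type="GAUGE", minimum="U", maximum="U"):
    """Format a DS argument as DS:name:type:heartbeat:min:max."""
    return f"DS:{name}:{ds_type}:{heartbeat}:{minimum}:{maximum}"


def within_heartbeat(update_interval, heartbeat):
    """True when the gap between two updates does not exceed the heartbeat."""
    return update_interval <= heartbeat


# Assumed value: twice the 5-minute update interval mentioned above.
HEARTBEAT = 600

print(ds_definition("state", HEARTBEAT))
print(within_heartbeat(300, HEARTBEAT))   # updates every 5 minutes
print(within_heartbeat(900, HEARTBEAT))   # a 15-minute gap
```

With a heartbeat like this, an update every 300 seconds stays within the
limit even though the RRD itself uses '--step 1'.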
Anyway, the main point of my answer was to have the 'best' (highest
resolution) RRA with 'steps' larger than 1. I'll elaborate.
> The solution I found is to set a data source step to 1 second to avoid
> normalization, but this produces big rrd files with a lot of redundant
> information.
>
> I did not find a satisfactory solution up to now, thanks for any hint.
Something has to give. If keeping the database at its current size is a
must, then RRDtool somehow needs to combine several of its input values into
one. Since averaging them is undesirable, choose one or more of MIN, MAX and
LAST.
Instead of each RRA row covering 1 step of 300 seconds in an RRD with
'step==300', the same amount of time is stored (and thus the database is no
bigger) when each RRA row covers 300 steps of 1 second in an RRD with
'step==1'.
Let's assume the original database was like this:
database created with "--step 300" (could be left out, as 300 is the default)
RRA:AVERAGE:0.5:1:1200 (100 hours: 300 seconds per row, 1200 rows)
Now when creating the database with "--step 1", without increasing its size:
do not specify RRA:AVERAGE:0.5:1:360000 (100 hours: 1 second per row, 360000 rows)
but instead specify RRA:AVERAGE:0.5:300:1200 (100 hours: 300 seconds per row
again, 1200 rows)
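The arithmetic behind these three layouts can be checked with a small Python
sketch. The function name rra_coverage is hypothetical; the numbers are the
ones from the RRA definitions above.

```python
def rra_coverage(step_seconds, steps_per_row, rows):
    """Return (seconds per row, total seconds covered) for one RRA."""
    seconds_per_row = step_seconds * steps_per_row
    return seconds_per_row, seconds_per_row * rows


# (RRD step, RRA steps, RRA rows) for the three layouts discussed above.
layouts = {
    "step 300, RRA steps=1 rows=1200": (300, 1, 1200),
    "step 1, RRA steps=1 rows=360000": (1, 1, 360000),
    "step 1, RRA steps=300 rows=1200": (1, 300, 1200),
}

for label, (step, steps, rows) in layouts.items():
    per_row, total = rra_coverage(step, steps, rows)
    print(f"{label}: {per_row} s/row, {rows} rows, {total / 3600:.0f} hours")
```

All three cover 100 hours, but only the second one needs 360000 rows. The
first and third store the same number of rows.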
This would still store fractions in the database, because AVERAGE is still
in use, so the next step is to change the consolidation function as suggested
before (MIN, MAX or LAST). As long as RRDtool is fed with integer timestamps
and has '--step 1', normalization will be a no-op and the input is left
untouched in this phase. Then during consolidation the integer values are
kept, which I believe was the goal.
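To illustrate why the choice of consolidation function matters, here is a
simplified model in Python. It is not RRDtool's actual code: it ignores xff
and unknown values, and it uses 4 steps per row instead of 300 to keep the
example short. The sample values are made up.

```python
from statistics import mean


def consolidate(pdps, steps, cf):
    """Combine every `steps` primary data points into one row using `cf`."""
    funcs = {
        "AVERAGE": mean,
        "MIN": min,
        "MAX": max,
        "LAST": lambda values: values[-1],
    }
    # Only complete groups of `steps` values produce a row.
    return [
        funcs[cf](pdps[i:i + steps])
        for i in range(0, len(pdps) - steps + 1, steps)
    ]


# Hypothetical discrete state, one integer value per second.
pdps = [0, 0, 1, 1, 1, 3, 3, 0, 0, 2, 2, 2]

for cf in ("AVERAGE", "MIN", "MAX", "LAST"):
    print(cf, consolidate(pdps, 4, cf))
```

In this model AVERAGE yields fractions such as 0.5 and 1.75, while MIN, MAX
and LAST only ever return values that were actually fed in.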
HTH
Alex