[rrd-users] invalid unknowns?

Aragon Gouveia aragon at phat.za.net
Fri Apr 20 06:08:26 CEST 2012


*bump*

I haven't been able to solve this, and frankly it looks like a bug.
Does anyone have anything to suggest?



On 04/02/12 12:50, Aragon Gouveia wrote:
> Hi,
>
> I'm collecting data every 120 seconds and storing it in an RRD.  My
> heartbeats are set to 240 seconds.  After a few hours, my AVERAGE RRA
> starts returning unknowns for a time period for which it had previously
> returned valid data.  My other RRAs (MAX, LAST, etc.) return valid data
> for the same time period.
>
> I first noticed this as gaps in my graphs and started monitoring them
> regularly, suspecting that my data collection scripts were returning no
> data during those time periods.  Well, I've been monitoring them, and
> the gap I see right now for a 2-hour period 16 hours ago did not exist
> until a few hours ago!  It's as if the RRD file becomes corrupt after a
> while, and old data that was previously valid starts coming out as
> unknown.
>
> Since noticing this, I've been logging my rrdupdate commands to a text
> file to see what data enters the RRD and when.  The entries are all
> valid, and their timestamps never deviate from the 120-second interval
> by more than a second.
>
> Here's some rrdfetch output:
>
> $ rrdtool fetch health.rrd AVERAGE -s -52000 -e -50000
> 1333311120: nan nan nan nan
> 1333311240: nan nan nan nan
> 1333311360: nan nan nan nan
> 1333311480: nan nan nan nan
> 1333311600: nan nan nan nan
> 1333311720: nan nan nan nan
> 1333311840: nan nan nan nan
> 1333311960: nan nan nan nan
> 1333312080: nan nan 3.0034230333e+00 6.0000000000e+00
> 1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
> 1333312320: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
> 1333312440: 1.3400000000e+01 2.7000000000e+01 3.0045041083e+00 6.0000000000e+00
> 1333312560: 1.3400000000e+01 2.7000000000e+01 3.9936468250e+00 6.0000000000e+00
> 1333312680: 1.3400000000e+01 2.7000000000e+01 4.9911438667e+00 6.0000000000e+00
> 1333312800: 1.3400000000e+01 2.7000000000e+01 5.0000000000e+00 6.0000000000e+00
> 1333312920: 1.3400000000e+01 2.7000000000e+01 3.0124310667e+00 6.0000000000e+00
> 1333313040: 1.3400000000e+01 2.7000000000e+01 3.0000000000e+00 6.0000000000e+00
> 1333313160: 1.3400000000e+01 2.7000000000e+01 3.9892649250e+00 6.0000000000e+00
>
> $ rrdtool fetch health.rrd LAST -s -52000 -e -50000
> 1333311120: 1.3400000000e+01 2.7000000000e+01 3.0101470417e+00 6.0000000000e+00
> 1333311240: 1.3400000000e+01 2.7000000000e+01 3.9927562500e+00 6.0000000000e+00
> 1333311360: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
> 1333311480: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
> 1333311600: 1.3400000000e+01 2.7000000000e+01 3.0112068833e+00 6.0000000000e+00
> 1333311720: 1.3400000000e+01 2.7000000000e+01 4.9888617833e+00 6.0000000000e+00
> 1333311840: 1.3499455991e+01 2.7000000000e+01 4.0054400917e+00 6.0000000000e+00
> 1333311960: 1.3400839001e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
> 1333312080: 1.3400000000e+01 2.7000000000e+01 3.0034230333e+00 6.0000000000e+00
> 1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
> 1333312320: 1.3400000000e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
> 1333312440: 1.3400000000e+01 2.7000000000e+01 3.0045041083e+00 6.0000000000e+00
> 1333312560: 1.3400000000e+01 2.7000000000e+01 3.9936468250e+00 6.0000000000e+00
> 1333312680: 1.3400000000e+01 2.7000000000e+01 4.9911438667e+00 6.0000000000e+00
> 1333312800: 1.3400000000e+01 2.7000000000e+01 5.0000000000e+00 6.0000000000e+00
> 1333312920: 1.3400000000e+01 2.7000000000e+01 3.0124310667e+00 6.0000000000e+00
> 1333313040: 1.3400000000e+01 2.7000000000e+01 3.0000000000e+00 6.0000000000e+00
> 1333313160: 1.3400000000e+01 2.7000000000e+01 3.9892649250e+00 6.0000000000e+00
>
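
To pin down exactly which rows disagree between the two fetches, a small
comparison script may help.  This is only a minimal sketch in Python: the
function names are made up for illustration, and the sample rows are a
short excerpt copied from the two fetch outputs above.  It lists the
timestamps where AVERAGE reports at least one unknown while LAST reports
valid data for every data source.

```python
import math

# Excerpt of the AVERAGE fetch output quoted above.
AVERAGE_OUT = """
1333311840: nan nan nan nan
1333311960: nan nan nan nan
1333312080: nan nan 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
"""

# Excerpt of the LAST fetch output quoted above, same timestamps.
LAST_OUT = """
1333311840: 1.3499455991e+01 2.7000000000e+01 4.0054400917e+00 6.0000000000e+00
1333311960: 1.3400839001e+01 2.7000000000e+01 4.0000000000e+00 6.0000000000e+00
1333312080: 1.3400000000e+01 2.7000000000e+01 3.0034230333e+00 6.0000000000e+00
1333312200: 1.3400000000e+01 2.7000000000e+01 3.9933606000e+00 6.0000000000e+00
"""


def parse_fetch(text):
    """Map timestamp -> list of floats for each 'timestamp: v1 v2 ...' row."""
    rows = {}
    for line in text.splitlines():
        stamp, sep, values = line.partition(":")
        # Skip blank lines and anything that is not a data row.
        if not sep or not stamp.strip().isdigit():
            continue
        rows[int(stamp)] = [float(v) for v in values.split()]
    return rows


def suspect_rows(average, last):
    """Timestamps where AVERAGE has an unknown but LAST is fully known."""
    suspects = []
    for stamp in sorted(average):
        if stamp not in last:
            continue
        avg_unknown = any(math.isnan(v) for v in average[stamp])
        last_known = not any(math.isnan(v) for v in last[stamp])
        if avg_unknown and last_known:
            suspects.append(stamp)
    return suspects


for stamp in suspect_rows(parse_fetch(AVERAGE_OUT), parse_fetch(LAST_OUT)):
    print(stamp)
```

Running the same comparison on full fetch output at regular intervals
would show whether the range of affected timestamps grows over time.
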
> And here's some of my rrdupdate log:
>
> 1333311120 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:3:6
> 1333311241 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
> 1333311360 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
> 1333311481 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
> 1333311600 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:3:6
> 1333311720 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:5:6
> 1333311841 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.5:27.0:4:6
> 1333311960 rrdtool update health.rrd -t voltage:temperature:cpu:memory N:13.4:27.0:4:6
>
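
The interval claim can be checked mechanically against the log.  The
following is a minimal sketch in Python, assuming the first field of each
log line is the epoch time at which the update ran; the timestamps are
copied from the log excerpt above, and the helper names are made up for
illustration.  It computes the gap between consecutive updates and
reports any gap longer than the 240-second heartbeat, which is the point
at which rrdtool would treat the interval as unknown.

```python
STEP = 120        # collection interval in seconds, as stated above
HEARTBEAT = 240   # heartbeat in seconds, as stated above

# Epoch timestamps from the first field of the quoted rrdupdate log.
UPDATE_TIMES = [
    1333311120, 1333311241, 1333311360, 1333311481,
    1333311600, 1333311720, 1333311841, 1333311960,
]


def gaps(times):
    """Seconds between each pair of consecutive updates."""
    return [later - earlier for earlier, later in zip(times, times[1:])]


def late_updates(times, heartbeat):
    """Pairs of consecutive updates that are more than one heartbeat apart."""
    return [
        (earlier, later)
        for earlier, later in zip(times, times[1:])
        if later - earlier > heartbeat
    ]


update_gaps = gaps(UPDATE_TIMES)
print("gaps:", update_gaps)
print("largest deviation from step:", max(abs(g - STEP) for g in update_gaps))
print("gaps exceeding heartbeat:", late_updates(UPDATE_TIMES, HEARTBEAT))
```

For this excerpt no gap comes close to the heartbeat, which is
consistent with the updates themselves not being the source of the
unknowns.
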
> I'm no stranger to RRDtool, but this has stumped me.  Any ideas?
>
> rrdtool 1.2.30
> FreeBSD 8.2-RELEASE amd64
>
>
> Thanks,
> Aragon


