[rrd-users] Re: RRD eating values

Marc Powell marc at ena.com
Fri Oct 8 16:43:19 MEST 2004


----Original Message----
From: rrd-users-bounce at list.ee.ethz.ch
[mailto:rrd-users-bounce at list.ee.ethz.ch] On Behalf Of Sunil Modi (IT)
Sent: Friday, October 08, 2004 8:41 AM
To: rrd-users at list.ee.ethz.ch
Subject: [rrd-users] RRD eating values

> I generated some backlogged data from January of 2004 into an RRD.  I created
> the RRD one month at a time (tedious but RRD had problems taking all the data
> in one large update file).  I had run a test case of the first month to see
> how the data would look and it looked just fine when compared to excel
> graphs.
> When I finished making the RRD and putting in all 9 months of data, I
> generated some graphs, and saw that the graphs are different and the peak
> points are different.  The single Month Graph had a peak of 100, and the 9
> Month graph had a peak of 60.  What happened to the peak of 100?  Should I
> make separate RRD's for each month to be more accurate?

This is expected. If you are using a fairly standard RRD definition
that emulates MRTG behavior, each graph point is an average over more
raw data the further back in time you go.
For example, you may keep 50 hours of 5 minute averages for your daily
view which would basically be every data point you entered with minimal
averaging. Next, you could have 12 days of 30 minute averages for your
weekly view. As you can see, you've just gone from data points every 5
minutes to data points every 30 minutes, averaged out. You lose some
granularity with that average but probably not much unless your data
points change dramatically every 5 minutes. Next, you may have 50 days
of 2 hour averages for your monthly view. Now your 5 minute averages are
themselves averaged to one point for every 2 hours. You may begin to see
a more significant loss in the granularity. Finally, for the yearly view
you may have 600 days of 1 data point per day. This is where you'll see
the greatest loss in granularity. You're taking a full day's worth of 5
minute averages and averaging those to just 1 data point. If this were a
traffic graph you'd be averaging both your high day traffic and your low
night traffic into one value. While it accurately reflects the average
value for the traffic over 24 hours, it doesn't really reflect your
highest or lowest values.
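
To make the effect concrete, here is a minimal Python sketch of the averaging described above. It does not use rrdtool at all; it just collapses a list of 5 minute samples into 30 minute, 2 hour and 1 day buckets. The sample data (a flat value of 20 with one 5 minute spike to 100) and the helper names are made up for illustration, and real RRD consolidation has more detail than this.

```python
def consolidate(samples, steps_per_row, func):
    """Collapse each run of `steps_per_row` samples into one value using `func`."""
    return [
        func(samples[i:i + steps_per_row])
        for i in range(0, len(samples), steps_per_row)
    ]


def mean(values):
    return sum(values) / len(values)


# Hypothetical data: one day of 5 minute samples (288 per day),
# flat at 20 with a single 5 minute spike to 100.
day = [20.0] * 288
day[150] = 100.0

# Number of 5 minute samples per stored point at each resolution.
resolutions = {"5 min": 1, "30 min": 6, "2 hour": 24, "1 day": 288}

avg_peaks = {}
max_peaks = {}
for label, steps in resolutions.items():
    # Highest point a graph would show if the archive stores averages...
    avg_peaks[label] = max(consolidate(day, steps, mean))
    # ...versus an archive that keeps the maximum of each bucket.
    max_peaks[label] = max(consolidate(day, steps, max))
    print(f"{label:>6}: average peak {avg_peaks[label]:6.2f}, "
          f"max peak {max_peaks[label]:6.2f}")
```

With averaging, the highest point shrinks as the buckets get wider, while keeping the maximum of each bucket preserves the 100 at every resolution.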

That is a simplification of what actually happens but the end result is
pretty much the same. Because the RRD doesn't typically store every 5
minute average for all time (unless you've defined your RRD to do that),
the averaging of the data over time results in the perception of less
accurate data, even though it's doing exactly what it's been designed
and told to do.

Creating separate RRDs for each month _may_ work but it sure seems to
me to be a bad way to go about it. You should look at your RRD
definition and craft it such that it suits your needs better.
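
As one possible starting point, the Python sketch below builds (but does not run) an `rrdtool create` command line that keeps MAX archives next to the AVERAGE ones, using the resolutions from the example above with 600 rows each. The file name, data source name, GAUGE type and heartbeat are assumptions on my part, not taken from your setup, so adjust them to your data.

```python
STEP_SECONDS = 300          # one primary data point every 5 minutes
ROWS = 600                  # rows kept in each archive
STEPS_PER_ROW = [1, 6, 24, 288]   # 5 min, 30 min, 2 hour, 1 day


def build_create_args(filename, ds_name="traffic"):
    """Return an rrdtool create argument list with AVERAGE and MAX RRAs.

    `filename` and `ds_name` are placeholders for illustration.
    """
    args = [
        "rrdtool", "create", filename,
        "--step", str(STEP_SECONDS),
        # Assumed data source: GAUGE, heartbeat of two steps, no min/max limits.
        f"DS:{ds_name}:GAUGE:{2 * STEP_SECONDS}:U:U",
    ]
    for cf in ("AVERAGE", "MAX"):
        for steps in STEPS_PER_ROW:
            args.append(f"RRA:{cf}:0.5:{steps}:{ROWS}")
    return args


def retention_days(steps, rows=ROWS):
    """How many days of history one archive covers."""
    return steps * rows * STEP_SECONDS / 86400


args = build_create_args("example.rrd")
print(" ".join(args))
for steps in STEPS_PER_ROW:
    print(f"{steps:>3} steps per row covers {retention_days(steps):g} days")
```

To see the peaks on a long-range graph you would then also need to graph from the MAX archives (a DEF with MAX as its consolidation function) rather than from AVERAGE. If you really need every 5 minute value for 9 months, the alternative is a much larger number of rows in the 1-step archive.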

--
Marc
