[rrd-users] trying to understand the relationship between source data, what's in rrd and what gets plotted

Tobias Oetiker tobi at oetiker.ch
Sat Jul 21 00:03:42 CEST 2007


Mark,

think about this:

you give rrdtool 3 samples with the following values:


10, 100, 990

Now you ask rrdtool to plot the data, and it so happens that all
three values get mapped to a single pixel.

If you pick AVERAGE the result will be 366.7 (the mean of the three)
If you pick MAX it will be 990
If you pick MIN it will be 10
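
You can see this with a small experiment (a minimal sketch; the file
name, DS name and timestamps are only illustrative):

  # one 10-second GAUGE, plus 3-step RRAs for each consolidation function
  rrdtool create demo.rrd --start 920804400 --step 10 \
      DS:val:GAUGE:20:U:U \
      RRA:AVERAGE:0.5:3:10 \
      RRA:MIN:0.5:3:10 \
      RRA:MAX:0.5:3:10

  # feed in the three samples
  rrdtool update demo.rrd 920804410:10 920804420:100 920804430:990

  # fetch one consolidated row per consolidation function
  rrdtool fetch demo.rrd AVERAGE --start 920804400 --end 920804430
  rrdtool fetch demo.rrd MIN --start 920804400 --end 920804430
  rrdtool fetch demo.rrd MAX --start 920804400 --end 920804430

The three fetches should report roughly 366.7, 10 and 990 for the
same consolidated interval.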

What gnuplot does is simply draw ALL the data values. This
is why you get those strange, wide color areas ...

In gnuplot you would get the same result for the following two sets
of input data:

10,10,990
10,990,990

In rrdtool the MIN and MAX would be equivalent too, but the AVERAGE
would differ (336.7 for the first set vs. 663.3 for the second)
rather than just being a wider area.

So now I wonder why you think the gnuplot approach is accurate ...


If you want to show the range of values that went into a plot, I
suggest you plot

MAX
AVERAGE
MIN

This will give the user a clearer picture of the values being
consolidated.
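
For example (a minimal sketch; the file, DS and variable names are
illustrative, and it assumes the RRD contains MIN, MAX and AVERAGE
RRAs):

  rrdtool graph range.png --start end-1d \
      DEF:lo=my.rrd:val:MIN \
      DEF:hi=my.rrd:val:MAX \
      DEF:avg=my.rrd:val:AVERAGE \
      CDEF:range=hi,lo,- \
      AREA:lo \
      AREA:range#C0C0FF:"min-max range":STACK \
      LINE1:avg#0000FF:average

The first AREA has no color, so it draws nothing; it only lifts the
stacked range band up to the MIN values. The average then goes on
top as a line, so every pixel column shows min, max and mean at once.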

hope this helps

cheers
tobi





Today Mark Seger wrote:

> ok, I really hate to be a pain but I really want to get this right too and
> have spent almost 2 solid days trying to understand this and am still puzzled!
> I think rrd is pretty cool but if it's not the right tool to be using, just
> tell me.  I believe when doing system monitoring it's absolutely vital to
> report accurate data.  I also understand some people are satisfied with
> averages but not me, especially if I'm trying to run benchmarks or accurately
> measure performance.
>
> Now for some details - I've collected data in 10-second samples for an entire
> day.  I created an rrd database with a step of 10, capable of holding 8640
> samples, so as I understand it all data should be recorded accurately, and I
> believe it is.  Then I loaded it up with data.  I then examined the data by
> doing a 'fetch' and verified the correct values are stored.  When I plot the
> data using MAX, I'm missing a lot of low values, and I understand it's because
> I have 8600 samples and a plot that's only 800+ pixels wide, but what I'd like
> to understand is how gnuplot gets it right.  I'm
> looking at a gnuplot plot of the same data and it shows a much better
> representation of what is going on with the network.
> Have a look at http://webpages.charter.net/segerm/27728-n1044-20050814-cpu.png
> which was generated by gnuplot, and now look at
> http://webpages.charter.net/segerm/cpu.png which was generated by rrdtool for
> just the cpu-wait data.  This rrd plot implies there is a constant wait of
> about 20% while the other plot clearly shows it fluctuating between almost 0
> and 20.  What I want to know is how it is that gnuplot can do this and rrdtool
> can't.  They both have about the same number of pixels to play with, so I'm
> guessing gnuplot is doing some more sophisticated consolidation, and whatever
> that is, I'd like to suggest rrdtool offer that as an additional option.  I
> should also point out that the gnuplot plot is in fact only 640 pixels wide to
> rrd's 881, so even though rrd has more pixels to play with, gnuplot does a
> better job.
>
> please understand I'm only trying to understand what's happening and see if
> there's a way to improve rrd's accuracy, because if people are relying on it
> to reflect an accurate picture of their environment, I think this is pretty
> important.
>
> anyone else care to comment?
>
> -mark
>
> Tobias Oetiker wrote:
> > Hi Mark,
> >
> > yes, the 'lost' spike confuses people ... most, when they start
> > thinking about it, see that rrdtool does exactly the right thing:
> > it uses the consolidation method of the data being graphed to
> > further consolidate for the graph ...
> >
> > so if you are using MAX as the consolidation function for the RRA, the
> > grapher will use MAX too. If you are averaging the data, the
> > grapher will use the same function too ...
> >
> > if you have textual suggestions for the grapher documentation I
> > will be glad to include them
> >
> > thanks
> > tobi
> > Today Mark Seger wrote:
> >
> >
> > > Alex van den Bogaerdt wrote:
> > >
> > > > On Fri, Jul 20, 2007 at 12:31:25PM -0400, Mark Seger wrote:
> > > >
> > > >
> > > > > more experiments and I'm getting closer...  I think the problem is the
> > > > > AVERAGE in my DEF statements of the graphing command.  The only problem
> > > > > is I couldn't find any clear description or examples of how this works.
> > > > > I did try using LAST (even though I have no idea what it does) and my
> > > > > plots got better, but I'm still missing data points and I want to see
> > > > > them all.  Again, I have a step size of 1 second so I'd think everything
> > > > > should just be there...
> > > > >
> > > > >
> > > > Last time I looked, which is several moons ago, the graphing part
> > > > would average different samples which needed to be "consolidated"
> > > > due to the fact that one was trying to display more rows than there
> > > > were pixel columns available.
> > > >
> > > >
> > > Ahh yes, I think I see now.  However, and I simply point this out as an
> > > observation, it's never good to throw away or combine data points, as you
> > > might lose something really important.  I don't know how gnuplot does it,
> > > but I've never seen it lose anything.  Perhaps when it sees multiple data
> > > points it just picks the maximum value.  hey - I just tried that and it
> > > worked!!!
> > > This may be obvious to everyone else but it sure wasn't to me.  I think the
> > > documentation could use some beefing up in this area, as well as some
> > > examples.  At the very least I'd put in an example that shows a series that
> > > contains data with a lot of values <100 and a single point of 1000.  Then
> > > explain why you never see the spike! I'll bet a lot of people would be
> > > shocked.  I also wonder how many system managers are missing valuable data
> > > because it's simply getting dropped.
> > >
> > > -mark
> > >
> > > > (I wrote consolidated surrounded by quotation marks because it isn't
> > > > really consolidation that's happening)
> > > >
> > > > In other words: unless your graph is 50k pixels wide, you will have
> > > > to select which 400 out of 50k rates you would like to see, or you
> > > > will have to deal with the problem in a different way. For example:
> > > >
> > > > If you set up a MAX and a MIN RRA, and you carefully craft their
> > > > parameters, you could do something like this:
> > > >
> > > > * Consolidate 60 rates (1 second each) into one (of 60 seconds).
> > > >   This means setting up an RRA with steps-per-row 60.
> > > > * Display 400 x 60 seconds on a graph (or adjust the graph width,
> > > >   together with the amount of CDPs to plot).
> > > > * Do this using (you fill in the blanks):
> > > >     DEF:MyValMin=my.rrd:minrra:...
> > > >     DEF:MyValMax=my.rrd:maxrra:...
> > > >     CDEF:delta=MyValMax,MyValMin,-
> > > >     AREA:MyValMin
> > > >     AREA:delta#FF0000:values:STACK
> > > >   (That first area does not plot anything, and it is not supposed to.
> > > >   The second area displays a line from min to max.)
> > > > * Do the same for 3600 steps per row, and 400x3600 seconds per graph
> > > >
> > > > and so on.  Of course you can adjust the numbers to your liking.
> > > >
> > > > HTH
> > > >
> > > >
> > >
> >
> >
>
>

-- 
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten
http://it.oetiker.ch tobi at oetiker.ch ++41 62 213 9902


