[rrd-users] rrdtool theory ... (LONG)

Tobias Oetiker oetiker at ee.ethz.ch
Wed Aug 23 08:00:38 MEST 2000


Folks,

finally some life in this group, cool ... in response to all the posts I saw,
here are some thoughts on the points raised in the discussions:

* yes, I do welcome discussion on this topic.

* I do think some document on the issue might be very helpful (Alex :-) ?)
  because the topic keeps popping up.

* I do maintain that, for all purposes where you have some information
  about the nature of the data you are monitoring, rrdtool does the right
  thing. I will try to draw a coherent picture later on.

* If you use rrdtool update several times within one step interval, ALL the
  data you give to rrdtool will be taken into account. Internally, rrdtool
  accumulates the area below the curve built by the data points you feed it,
  and when a step boundary arrives it stores the accumulated area divided by
  the step length. This also takes into account any unknowns which may occur
  during the interval when updates are frequent. (Alex: so yes, it does help
  to alter the sampling interval without altering the step.)
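
  To make the accumulation above concrete, here is a minimal Python sketch
  of the idea (this is NOT rrdtool's actual C code, it ignores heartbeat and
  unknown handling, and it assumes GAUGE-like readings that hold since the
  previous update):

  STEP = 300  # seconds per primary data point, as in a typical rrd

  class StepAccumulator:
      def __init__(self, start_time):
          self.step_start = start_time - (start_time % STEP)
          self.last_time = start_time
          self.area = 0.0    # value * seconds gathered in the current step
          self.stored = []   # finished primary data points: (time, rate)

      def update(self, now, value):
          # the value is taken to cover the whole span since the last update
          while now >= self.step_start + STEP:
              boundary = self.step_start + STEP
              self.area += value * (boundary - self.last_time)
              self.stored.append((boundary, self.area / STEP))
              self.area = 0.0
              self.last_time = boundary
              self.step_start = boundary
          self.area += value * (now - self.last_time)
          self.last_time = now

  acc = StepAccumulator(0)
  acc.update(100, 4.0)   # three updates inside one 300 second step ...
  acc.update(250, 8.0)
  acc.update(400, 2.0)   # ... and all of them end up in the stored average
  print(acc.stored)      # [(300, 5.666...)] = (4*100 + 8*150 + 2*50) / 300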

* Even when monitoring discrete things like modems in use, errors that
  occurred in a certain time period, or people in a room, there are
  situations where you will end up with a non-integer number:
 
  - number of people in the room over the last hour
    built from data acquired by counting the number of people in the room
    every 2 minutes
  
  - errors per second, but only 3 errors occurred during the last hour

  - number of modems in use over the night, built from a log listing every
    call.
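
  A tiny illustration (not rrdtool itself, the numbers are made up) of how
  such discrete events turn into non-integer values once they are averaged
  or expressed per unit of time:

  # people counted every 2 minutes for an hour: 30 integer samples,
  # but their hourly average is usually not an integer
  people = [3, 4, 4, 5, 6, 4] * 5
  print(sum(people) / len(people))   # 4.333...

  # 3 errors during the last hour, expressed as errors per second
  print(3 / 3600)                    # 0.000833...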

* When you are sampling data, it is VITAL to know what you are looking
  for. It makes no sense to count the people in the room once an hour when
  the number of people fluctuates a lot over the course of an hour. The
  data collected will be more or less worthless, because if you had counted
  5 minutes earlier or 7 minutes later the result would have been different
  by many percent. This means that you have to choose your sampling interval
  such that the AVERAGE value built over a certain time interval does NOT
  depend significantly on the number of samples taken during that interval.
  If this is not the case, then your sampling interval is WRONG and you
  might as well abort the exercise and take some RANDOM data instead.
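
  A hypothetical illustration of this rule: the value you record must not
  depend strongly on WHEN inside the interval you happened to count. The two
  signals below are invented, but the pattern is the point:

  import math

  # slowly varying: room occupancy drifting over the day
  slow = lambda t: 20 + 5 * math.sin(2 * math.pi * t / 86400)
  # rapidly fluctuating: occupancy changing on a 10 minute cycle
  fast = lambda t: 20 + 15 * math.sin(2 * math.pi * t / 600)

  for name, sig in (("slow", slow), ("fast", fast)):
      counted_now = sig(1000)     # one count per hour, taken at t = 1000 s
      counted_later = sig(1300)   # the same count taken 5 minutes later
      print(name, round(counted_now, 1), round(counted_later, 1))

  # slow: 20.4 vs 20.5  -> hourly counting is fine
  # fast:  7.0 vs 33.0  -> hourly counting is worthless, sample more often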

* Once you have chosen an appropriate sampling interval, you might also want
  to look at the MAX data. Here we run into another problem: depending on
  the nature of the data we monitor, MAX approaches 100% as the sampling
  interval gets shorter (network traffic comes to mind). So even with MAX
  consolidation it is important to know what the sampling interval was.
  Taking the modem example: if over the course of the night the MAX modem
  use at my Internet provider was 100%, that sounds bad, but it is not the
  full picture. The 100% means something totally different depending on
  whether it was the MAX 30-minute AVERAGE or the MAX 1-minute AVERAGE that
  reached 100%. Together with the information that the AVERAGE use was 15%,
  I can actually form an opinion. The 100% alone is not interesting at all.
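
  A hypothetical modem-pool night in numbers, to make the point concrete:
  the same per-minute data yields very different MAX values depending on
  the averaging window, while the AVERAGE stays the same:

  minutes = 8 * 60                  # an 8 hour night, per-minute usage in %
  usage = [10.0] * minutes
  for m in range(200, 205):         # one 5 minute burst at full capacity
      usage[m] = 100.0

  def max_of_window_averages(data, window):
      chunks = [data[i:i + window] for i in range(0, len(data), window)]
      return max(sum(c) / len(c) for c in chunks)

  print(sum(usage) / len(usage))            # AVERAGE of the night:  ~10.9
  print(max_of_window_averages(usage, 1))   # MAX of  1 minute averages: 100.0
  print(max_of_window_averages(usage, 30))  # MAX of 30 minute averages:  25.0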

* If you want to store RAW data, use a database ... there are plenty
  available ... create a table with a time stamp and a number of data
  entries.
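
  For completeness, a minimal sketch of such a table using Python's built-in
  sqlite3 (the table and column names are of course just an example):

  import sqlite3, time

  db = sqlite3.connect("raw_samples.db")
  db.execute("""CREATE TABLE IF NOT EXISTS samples (
                    ts  INTEGER NOT NULL,   -- unix time stamp
                    ds0 REAL,               -- first data entry
                    ds1 REAL                -- second data entry
                )""")
  db.execute("INSERT INTO samples (ts, ds0, ds1) VALUES (?, ?, ?)",
             (int(time.time()), 42.0, 0.4))
  db.commit()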

* RRDtool is built FROM THE GROUND UP on the RE-SAMPLING idea. Data gets
  stored at predefined points in time and the values get adjusted to the
  best of our knowledge ... This cannot be changed easily ... look at the
  code and you will see that everything builds on it.

* Doing the re-sampling and consolidation on the fly is at the core of
  rrdtool. True, it takes a bit more time than not processing the data at
  all, but what takes MOST time is storing and reading the data on disk, so
  treating the data as it arrives does not really cost anything noticeable.
  It is actually a big advantage, because RRDtool requires NO
  post-processing. All the other tools I know go to great lengths in that
  area: they require you to run cleanup procedures and consolidation
  functions on your data regularly in order to prevent it from spiraling
  out of control.

* The FIRST CF might actually be interesting ... is anybody using RRDtool
  for stock price stuff?

* If you have a problem with 0.4 errors per second, think about it until
  you understand and agree; then it will be much simpler to explain it to
  others.
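
  The arithmetic behind such a number is trivial (the figures here are made
  up): every single error is a discrete event, but a counter that advanced
  by 120 during a 300 second step still averages out to a fractional rate:

  print(120 / 300)   # 0.4 errors per second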

* You take a ball of clay and drop it into a bowl of water filled to the
  rim. You collect the water that spills over and measure its weight. Take
  the ball out again and form it into a boat. Fill the bowl to the rim and
  place the boat into the bowl. Again water spills over; collect it and
  weigh it. Weigh the ball/boat. Explain the numbers ... did you expect
  this? (This is not related to RRDtool, just a little
  wrap-your-mind-around-it exercise.)

cheers
tobi




-- 
 ______    __   _
/_  __/_  / /  (_) Oetiker, Timelord & SysMgr @ EE-Dept ETH-Zurich
 / // _ \/ _ \/ / TEL: +41(0)1-6325286  FAX:...1517  ICQ: 10419518 
/_/ \___/_.__/_/ oetiker at ee.ethz.ch http://ee-staff.ethz.ch/~oetiker


--
Unsubscribe mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe
Help        mailto:rrd-users-request at list.ee.ethz.ch?subject=help
Archive     http://www.ee.ethz.ch/~slist/rrd-users
WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi


