[rrd-users] importing large dataset

Alex van den Bogaerdt alex at ergens.op.het.net
Sat Apr 26 18:34:10 CEST 2008


On Sat, Apr 26, 2008 at 10:18:12AM -0400, Mag Gam wrote:

> > > I would like to import a very large dataset lets say 100MB text file. The
> > > file consists like this.
> > >
> > > Date                  Value
> > > 2/3/03 11:00       10
> > > 2/3/03 11:01       12
> > > 2/3/03 11:02       13
> > > 2/3/03 11:03       30
> > > 2/3/03 11:04       12
> > > 2/3/03 11:05       13
> > >
> > > It keeps going on. I can generate the epoch on the left side. However, I am
> > > unclear how to load this into rrdtool. I have followed the tutorial and I
> > > was not able to model my data properly. Can someone please show me an
> > > example for the data above?
> >
> > What do these numbers mean?
> > When were they valid?
> >
> >
> > RRDtool is not a graphing program. Use e.g. gnuplot for that.
> >
> > If these numbers are rates, you still need to know when these rates
> > were valid.  For instance: "12", does this mean a rate between
> > "2/3/03 11:00" and "2/3/03 11:01" or between
> > "2/3/03 11:01" and "2/3/03 11:02"?  A small but significant difference.
> > RRDtool uses end times (thus: updating for 2/3/03 11:01 is updating the
> > interval up to and including 2/3/03 11:01).
> >
> > Are these times in some local time or in UTC?  If they are in local time,
> > make sure to compensate for this and don't forget about daylight saving
> > (if any).
> >
> > You will need to convert each pair into <timestamp>:<value>, and then
> > give those as input to rrdtool.  This is a *very* basic operation, so
> > perhaps you need to explain the problem you encounter a bit more.

> Alex,
> 
> Thanks for getting back to me.
> 
> These are actually rates. It is a CPU utilization rate. These values are
> point in time. So, at 2/3/03 11:00      CPU utilization is at 10%
> Does that help for clarification?

Let me rephrase: RRDtool stores a rate over time, as in
"in the time interval between 11:00 and 11:01, the rate was
<something> units per second".  That rate is valid for every second of
the interval: between 11:00:00 and 11:00:01, between 11:00:01 and
11:00:02, ..., between 11:00:59 and 11:01:00.

This is what RRDtool does.  It doesn't do points in time; it is not
a program that draws a nice picture out of an arbitrary set of numbers.

If you want to pretend your data was valid during a whole minute, you
are free to do so. Just set up a GAUGE data source and update the database
with each of your time/value pairs. Of course you also need to set up
one or more (probably more!) RRAs when you run 'rrdtool create'.
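
As a rough sketch of the above in Python: the snippet below converts each
date/value line into the <timestamp>:<value> form and builds the argument
lists for 'rrdtool create' and 'rrdtool update'. It assumes the dates are
month/day/year and already in UTC. The file name cpu.rrd, the DS name cpu,
the 120 second heartbeat, the 0..100 range and the RRA row counts are
illustrative choices only, so adjust them to your data.

```python
import calendar
import time


def to_update_arg(line):
    """Turn '2/3/03 11:00       10' into '<epoch>:<value>'."""
    date_str, time_str, value = line.split()
    # Assumption: month/day/2-digit-year, and the times are already UTC.
    parsed = time.strptime(date_str + " " + time_str, "%m/%d/%y %H:%M")
    epoch = calendar.timegm(parsed)  # timegm interprets the struct as UTC
    return "%d:%s" % (epoch, value)


sample = [
    "2/3/03 11:00       10",
    "2/3/03 11:01       12",
]
updates = [to_update_arg(line) for line in sample]
first_epoch = int(updates[0].split(":")[0])

# The database must start before the first update, hence --start.
create_cmd = [
    "rrdtool", "create", "cpu.rrd",
    "--start", str(first_epoch - 60),
    "--step", "60",
    "DS:cpu:GAUGE:120:0:100",    # hypothetical DS name, heartbeat, min, max
    "RRA:AVERAGE:0.5:1:1440",    # illustrative: 1-minute rows
    "RRA:AVERAGE:0.5:60:720",    # illustrative: 1-hour averages
]

# 'rrdtool update' accepts several timestamp:value pairs in one call.
update_cmd = ["rrdtool", "update", "cpu.rrd"] + updates

print(" ".join(create_cmd))
print(" ".join(update_cmd))
```

Feeding many pairs to a single 'rrdtool update' call, in batches, should
be much faster for a 100MB file than one call per line. Once the printed
commands look right, the lists can be handed to subprocess.run().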

Make sure you understand RRAs and consolidation; have a look at my
website if the tutorial wasn't enough.
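
To get a feel for consolidation, here is a simplified Python sketch of
what an RRA does with the values from your example. It is an illustration
only: real RRDtool also aligns intervals to the step, handles unknown data
and applies the xff setting, none of which is modelled here. The function
names consolidate and average are made up for this example.

```python
def average(values):
    return sum(values) / len(values)


def consolidate(pdps, steps, cf):
    """Combine every `steps` primary data points into one consolidated
    point using the consolidation function `cf`.
    Incomplete trailing groups are dropped in this simplified version."""
    groups = [pdps[i:i + steps]
              for i in range(0, len(pdps) - steps + 1, steps)]
    return [cf(group) for group in groups]


# The six one-minute values from the example data.
minute_values = [10, 12, 13, 30, 12, 13]

# Two three-minute rows per consolidation function.
print(consolidate(minute_values, 3, average))  # the spike of 30 is smoothed out
print(consolidate(minute_values, 3, max))      # MAX keeps the spike: [13, 30]
```

This is one reason to define more than one RRA: an AVERAGE RRA over long
intervals hides short peaks, while a MAX RRA over the same intervals
preserves them.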

HTH and AFK.
-- 
Alex van den Bogaerdt
http://www.vandenbogaerdt.nl/rrdtool/


