[rrd-users] importing large dataset
Alex van den Bogaerdt
alex at ergens.op.het.net
Sat Apr 26 18:34:10 CEST 2008
On Sat, Apr 26, 2008 at 10:18:12AM -0400, Mag Gam wrote:
> > > I would like to import a very large dataset, let's say a 100MB text
> > > file. The file looks like this.
> > >
> > > Date Value
> > > 2/3/03 11:00 10
> > > 2/3/03 11:01 12
> > > 2/3/03 11:02 13
> > > 2/3/03 11:03 30
> > > 2/3/03 11:04 12
> > > 2/3/03 11:05 13
> > >
> > > It keeps going on. I can generate the epoch on the left side. However,
> > > I am unclear how to load this into rrdtool. I have followed the
> > > tutorial and I was not able to model my data properly. Can someone
> > > please show me an example for the data above?
> >
> > What do these numbers mean?
> > When were they valid?
> >
> >
> > RRDtool is not a graphing program. Use e.g. gnuplot for that.
> >
> > If these numbers are rates, you still need to know when these rates
> > were valid. For instance: "12", does this mean a rate between
> > "2/3/03 11:00" and "2/3/03 11:01" or between
> > "2/3/03 11:01" and "2/3/03 11:02"? A small but significant difference.
> > RRDtool uses end times (thus: updating for 2/3/03 11:01 is updating the
> > interval up to and including 2/3/03 11:01).
> >
> > Are these times in some local time or in UTC? If they are in local time,
> > make sure to compensate for this and don't forget about daylight saving
> > (if any).
> >
> > You will need to convert each pair into <timestamp>:<value>, and then
> > give those as input to rrdtool. This is a *very* basic operation, so
> > perhaps you need to explain the problem you encounter a bit more.
> Alex,
>
> Thanks for getting back to me.
>
> These are actually rates. It is a CPU utilization rate. These values are
> points in time. So, at 2/3/03 11:00 CPU utilization is at 10%.
> Does that help for clarification?
Let me rephrase: a rate over time, thus
"in the time interval between 11:00 and 11:01, the rate was
<something> units per second". This rate is thus valid between 11:00:00
and 11:00:01, between 11:00:01 and 11:00:02, ... 11:00:59 and 11:01:00.
This is what RRDtool does. It doesn't do points in time, it is not
a program which draws a nice picture out of an arbitrary set of numbers.
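
A minimal sketch of that end-time convention, assuming one sample per
60-second step. The helper name update_time and the stamp_marks_start flag
are made up for illustration; whether your stamps mark the start or the end
of the minute is something only you can decide.

```python
STEP = 60  # seconds per interval; assumed from the one-minute sample data


def update_time(stamp, stamp_marks_start, step=STEP):
    """Return the epoch time to use when updating for one sample.

    RRDtool treats an update at time T as describing the interval that
    ends at T. If a stamp in the source file marks the START of the
    minute it describes, shift it one step forward; if it already marks
    the END, use it unchanged.
    """
    return stamp + step if stamp_marks_start else stamp
```

With stamps that mark the start of the minute, the value logged at 11:00
would be fed to rrdtool with the timestamp for 11:01.
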
If you want to pretend your data was valid during a whole minute, you
are free to do so. Just set up a GAUGE data source and update the database
with each of your time/value pairs. Of course you also need to set up
one or more (probably more!) RRAs when you run 'rrdtool create'.
Make sure you understand RRAs and consolidation; have a look at my
website if the tutorial wasn't enough.
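
As a rough sketch of the above, and not the only way to do it, the script
below turns lines such as "2/3/03 11:00 10" into <timestamp>:<value> pairs
and builds the argument lists for 'rrdtool create' and 'rrdtool update'.
Everything specific in it is an assumption: dates are read as month/day/year
in UTC, the data source is called "cpu" with a 0..100 range, the heartbeat is
two steps, and the RRA row counts and batch size are placeholders to be sized
for the real data.

```python
from datetime import datetime, timezone

STEP = 60  # seconds between samples, as in the sample data


def to_update_arg(line):
    """Turn '2/3/03 11:00 10' into '<epoch>:<value>'."""
    date_part, time_part, value = line.split()
    # Assumption: month/day/year and UTC. Adjust both for the real data.
    stamp = datetime.strptime(date_part + " " + time_part, "%m/%d/%y %H:%M")
    epoch = int(stamp.replace(tzinfo=timezone.utc).timestamp())
    return "%d:%s" % (epoch, value)


def create_command(rrd_path, first_epoch):
    """Argument list for 'rrdtool create' with one GAUGE data source."""
    return [
        "rrdtool", "create", rrd_path,
        # The database must start before the first update.
        "--start", str(first_epoch - STEP),
        "--step", str(STEP),
        "DS:cpu:GAUGE:%d:0:100" % (2 * STEP),  # heartbeat of two steps
        "RRA:AVERAGE:0.5:1:1440",   # 1-minute averages, 1440 rows
        "RRA:AVERAGE:0.5:60:720",   # 1-hour averages, 720 rows
        "RRA:MAX:0.5:60:720",       # 1-hour maxima, 720 rows
    ]


def update_commands(rrd_path, lines, batch=500):
    """Yield 'rrdtool update' argument lists, several pairs per call."""
    args = [
        to_update_arg(line)
        for line in lines
        if line.strip() and not line.startswith("Date")  # skip header, blanks
    ]
    for i in range(0, len(args), batch):
        yield ["rrdtool", "update", rrd_path] + args[i:i + batch]
```

Each list can be handed to subprocess.run(cmd, check=True) as is. Updates
must be given in increasing time order, so the input file has to be sorted
by time first.
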
HTH and AFK.
--
Alex van den Bogaerdt
http://www.vandenbogaerdt.nl/rrdtool/