[rrd-users] If input is already in text format and I craft a perl script to parse the text format and update rrd database, what should the step and heartbeat be?

Tue Jun 10 10:18:19 CEST 2014

Steve Shipway <s.shipway at auckland.ac.nz> wrote:

> If your metrics are all coming in the same file, for the same point in time, and all being pushed into the RRD at the same time, then it makes sense to have a single RRD to hold them as in your example.  You would usually use a separate RRD if the data came separately, potentially for different times.  Then separate RRDs would make sense, as you may get one sample but not another, or they may be sampled at differing times.

To add to that, also consider how things may change over time. E.g., suppose you are logging disk/filesystem utilisation - both in terms of data transferred, and space used/available.

It would make sense to collate all the quantities from one filesystem into a single RRD - so perhaps an RRD with bytes written, bytes read, % space used, % inodes used. But a machine will almost certainly have multiple filesystems, and more importantly the number may change - so it would make sense to have one RRD per filesystem. Of course, there may well be more than one filesystem on a disk - so you might want to collect stats for the physical disk (probably just bytes read/written) separately, with one RRD for each physical disk (or array), since the number of physical disks/arrays may change (e.g. if you add a disk because you've run out of space).
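As a rough sketch of the one-RRD-per-filesystem layout, the Python below only builds the argument list for an `rrdtool create` call; it does not run it. The file naming scheme, the DS names, the step of 300 s, the heartbeat of 600 s and the single RRA are all assumptions for illustration, not recommendations for your data.

```python
def rrd_filename(mount_point):
    """Map a mount point such as '/var/log' to a flat file name.

    The naming scheme is hypothetical; use whatever suits your setup.
    """
    name = mount_point.strip("/").replace("/", "_") or "root"
    return f"fs_{name}.rrd"


def create_command(mount_point, step=300, heartbeat=600):
    """Build (but do not run) an 'rrdtool create' command for one filesystem.

    step and heartbeat are assumed values; match them to how often
    your input actually arrives.
    """
    return [
        "rrdtool", "create", rrd_filename(mount_point),
        "--step", str(step),
        # Byte counters only ever grow, so store them as rates (DERIVE, min 0).
        f"DS:bytes_read:DERIVE:{heartbeat}:0:U",
        f"DS:bytes_written:DERIVE:{heartbeat}:0:U",
        # Percentages are stored as-is (GAUGE), bounded to 0..100.
        f"DS:space_used_pct:GAUGE:{heartbeat}:0:100",
        f"DS:inodes_used_pct:GAUGE:{heartbeat}:0:100",
        # One day of unconsolidated samples at a 300 s step (assumed).
        "RRA:AVERAGE:0.5:1:288",
    ]
```

Adding a filesystem later then just means creating one more file, e.g. `create_command("/srv/data")`, without touching the existing RRDs.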

There isn't really a right and wrong. It's perfectly OK to have lots of small RRDs with a single DS each. It's also perfectly OK to have fewer RRDs with many DSs each. It's a matter of balancing your requirements with the ability to manage the RRDs - and of course, as mentioned above, the requirement to update all DSs in a single RRD at the same time.
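To illustrate the "all DSs at the same time" requirement: a single `rrdtool update` carries one timestamp and one value per DS, so anything you did not manage to collect has to go in as unknown (`U`). The helper below is a minimal sketch that only builds the command line; the function name, DS names and file name are made up for the example.

```python
def update_command(rrd_path, timestamp, values, ds_order):
    """Build (but do not run) an 'rrdtool update' command.

    values   -- dict of DS name -> sample; missing or None means "not collected"
    ds_order -- the DS names to write, in the order used for --template
    """
    fields = []
    for ds in ds_order:
        sample = values.get(ds)
        # rrdtool accepts 'U' as an explicit unknown value.
        fields.append("U" if sample is None else str(sample))
    return [
        "rrdtool", "update", rrd_path,
        "--template", ":".join(ds_order),
        f"{timestamp}:" + ":".join(fields),
    ]
```

For example, `update_command("fs_home.rrd", 1402388299, {"bytes_read": 10}, ["bytes_read", "bytes_written"])` ends with the argument `1402388299:10:U`. If you find yourself writing `U` for some DSs on most updates, that is a hint those DSs belong in a separate RRD.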

I tend to use a mixture.
At one extreme I have an RRD for our UPS stats with many parameters logged, and another with 508 DSs (data in and out for all 254 usable addresses in our /24 subnet) - in both cases, data is collected and graphed with custom scripts, and all the data is collected in a single operation.
At the other extreme, I have a whole bunch of RRDs with just 2 DSs (data in and out) - one per port for a bunch of switches (the data is collected and graphed with Cacti), and the data for each port is collected separately (it's the way Cacti works).
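For the many-DS case, the DS list is best generated rather than typed. The sketch below shows one way the 508 names could be derived (an in and an out DS for each of the 254 usable addresses in a /24); the `in_N`/`out_N` naming and the 192.0.2.0/24 documentation subnet are assumptions for illustration, not what my scripts actually use.

```python
import ipaddress


def subnet_ds_names(cidr):
    """Return an 'in' and an 'out' DS name for every usable host in a subnet.

    For an IPv4 /24, hosts() skips the network and broadcast addresses,
    leaving 254 hosts. Naming by last octet is only unique up to a /24.
    """
    network = ipaddress.ip_network(cidr)
    names = []
    for host in network.hosts():
        last_octet = int(host) & 0xFF
        names.append(f"in_{last_octet}")
        names.append(f"out_{last_octet}")
    return names


ds_names = subnet_ds_names("192.0.2.0/24")
print(len(ds_names))
```

The same list can then be used both to build the DS definitions at create time and as the template order at update time, which keeps the two from drifting apart.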