[rrd-users] rrdgraph data scaling

Wolfgang Draxinger Wolfgang.Draxinger at physik.uni-muenchen.de
Tue Dec 1 18:53:15 CET 2009


I'm trying to set up a printer statistics system using rrdtool. While data 
consolidation works, I have some problems with graphing, namely getting the 
values scaled and reported correctly.

The RRD is created and updated through the Python rrdtool bindings by this 
script:

rrd = 'printerstat.rrd'

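# DS names; entries inferred from the keys of printerhosts below
printers = ['printer1', 'printer2']

printerhosts = {
	'printer1': 'printer1.example.com',
	'printer2': 'printer2.example.com',
}

# (steps per row, rows) for each RRA, based on the 300 s step
rras = [
	(1, 8064),  # 4 weeks worth of 5 minutes data
	(12, 2160), # 90 days of hourly data
	(288, 366)  # ~1 year of daily data
]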
def create_rrd():
        import rrdtool
        rrdtool.create(rrd, "--step", "300",
                *(["DS:%s:COUNTER:600:0:1000"%(p,) for p in printers]+
                ["RRA:LAST:0.5:%d:%d"%rp for rp in rras]+
                ["RRA:AVERAGE:0.5:%d:%d"%rp for rp in rras]) )

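def get_pagecounter(hostname):
        from pysnmp.entity.rfc3413.oneliner import cmdgen
        # SNMP GET of the printer's page counter. The OID argument is
        # missing from the archived message and is left out here.
        errorIndication, errorStatus, errorIndex, varBinds = \
                cmdgen.CommandGenerator().getCmd(
                cmdgen.CommunityData('rrdtool', 'public'),
                cmdgen.UdpTransportTarget((hostname, 161)),
                )
        return int(varBinds[0][1])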

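if __name__ == '__main__':
        import sys, os, rrdtool
        pagecounters = [get_pagecounter(printerhosts[p]) for p in printers]
        upd_str = 'N:'+(':'.join(map(str, pagecounters)))
        # Create the RRD on the first run. The body of this try block is
        # missing from the archived message; os.stat() is an assumption.
        try:
                os.stat(rrd)
        except OSError:
                print("creating RRD")
                create_rrd()
        rrdtool.update(rrd, upd_str)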

This script works flawlessly. The problem begins when it comes to graphing 
with step sizes other than the one I created the RRD with.
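
The retention comments in the script's rras list can be checked with a little 
arithmetic. The sketch below is only an illustration: it assumes the 300 s 
base step from create_rrd(), and the helper name rra_coverage is hypothetical. 
The seconds-per-row value it computes (300, 3600, 86400) is also the factor 
that turns a stored pages-per-second rate into pages per row.

```python
STEP = 300  # base step in seconds, as passed to --step in create_rrd()

# (steps per row, rows) pairs copied from the script's rras list
rras = [(1, 8064), (12, 2160), (288, 366)]

def rra_coverage(steps, rows, base_step=STEP):
    """Return (seconds per row, days of history) for one RRA."""
    row_seconds = steps * base_step
    return row_seconds, rows * row_seconds / 86400.0

coverage = [rra_coverage(s, r) for s, r in rras]
for row_seconds, days in coverage:
    print("%6d s per row, %.0f days of history" % (row_seconds, days))
```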

Here's an example:


NOW=$(date +%s)

rrdtool graph 5min.png --start "08:00" --end "20:00" \
-w $WIDTH -h $HEIGHT --vertical-label "Pages/h" --step 300 \
-t "hourly" \
--x-grid MINUTE:15:HOUR:1:HOUR:1:0:%H \
DEF:printer1=$rrd:hp4300:LAST \
DEF:printer2=$rrd:hp4300:LAST \
CDEF:printer1-hourly=printer1,300,\* \
CDEF:printer2-hourly=printer2,300,\* \
AREA:printer1#C0C000:"printer1" \
AREA:printer2#C0C000:"printer2":STACK

rrdtool graph hourly.png --start "08:00" --end "20:00" \
-w $WIDTH -h $HEIGHT --vertical-label "Pages/h" --step 3600 \
-t "hourly" \
--x-grid MINUTE:15:HOUR:1:HOUR:1:0:%H \
DEF:printer1=$rrd:hp4300:LAST \
DEF:printer2=$rrd:hp4300:LAST \
CDEF:printer1-hourly=printer1,3600,\* \
CDEF:printer2-hourly=printer2,3600,\* \
AREA:printer1#C0C000:"printer1" \
AREA:printer2#C0C000:"printer2":STACK

So far the factor 3600 is just an educated guess, and the output data looked 
right. However, when comparing 5min.png and hourly.png, I found that for a 
particular time slot printer1 showed more printed pages than printer2 in 
5min.png, whereas in hourly.png it was printer2 that showed more pages (I also 
summed up the 5-minute slots when comparing with the hourly graph).
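
A reversal like this can be reproduced on paper. The sketch below is a 
simplified model, not rrdtool itself: it assumes a COUNTER data source stores 
pages per second for each 300 s slot, that a LAST row keeps only the final 
5-minute rate of the hour, and that an AVERAGE row keeps the mean of all 
twelve. The page counts and the helper name hourly_estimate are made up for 
illustration.

```python
STEP = 300  # seconds per 5-minute slot

# Hypothetical pages printed in each of the twelve slots of one hour
p1_pages = [30, 20, 10, 0, 0, 0, 0, 0, 0, 0, 0, 0]  # 60 pages, idle at the end
p2_pages = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 15]    # 15 pages, all in the last slot

def hourly_estimate(pages_per_slot, cf):
    """Consolidate twelve per-second rates with cf, then scale by 3600."""
    rates = [p / float(STEP) for p in pages_per_slot]  # pages per second
    if cf == "LAST":
        consolidated = rates[-1]               # only the final slot survives
    elif cf == "AVERAGE":
        consolidated = sum(rates) / len(rates)
    else:
        raise ValueError("unsupported consolidation function: %r" % cf)
    return consolidated * 3600                 # rate * seconds per hourly row

for cf in ("LAST", "AVERAGE"):
    print(cf, hourly_estimate(p1_pages, cf), hourly_estimate(p2_pages, cf))
```

With this made-up data the LAST estimate ranks printer2 above printer1 even 
though printer1 printed four times as many pages, while the AVERAGE rate times 
3600 matches the sum of the 5-minute slots.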

My original goal was to sum up the total number of printed pages within each 
time step, but only for each time step. The TOTAL operator just sums up 
everything in the graphed time range, which is not what I want.

So what am I doing wrong?

To summarize, here's what I want: the number of pages printed within the 
selected time frame, with 5-minute resolution for the last 4 weeks, hourly 
resolution for 3 months, and daily statistics for the whole year. When 
graphing the data, I want the total number of pages printed within each step, 
being able to choose from the RRA step sizes.
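
One possible way to express "pages per step" is to read the AVERAGE rate, 
multiply it by the step length in a CDEF, and plot the CDEF result rather than 
the raw DEF. The following is a minimal sketch that only builds the argument 
strings. It assumes the data source names equal the printer names from the 
script; the function name pages_per_step_args, the "_pages" suffix, and the 
second colour are hypothetical. It has not been run against a real RRD.

```python
def pages_per_step_args(rrd, printers, step, colors):
    """Build rrdtool graph arguments that plot pages per step, stacked."""
    args = ["--step", str(step), "--vertical-label", "pages per step"]
    for i, name in enumerate(printers):
        # average rate in pages per second at the requested step size
        args.append("DEF:%s=%s:%s:AVERAGE" % (name, rrd, name))
        # rate * step seconds = pages printed within one step
        args.append("CDEF:%s_pages=%s,%d,*" % (name, name, step))
        # plot the scaled CDEF value, not the raw rate
        area = "AREA:%s_pages#%s:%s" % (name, colors[i], name)
        if i > 0:
            area += ":STACK"
        args.append(area)
    return args

args = pages_per_step_args("printerstat.rrd", ["printer1", "printer2"],
                           3600, ["C0C000", "00C0C0"])
for a in args:
    print(a)
```

The resulting list could then be handed to the Python binding, for example as 
rrdtool.graph("hourly.png", "--start", "08:00", "--end", "20:00", *args), with 
step set to 300, 3600 or 86400 to match one of the RRAs.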

