[rrd-users] Re: Using RRDTool & SNMP to graph CPU usage

Graeme Donaldson graeme at toxicbunny.net
Tue Nov 18 18:28:32 MET 2003


Alex van den Bogaerdt said:
> On Tue, Nov 18, 2003 at 09:24:23AM +0200, Graeme Donaldson wrote:
>
>> Hi again
>>
>> I've managed to get it working quite nicely.  The only thing that concerns
>> me is that CPU usage sometimes jumps over 100%.  Surely that can't happen?
>
> This may have to do with timing problems.

Oops, forgot to post it to the list.  Sorry Alex ;-)

OK, I've figured it out.  The problem is that Hugo's original script (at
http://hvdkooij.xs4all.nl/stats/) isn't quite right (sorry Hugo ;-).
ssCpuRaw(User|Nice|System|Idle) are counters of some sort.  I'm not sure
what exactly they count, but that's irrelevant.  What matters is that for
any regular interval (5 minutes in this case), the sum of the deltas of
those 4 counters is always the same.  On the test machine I was using, it
was 128, which is why my graph maxed out at 128 (rather than 100) when the
CPU was at 100% load.

This is how the whole setup should be:

* The RRD is created with:
rrdtool create /var/db/rrd/wizard-cpu.rrd \
  DS:user:COUNTER:600:U:U       \
  DS:nice:COUNTER:600:U:U       \
  DS:system:COUNTER:600:U:U     \
  DS:idle:COUNTER:600:U:U       \
  RRA:AVERAGE:0.5:1:576         \
  RRA:AVERAGE:0.5:6:672         \
  RRA:AVERAGE:0.5:24:744        \
  RRA:AVERAGE:0.5:288:732       \
  RRA:MAX:0.5:1:576             \
  RRA:MAX:0.5:6:672             \
  RRA:MAX:0.5:24:744            \
  RRA:MAX:0.5:288:732
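
As a rough check of what those RRA lines retain, here is a small sketch that
works out the time span of each archive.  It assumes rrdtool's default step
of 300 seconds (the create command passes no --step, and that matches the
5-minute interval mentioned above); the retention helper is just a made-up
name for illustration.

```shell
#!/bin/sh
# Assumption: default rrdtool step of 300 seconds (no --step was given).
step=300

# retention STEPS_PER_ROW ROWS -> prints the span covered by the RRA, in days
retention() {
    echo $(( $1 * $2 * step / 86400 ))
}

# Each pair is "steps-per-row rows", taken from the RRA lines above.
for rra in "1 576" "6 672" "24 744" "288 732"; do
    set -- $rra
    echo "RRA $1:$2 covers $(retention "$1" "$2") days at $(( $1 * step / 60 ))-minute resolution"
done
```

Under that assumption the four archives cover 2, 14, 62 and 732 days, and
the 600-second heartbeat on each DS allows one missed update before the
value is recorded as unknown.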

* The update script is as follows:
echo $(snmpget -v 2c -c public -Ovq wizard \
    ssCpuRawUser.0 ssCpuRawNice.0 ssCpuRawSystem.0 ssCpuRawIdle.0) | \
awk '{ printf("update /var/db/rrd/wizard-cpu.rrd N:%d:%d:%d:%d\n", $1, $2, $3, $4) }' | \
rrdtool -

* The graph is generated with:
rrdtool graph $webdir/$host-cpu-$period.png \
  --start now-1$period \
  --vertical-label "CPU Usage (%)" \
  --width 600 \
  --height 200 \
  --title "CPU usage for the past $period" \
  DEF:user=$rrdfile:user:AVERAGE \
  DEF:nice=$rrdfile:nice:AVERAGE \
  DEF:system=$rrdfile:system:AVERAGE \
  DEF:idle=$rrdfile:idle:AVERAGE \
  CDEF:total=user,nice,+,system,+,idle,+ \
  CDEF:userperc=user,total,/,100,* \
  CDEF:niceperc=nice,total,/,100,* \
  CDEF:systemperc=system,total,/,100,* \
  CDEF:idleperc=idle,total,/,100,* \
  CDEF:totusedperc=userperc,niceperc,+,systemperc,+ \
  AREA:userperc#EA8F00:"User" \
  GPRINT:userperc:AVERAGE:"Average\:%3.0lf%%" \
  GPRINT:userperc:MAX:"Maximum\:%3.0lf%%\n" \
  STACK:niceperc#7EE600:"Nice" \
  GPRINT:niceperc:AVERAGE:"Average\:%3.0lf%%" \
  GPRINT:niceperc:MAX:"Maximum\:%3.0lf%%\n" \
  STACK:systemperc#FF0000:"System" \
  GPRINT:systemperc:AVERAGE:"Average\:%3.0lf%%" \
  GPRINT:systemperc:MAX:"Maximum\:%3.0lf%%\n" \
  STACK:idleperc#00FFFF:"Idle" \
  GPRINT:idleperc:AVERAGE:"Average\:%3.0lf%%" \
  GPRINT:idleperc:MAX:"Maximum\:%3.0lf%%\n" \
  COMMENT:"$timestamp\r"

I've taken this out of context from my shell script, but it should be easy
enough to follow what I've done.  As you can see, I graph the percentage
of the total increase, which is constant for a regular interval.  The sum
of user, nice, system and idle is always 100%, as expected.
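
To make the arithmetic behind the CDEFs concrete, here is a minimal sketch
that does the same normalisation outside rrdtool.  The cpu_percent function
and the sample numbers are hypothetical; they are only chosen so that the
four values add up to 128, like the rates on my test machine.

```shell
#!/bin/sh
# cpu_percent USER NICE SYSTEM IDLE
# Prints each value as a percentage of the four-way total, mirroring the
# CDEF chain above (value,total,/,100,*).
cpu_percent() {
    echo "$1 $2 $3 $4" | awk '{
        total = $1 + $2 + $3 + $4
        # Avoid dividing by zero when no data has been collected yet.
        if (total == 0) { print "nan nan nan nan"; exit }
        printf("%.1f %.1f %.1f %.1f\n", $1 / total * 100, $2 / total * 100, $3 / total * 100, $4 / total * 100)
    }'
}

# Hypothetical sample values that sum to 128.
cpu_percent 96 0 32 0     # fully loaded: prints 75.0 0.0 25.0 0.0
cpu_percent 16 0 16 96    # mostly idle:  prints 12.5 0.0 12.5 75.0
```

Whatever the total happens to be on a given machine, dividing by it first
means the four percentages always add up to 100.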

Comments and corrections are most welcome and would be appreciated.

Once I have it running nicely I'll make it available on the web, perhaps
with a mini-howto.

Regards,
Graeme.
