[mrtg] Re: Need accurate cpu % info using net-snmp and mrtg

Fri Mar 14 15:12:12 MET 2003

On Thu, 13 Mar 2003, Michael Cunningham wrote:
>
> Okay.. now I am beyond confused.. I used
> your script Matt (with slight mods) to query the server and
> get all the raw snmp cpu data available for solaris. It seems
> that every time I query the server I get the same results.
> I know these servers' cpu is all over the map constantly.
> I tried many different servers from Solaris 2.6 to 8.
>
> Any idea what's wrong?

... snip

Mea culpa...  when I read Mark's response the other day, I realized
that I WAS missing a key piece of the puzzle.  You had the right idea
originally about needing to compensate for the period of one kernel tick.

The script that I wrote essentially produces the average utilization since
the last reboot... definitely not what you want.  This is because it
simply divides the number of ticks accumulated for a particular mode since
reboot by the total ticks for all kernel modes.

The correct way to do this is to use reasonably small intervals, check at
each interval time, and use the delta in time and kernel ticks, just as
Mark pointed out.
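
As a rough illustration of why the since-reboot average barely moves between
queries while the per-interval figure does, here is a small sketch.  The tick
counts are invented, the function names are made up for this example, and the
interval version uses the change in total ticks as its time base (an
assumption made to keep the example short; the wall-clock form appears
further down in this message).

```python
# Hypothetical cumulative tick counters per CPU mode, as read since reboot.
# All numbers are invented for illustration.
prev = {"user": 2_000_000, "system": 1_000_000, "idle": 7_000_000}
# Five minutes later at 100 ticks/s: 30,000 new ticks, 27,000 of them in user mode.
curr = {"user": 2_027_000, "system": 1_001_500, "idle": 7_001_500}


def since_boot_percent(sample, mode):
    """Share of all ticks since reboot spent in `mode` (the flawed approach)."""
    return 100.0 * sample[mode] / sum(sample.values())


def interval_percent(prev, curr, mode):
    """Share of the ticks accumulated between two samples spent in `mode`."""
    delta_mode = curr[mode] - prev[mode]
    delta_total = sum(curr.values()) - sum(prev.values())
    return 100.0 * delta_mode / delta_total


print(round(since_boot_percent(prev, "user"), 2))      # 20.0
print(round(since_boot_percent(curr, "user"), 2))      # 20.21
print(round(interval_percent(prev, curr, "user"), 2))  # 90.0
```

With these made-up numbers the box was 90% busy in user mode for the last
five minutes, yet the since-reboot figure only moved from 20.0 to 20.21.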

Now, as far as mrtg is concerned - if you specify something like this:

    Target[cpu]: ssCpuRawUser.0&ssCpuRawUser.0:public@server
    Options[cpu]: growright,nopercent,noi

mrtg will grab the number of kernel ticks for user mode since the last
reboot at an interval of 5 minutes (or whatever interval you use).  Since
we are not using gauge mode, it will subtract the value taken 5 minutes
before from the new reading - giving you the delta timeticks for user
mode.  Then, this value is divided by 300 to give the average number of
ticks per second over the 5 minute interval.  As long as the kernel
timetick is 10 ms (100 ticks per second), the resulting number is the
utilization in percent.

Here's the equation that describes what I just wrote:

  user[i]   =  previous User CPU timetick value in the log
  user[i+1] =  new User CPU timetick value
  n         =  number of kernel time ticks per second
  delta_t   =  time in seconds between measurements ( default is 300)

              __                          __
              |   ( user[i+1] - user[i] )  |
              |   ---------------------    | * 100%
              |            n               |
              |__                        __|
          __________________________________________

                          delta_t

Since the value of n is 100 for my linux box, it cancels against the
factor of 100% & no correction is required.  Thus, no script is required to
feed values to mrtg - simply provide the snmp OIDs directly to mrtg.
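
A quick numeric check of the equation, as a sketch only: the counter values
are invented and the function names are made up for this example.  It
compares the full formula with the plain delta / delta_t rate that mrtg
computes in non-gauge mode.

```python
def cpu_percent(prev_ticks, curr_ticks, ticks_per_second, delta_t=300):
    """Utilization in percent from two cumulative tick readings.

    ticks_per_second is n in the equation; delta_t is in seconds.
    """
    seconds_in_mode = (curr_ticks - prev_ticks) / ticks_per_second
    return seconds_in_mode * 100.0 / delta_t


def counter_rate(prev_ticks, curr_ticks, delta_t=300):
    """The per-second rate of a counter: delta divided by the interval."""
    return (curr_ticks - prev_ticks) / delta_t


# Invented readings taken 300 seconds apart: 13,500 user-mode ticks elapsed.
prev_user, curr_user = 1_200_000, 1_213_500

print(cpu_percent(prev_user, curr_user, ticks_per_second=100))   # 45.0
print(counter_rate(prev_user, curr_user))                        # 45.0
print(cpu_percent(prev_user, curr_user, ticks_per_second=1000))  # 4.5
```

With n = 100 the two results agree, so the raw counter rate already reads as
a percentage.  With a hypothetical n = 1000 the raw rate would still be 45,
ten times the real utilization of 4.5%.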

If, however, your system provides a different timetick interval, you can
compensate with a script.

The previous equation could also be written like this:

      __                    __      __                   __
      |    user[i+1]         |      |    user[i]          |
      |   ----------- * 100  |  -   |   ---------- * 100  |
      |        n             |      |        n            |
      |__                 __ |      |__                 __|
  ___________________________________________________________

                          delta_t

So, you could write a script that takes the number of timeticks from the
snmp variable, multiplies it by 100 and divides it by the number of
timeticks per second.  This script would then be used in mrtg.cfg instead
of the OIDs.

Remember though, that mrtg only accepts integers,
so you may need to multiply this value by 10 or 100 to get reasonable
resolution. You could then compensate for this with the Factor and
YTicsFactor config settings.
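
A minimal sketch of the scaling part of such a script follows.  It assumes a
hypothetical box with 1000 ticks per second and a resolution multiplier of
100; the function names and the sample counter value are invented for
illustration.  Fetching the counter itself (for example with net-snmp's
snmpget) is left out.  The output follows the four-line format mrtg expects
from an external program: first value, second value, uptime string, target
name.

```python
def scale_ticks(raw_ticks, ticks_per_second, resolution=100):
    """Convert a raw tick counter to (percent-seconds * resolution).

    Integer arithmetic only, since mrtg accepts integers.
    """
    return raw_ticks * 100 * resolution // ticks_per_second


def mrtg_lines(value_in, value_out, uptime="unknown", name="cpu"):
    """Format the four lines mrtg reads from an external script."""
    return "\n".join([str(value_in), str(value_out), uptime, name])


# Invented counter reading; a real script would query the snmp variable here.
raw_user_ticks = 1_213_500
scaled = scale_ticks(raw_user_ticks, ticks_per_second=1000, resolution=100)

# The same value is reported twice, mirroring the OID&OID form of the target.
print(mrtg_lines(scaled, scaled))
```

Because the value is still a cumulative counter, mrtg takes the delta and
divides by the interval as before.  With a resolution of 100, a compensating
factor of 0.01 would bring the displayed numbers back to percent.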

I'm REALLY sorry for the confusion I caused & thanks to Mark for
pointing out the basic error of my ways...

I'm going to delete my previous script from my ftp site to eliminate any
more confusion.

Matt
