[rrd-users] Re: adding many rows with rrdtool resize causing problems.... HELP!

Kempf, Reed rkempf at rightnow.com
Wed Jan 8 17:46:52 MET 2003


Chris,

All right, now you've got me thinking.  How do I tell rrdtool to collect from
the yearly RRAs when my RRAs are simply AVERAGE, MIN, and MAX?  I just
tell rrdgraph to fetch all the rows from the current time back to the
epoch time from 1 year ago.  All I know is that when I had approx. 26000
rows in my rrd, I could only view 3 months' worth of data, even though I had
been collecting data for 6 months.  After I added approx. 78000 more rows to
the rrd, which then totaled approx. 105000 rows, the data older than 3
months stopped dropping off, and new data is now being added to the new rows
in the rrd.  I didn't change anything in my graphing function, so I can only
assume that I must be using the yearly RRAs.
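
As a rough cross-check of those row counts, here is a minimal sketch that
assumes every row is one 5-minute (300 s) step, which is how the rrds in this
thread are described. The 26000 figure is approximate, 78840 is the GROW value
from the resize script below, and rows_to_days is just an illustrative helper,
not part of rrdtool.

```python
SECONDS_PER_DAY = 86400.0  # float, so the division also behaves under Python 2


def rows_to_days(rows, step_seconds=300):
    """Return how many days of history `rows` rows cover at a fixed step."""
    return rows * step_seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for label, rows in [("before resize", 26000),
                        ("added by resize", 78840),
                        ("after resize", 26000 + 78840)]:
        print("%-16s %6d rows = %6.1f days" % (label, rows, rows_to_days(rows)))
```

Under that assumption, 26000 rows is about 90 days (the 3 months I could see),
and the resized RRA holds about 364 days.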

everything is ok but I'm kind of confused on the subject.

ReedK

here's my scripts.  sorry but the graphing one is a bit lengthy for email
but.....

###### BASH_WRAPPER FOR RESIZING AND CALLING SED ############################
#! /bin/bash

RRD="`basename \"$1\" | awk -F . '{print $1}'`"
TEMP_RRD="tmp-${RRD}"
OUT_RRD="out-${RRD}"

echo "Working on rrdtool database: ${RRD}.rrd"
echo "RRD = ${RRD}"
echo "TEMP_RRD = ${TEMP_RRD}"
echo "OUT_RRD = ${OUT_RRD}"
# exit  # debugging leftover; if left active, the script stops here before touching the rrd

# move rrd db to tmp file
mv ${RRD}.rrd ${TEMP_RRD}.rrd
# dump out rrd db to xml format
rrdtool dump ${TEMP_RRD}.rrd > ${TEMP_RRD}.xml

# run sed script to add ds and cdp
./band_bytes_add.sed ${TEMP_RRD}.xml > ${OUT_RRD}.xml

# restore the xml to rrd format with bandwidth_bytes
rrdtool restore ${OUT_RRD}.xml ${TEMP_RRD}.rrd

# use rrdtool to add 1 year of rows onto RRA's
rrdtool resize ${TEMP_RRD}.rrd 0 GROW 78840
mv resize.rrd ${TEMP_RRD}.rrd
rrdtool resize ${TEMP_RRD}.rrd 1 GROW 78840
mv resize.rrd ${TEMP_RRD}.rrd

# cleanup temporary files
rm ${OUT_RRD}.xml ${TEMP_RRD}.xml

# move temp file back to original file
mv ${TEMP_RRD}.rrd ${RRD}.rrd

echo "Finished working on rrdtool database: ${RRD}.rrd"

#done

############# SED XML SCRIPT #######################
#! /bin/sed -f

26466,52767{
	/.*/d
}

162i\
\
        <ds>\
                <name> bandwidth_bytes </name>\
                <type> GAUGE </type>\
                <minimal_heartbeat> 600 </minimal_heartbeat>\
                <min> 0.0000000000e+00 </min>\
                <max> NaN </max>\
\
                <!-- PDP Status -->\
                <last_ds> UNKN </last_ds>\
                <value> 0.0000000000e+00 </value>\
                <unknown_sec> 0 </unknown_sec>\
        </ds>

182i\
			<ds><value> NaN </value>  <unknown_datapoints> 0 </unknown_datapoints></ds>

52785i\
			<ds><value> NaN </value>  <unknown_datapoints> 0 </unknown_datapoints></ds>


s/<\/row>/<v> NaN <\/v><\/row>/g

################ PYTHON GRAPHING SCRIPT ###########################

#!/usr/bin/env python
# $Id: $
# library of graphing functions used by rrdtool

# IMPORTS ===================================================

import RRDtool
import cgi
import sys
import os
import time

# hms-provided modules:
import hmsconstants
sys.path.append(os.path.join(hmsconstants.HMS_HOME,'common','lib'))

# sitemon-provided modules:
sys.path.append(os.path.join('../etc'))
sys.path.append(os.path.join('../lib'))
import sitemon_constants

rrd_db_loc = sitemon_constants.rrd_db_loc

# -----------------------------------------------------------

def get_times():

    time_list = []

    # Get the starting time of the current day
    curstrdate = time.strftime("%Y/%m/%d %H:%M:%S")
    time_tuple = time.strptime(curstrdate, "%Y/%m/%d %H:%M:%S")

    for i in time_tuple:
        time_list.append(i)

    if time.daylight == 1:
        time_list[8] = 1
    else:
        time_list[8] = 0

    cur_time = int(time.mktime(time_list))

    return cur_time

# -----------------------------------------------------------
def graph_rrd( rrd=None, 
               view_id=None, 
               view_name=None, 
               view=None, 
               data_src=None,
               start_epoch=None):

    # Get current end_time
    trend_end = get_times()
    range_end = start_epoch + 86400

    if view == 'day':
        start_time = 86400 # seconds per day
    elif view == 'week':
        start_time = 604800 # seconds per week
    elif view == 'month':
        start_time = 2592000 # seconds per month
    elif view == 'year':
        start_time = 31536000 # seconds per year

    loc = rrd_db_loc + '/%s.rrd' % view_id

    if start_epoch == 0:
        start = '-s -%s' % start_time
        end = '-e %d' % trend_end
    else:
        start = '-s %s' % start_epoch
        end = '-e %s' % range_end
    
    title = '-t %s' % view_name
    vert_label = '-v load_time (secs)'
    img_info = """-f '<IMG SRC="img/%s" WIDTH="%lu" HEIGHT="%lu" ALT="Day">"""
    # rrdtool load info
    avg_admin = str('DEF:admin=%s:adm_avg_load:AVERAGE') % loc
    avg_enduser = str('DEF:enduser=%s:end_avg_load:AVERAGE') % loc
    max_admin = str('DEF:max_admin=%s:adm_max_load:MAX') % loc
    max_enduser = str('DEF:max_enduser=%s:end_max_load:MAX') % loc
    adm_threshold = str('CDEF:adm_slow=max_admin,19.99,GT,max_admin,0,IF')
    end_threshold = str('CDEF:end_slow=max_enduser,19.99,GT,max_enduser,0,IF')
    avg_admin_line = str('LINE1:admin#00FF00:admin load')
    avg_enduser_line = str('LINE1:enduser#0000FF:enduser load')
    max_admin_line = str('LINE1:max_admin#8E2323:admin max')
    max_enduser_line = str('LINE1:max_enduser#2F2F4F:enduser max\l')
    adm_over_threshold = str('LINE1:adm_slow#32CD32:slow admin page')
    end_over_threshold = str('LINE1:end_slow#00CCFF:slow enduser page')
    hrule = str('HRULE:20#FF69B4:slow page threshold')
    space_com = str('COMMENT:\s')
    adm_com = str('COMMENT:Admin-page')
    adm_avg_com = str('COMMENT:Average:')
    adm_avg_gprint = str('GPRINT:admin:AVERAGE:%6.3lf%s')
    adm_max_com = str('COMMENT:Max:')
    adm_max_gprint = str('GPRINT:admin:MAX:%6.3lf%s\l')
    end_com = str('COMMENT:Enduser-page')
    end_avg_com = str('COMMENT:Average')
    end_avg_gprint = str('GPRINT:enduser:AVERAGE:%6.3lf%s')
    end_max_com = str('COMMENT:Max:')
    end_max_gprint = str('GPRINT:enduser:MAX:%6.3lf%s\l')
    upper_limit = '-u 20'
    rigid_val = '-r'
    # rrdtool page_turn info
    page_turn_label = '-v page_turns/5 min'
    adm_page_turns = str('DEF:adm=%s:adm_hits:AVERAGE') % loc
    end_page_turns = str('DEF:end=%s:end_hits:AVERAGE') % loc
    adm_page_turns_line = str('LINE1:adm#00FF00:admin page turns')
    end_page_turns_line = str('LINE1:end#0000FF:enduser page turns\l')
    adm_pt_com = str('COMMENT:Admin-page:')
    adm_pt_avg_com = str('COMMENT:Average:')
    adm_pt_avg_gprint = str('GPRINT:adm:AVERAGE:%6.3lf%s')
    adm_pt_max_com = str('COMMENT:Max:')
    adm_pt_max_gprint = str('GPRINT:adm:MAX:%6.3lf%s\l')
    end_pt_com = str('COMMENT:Enduser-page:')
    end_pt_avg_com = str('COMMENT:Average:')
    end_pt_avg_gprint =str('GPRINT:end:AVERAGE:%6.3lf%s')
    end_pt_max_com = str('COMMENT:Max:')
    end_pt_max_gprint = str('GPRINT:end:MAX:%6.3lf%s\l')

    if data_src == 'load':
        rrdtuple = ("-",
                    start,
                    end,
                    vert_label,
                    img_info,
                    avg_admin,
                    avg_enduser,
                    max_admin,
                    max_enduser,
                    adm_threshold,
                    end_threshold,
                    #adm_over_threshold,
                    #end_over_threshold,
                    avg_admin_line,
                    avg_enduser_line,
                    #hrule,
                    space_com,
                    space_com,
                    adm_com,
                    adm_avg_com,
                    adm_avg_gprint,
                    adm_max_com,
                    adm_max_gprint,
                    end_com,
                    end_avg_com,
                    end_avg_gprint,
                    end_max_com,
                    end_max_gprint,
                    upper_limit,
                    rigid_val)
    elif data_src == 'page_turns':
        rrdtuple = ("-",
                    start,
                    end,
                    page_turn_label,
                    img_info,
                    adm_page_turns,
                    end_page_turns,
                    adm_page_turns_line,
                    end_page_turns_line,
                    space_com,
                    adm_pt_com,
                    adm_pt_avg_com,
                    adm_pt_avg_gprint,
                    adm_pt_max_com,
                    adm_pt_max_gprint,
                    end_pt_com,
                    end_pt_avg_com,
                    end_pt_avg_gprint,
                    end_pt_max_com,
                    end_pt_max_gprint,
                    space_com,
                    space_com)
    elif data_src == 'slow':
        rrdtuple = ("-",
                    start,
                    end,
                    vert_label,
                    img_info,
                    max_admin,
                    max_enduser,
                    adm_threshold,
                    end_threshold,
                    adm_over_threshold,
                    end_over_threshold)

    img = rrd.graph(rrdtuple)

# -----------------------------------------------------------

def main():

    # Get QUERY_STRING variable passed from apache and parse it
    # The results will be returned into a dictionary.  The intid,
    # intname, view, and ds were passed from sitemon-control-int.py
    # in an img_src tag and appended onto the URL.

    qs = os.getenv("QUERY_STRING")

    int_vars = cgi.parse_qs(qs)

    view_id = int_vars['view_id'][0]
    view_name = int_vars['view_name'][0]
    view = int_vars['view'][0]
    data_src = int_vars['ds'][0]
    start_epoch = int(int_vars['start'][0])

    rrd = RRDtool.RRDtool()

    print "Expires: Mon, 26 Jul 1997 05:00:00 GMT"
    print "Cache-Control: no-cache, must-revalidate"
    print "Pragma: no-cache"
    print "Content-Type: image/png"
    print ""
    graph_rrd(rrd, view_id, view_name, view, data_src, start_epoch)
    

# MAIN ======================================================

if __name__ == '__main__':
    main()
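
One thing worth noting about graph_rrd() above: if `view` is not one of day,
week, month or year, start_time is never set. Below is a hedged sketch of how
the window selection might be factored out so that case fails loudly. The names
VIEW_SECONDS and time_window are made up for illustration; the behaviour is
otherwise meant to mirror the script above (a relative start when start_epoch
is 0, otherwise a fixed one-day range).

```python
# Hypothetical helper, not part of the script above or of rrdtool.
VIEW_SECONDS = {
    "day": 86400,
    "week": 604800,
    "month": 2592000,   # 30 days
    "year": 31536000,   # 365 days
}


def time_window(view, start_epoch, now):
    """Return the ('-s ...', '-e ...') argument pair for a graph request."""
    if start_epoch == 0:
        # Trend view: go back one day/week/month/year from the current time.
        if view not in VIEW_SECONDS:
            raise ValueError("unknown view: %r" % (view,))
        return "-s -%d" % VIEW_SECONDS[view], "-e %d" % now
    # Range view: one day starting at the requested epoch time.
    return "-s %d" % start_epoch, "-e %d" % (start_epoch + 86400)
```

graph_rrd() could then call something like
`start, end = time_window(view, start_epoch, get_times())`.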


-----Original Message-----
From: Chris Robb [mailto:chrobb at indiana.edu]
Sent: Monday, January 06, 2003 2:39 PM
To: Kempf, Reed
Subject: Re: [rrd-users] Re: adding many rows with rrdtool resize
causing problems.... HELP!


Are you certain that the graphs are collecting from your yearly RRAs? If
there's five minute data over the entire period you're looking at, rrdtool
will still make a graph, but it will process all the 5-minute samples in
that time period. The best way to tell is to look at the time period just
before where your five minute samples begin. For this to work, you have to
have some other data there, from another RRA. So, it'll look something
like this:

            X                    5555555555555555555555555
Time ----------------------------------------------------->
      30        30        30        30        30         30

(NOTE: this isn't drawn to scale)

If you look at the time period above, you should only see the 5 minute
RRAs and not the 30 minute ones, so you'll have a blank graph up until the
point where the 5-minute RRAs start.

In addition, if you were to look at the time period ending on X, you
wouldn't get any data since X does not land on a 30 minute boundary.
RRDtool will choose to look at 5-minute data, won't get any, and will
interpret it as empty.
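
To make the boundary point concrete, here is a small sketch of rounding a
request's end time down to a consolidation boundary before passing it to -e.
The helper names are illustrative only. The 1800-second step matches the
30-minute RRA in the diagram above, and 86400 would be the step for an RRA
whose rows end at 00:00 UTC. Whether rrdtool then actually picks the coarser
RRA still depends on how the RRAs in the file are defined.

```python
import time


def align_down(timestamp, step_seconds):
    """Round an epoch timestamp down to the nearest multiple of step_seconds."""
    return timestamp - (timestamp % step_seconds)


def aligned_window(now, step_seconds, span_seconds):
    """Return (start, end) epoch times with `end` on a step boundary."""
    end = align_down(int(now), step_seconds)
    return end - span_seconds, end


if __name__ == "__main__":
    # One week ending on the most recent 30-minute boundary.
    start, end = aligned_window(time.time(), 1800, 7 * 86400)
    print("-s %d -e %d" % (start, end))
```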

Also, can you share your scripts with the rrd-users list?

-Chris

Chris Robb
Indiana University Global NOC
Abilene/TransPAC Network Engineer
chrobb at iu.edu  Desk: 812-855-8604 Cell: 812-325-8199

On Mon, 6 Jan 2003, Kempf, Reed wrote:

> well, I didn't seem to have the same kind of problems that I was reading
> about on the list.  I used a bash and sed script to dump out the rrd into
> xml format and then used sed to add another ds.  That worked smoothly, then
> I used rrdtool resize command to add 78,000+ rows to the rrd which was
> adding 9 months of data into my rrd.  I left my graphing function alone
> which graphed on a daily, weekly, monthly and yearly basis basically using
> the start -s <epoch_time> and end -e <epoch_time> syntax.
>
> I designed the program to only insert on 5 minute intervals which is how my
> rrd's are set up.  I collect data for 5 minutes and determine the current
> interval epoch time, then do an insert into the rrd with this value so I am
> not inserting values constantly throughout the interval.  This might have
> something to do with it.
>
> If anybody wants to see the specifics of my bash, sed or graphing scripts, I
> would be happy to share them.  I thankfully didn't encounter any problems
> with this method.
>
> BTW - this program parses an apache_web_log file and calculates bytes_sent,
> load_times and error_codes and graphs trend data on these data sources.
>
> Regards,
>
> ReedK
>
> -----Original Message-----
> From: Alex van den Bogaerdt [mailto:alex at ergens.op.het.net]
> Sent: Thursday, December 26, 2002 3:20 PM
> To: rrd-users at list.ee.ethz.ch
> Subject: [rrd-users] Re: adding many rows with rrdtool resize causing
> problems.... HELP!
>
>
>
> On Thu, Dec 26, 2002 at 01:35:51PM -0700, Kempf, Reed wrote:
>
> > The issue is that I am trying to resize about 2000 rrdtool databases because
> > my requirements changed from keeping 3 months of data to keeping a years
> > worth of data and I would really like to do this maintenance once instead of
> > once per week.  The databases are continuously being written to and it would
> > increase the possibilities of data loss if I had to do this every week not
> > to mention the maintenance headache of adding rows to 2000 databases every
> > week.
> >
> > If anyone can enlighten me on why the "yearly" graph will look odd if I add
> > too many rows would be appreciated.
>
> If you add rows to the 5-minute RRA these rows will contain NaN values.
> If you add enough rows to satisfy a request for a complete year, you
> will fetch the 5-minute rows and thus see unknown data.  You expect to
> see known data thus RRDtool doesn't seem to match your expectations.
>
> This can be circumvented:  If you ask for "--end hh:mm" "--start end-365days"
> where "hh:mm" is midnight UTC represented in your local time, RRDtool
> will use the original 1-day RRA and thus will not use the freshly added
> 5-minute rows.
>
> In general: RRDtool will try to match your request as close as possible.
>
> For MRTG-compatible RRDs this means RRDtool won't use the "daily" RRA
> to graph a full year.  It cannot use the "weekly" or "monthly" RRAs
> either so the only choice is to use the "yearly" RRA.
>
> If you alter the amount of rows in the "daily" RRA to, for instance,
> 120000 rows (more than a year) RRDtool has to choose between the
> "daily" RRA and the "yearly" RRA as both contain enough rows to
> satisfy your request.  RRDtool will then look at the start and end
> time of the data to fetch.  The "yearly" RRA contains rows that start
> and end on 00:00 UTC.  The "daily" RRA contains rows that start and
> end on multiples of 5 minutes.
>
> If you request "now" it is highly likely the "daily" RRA will be
> selected.  If you request n*86400 (for example: 1040947200) as the
> end time and "--start end-400days" for the start time, RRDtool will
> select the "yearly" RRA.
>
> --
> Much of what looks like rudeness in hacker circles is not intended to give
> offence. Rather, it's the product of the direct, cut-through-the-bullshit
> communications style that is natural to people who are more concerned about
> solving problems than making others feel warm and fuzzy.
>
> http://www.tuxedo.org/~esr/faqs/smart-questions.html
>
> --
> Unsubscribe mailto:rrd-users-request at list.ee.ethz.ch?subject=unsubscribe
> Help        mailto:rrd-users-request at list.ee.ethz.ch?subject=help
> Archive     http://www.ee.ethz.ch/~slist/rrd-users
> WebAdmin    http://www.ee.ethz.ch/~slist/lsg2.cgi
>
>

