[rrd-users] Number of requests and size of responses for last ten minutes?

Jack Bates 7tx7id at nottheoilrig.com
Sat Jun 16 09:38:51 CEST 2012


I have a Python script that's notified every time our caching proxy 
(Apache Traffic Server) handles a request and this notification includes 
the size of the response. Every time it's notified, I want this script 
to print the total number of requests and total size of responses 
handled in the last ten minutes, and I think that RRDtool can help with 
this?

I want the amount of memory/storage needed to measure the number of 
requests and size of responses in the last ten minutes to be constant, 
not dependent on how busy the proxy is. My rough idea is that the script 
stores the total number of requests and total size of responses for each 
of a constant number of periods, say ten one-minute periods. The script 
also stores the same totals for the whole ten-minute window. Every 
minute it overwrites the oldest one-minute period's totals, and when it 
does, it subtracts that period from the ten-minute totals.
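
To make the idea concrete, here is a minimal sketch of that bucket scheme in plain Python, without RRDtool. The class name RollingTotals, its parameters, and the choice to pass the timestamp in explicitly are all assumptions for illustration, not anything the proxy or RRDtool provides:

```python
class RollingTotals:
    """Fixed-size ring of per-period totals plus running window totals."""

    def __init__(self, periods=10, period_seconds=60):
        self.periods = periods
        self.period_seconds = period_seconds
        self.counts = [0] * periods   # requests per period
        self.sizes = [0] * periods    # response bytes per period
        self.total_count = 0
        self.total_bytes = 0
        self.current = None           # absolute index of the newest period

    def _advance(self, now):
        index = int(now // self.period_seconds)
        if self.current is None:
            self.current = index
            return
        # Expire each period that has dropped out of the window,
        # subtracting it from the running totals before reusing its slot.
        steps = min(index - self.current, self.periods)
        for i in range(1, steps + 1):
            slot = (self.current + i) % self.periods
            self.total_count -= self.counts[slot]
            self.total_bytes -= self.sizes[slot]
            self.counts[slot] = 0
            self.sizes[slot] = 0
        if index > self.current:
            self.current = index

    def record(self, now, size):
        """Account for one response of `size` bytes seen at time `now`."""
        self._advance(now)
        slot = self.current % self.periods
        self.counts[slot] += 1
        self.sizes[slot] += size
        self.total_count += 1
        self.total_bytes += size
        return self.total_count, self.total_bytes
```

Memory use depends only on the number of periods. Note that the window in this sketch is the current, partly elapsed minute plus the nine before it, so it covers between nine and ten minutes rather than exactly ten.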

I think RRDtool can be configured to do basically what I describe, with 
a COUNTER data source and AVERAGE consolidation function?

I create a new database with the Python bindings. It should measure a 
value every one minute (-s 60). This is a "primary data point"? It 
should store one "consolidated data point" for each primary data point, 
and it should store a total of ten data points (RRA:AVERAGE:0.5:1:10):

 >  rrdtool.create('traffic.rrd', '-s 60',
 >    'DS:count:COUNTER:600:0:U',
 >    'DS:bytes:COUNTER:600:0:U',
 >    'RRA:AVERAGE:0.5:1:10')
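
As a sanity check on those numbers: the time span an archive covers is the step, times the number of primary data points per row, times the number of rows. A small sketch of that arithmetic follows; rra_coverage_seconds is a hypothetical helper for illustration, not an RRDtool call:

```python
def rra_coverage_seconds(step, pdp_per_row, rows):
    """Seconds of history held by one RRA with the given settings."""
    return step * pdp_per_row * rows


# -s 60 with RRA:AVERAGE:0.5:1:10 -> ten rows of one 60-second step each
print(rra_coverage_seconds(60, 1, 10))
```

With the settings above this comes to 600 seconds, which is the ten-minute window I am after.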

Now my Python script must maintain simple counters for the total number 
of requests and total size of responses, but RRDtool will handle the 
case that these counters overflow and wrap around, or that the script 
restarts and these counters are reset. Every time my script is notified 
by our caching proxy, I update RRDtool like so:

 >  rrdtool.update('traffic.rrd', 'N:{}:{}'.format(totalCount, totalBytes))

Now I gather that RRDtool doesn't store absolute amounts; it stores 
rates such as requests per second or bytes per second. So I should be 
able to get the average requests per second and bytes per second for the 
last ten minutes, and multiply each by 600 seconds to get totals?

 >  print rrdtool.fetch('traffic.rrd', 'AVERAGE', '-r 600')

I expect this to print two values: Average requests per second and bytes 
per second. At most it should print twenty values, since this database 
stores only ten data points for each "count" and "bytes" data source. 
Instead this prints many more values:

 >  ((1339743540, 1339830000, 60), ('count', 'bytes'), [(None, None), 
(None, None), (None, None), ...
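
If the rows did hold per-second rates, turning them back into totals should just be a matter of multiplying each rate by the step and summing. Here is a sketch of that, assuming the ((start, end, step), names, rows) layout shown above and skipping unknown (None) entries; totals_from_fetch is a hypothetical helper, not part of the rrdtool bindings:

```python
def totals_from_fetch(result):
    """Sum rate * step per data source, ignoring unknown (None) values."""
    (start, end, step), names, rows = result
    totals = dict.fromkeys(names, 0.0)
    for row in rows:
        for name, rate in zip(names, row):
            if rate is not None:
                totals[name] += rate * step
    return totals


# Made-up data in the same shape as the fetch output, for illustration only.
sample = ((0, 180, 60), ('count', 'bytes'),
          [(None, None), (2.0, 1000.0), (0.5, 250.0)])
print(totals_from_fetch(sample))
```

The sample values here are invented to show the shape of the calculation; they are not output from my database.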

So, can anyone please help me spot the problem in my understanding of 
how RRDtool works? Can RRDtool help print the total number of requests 
and total size of responses handled in the last ten minutes? Why does 
RRDtool as I used it not print the average requests per second and bytes 
per second for the last ten minutes?


