[rrd-developers] Re: Update: PATCH: rrdtool with SQL DS using libdbi

Wed Aug 23 21:16:17 MEST 2006

Hi!
Updating 65K datasources takes as long as doing 65K inserts into the 
database - we have never measured it!
Most of the time is spent retreiving the data anyway. Also it all 
depends on the way the inserts are done - mysql allows for bulk inserts, 
that add 65K rows all at once.

Graphing depends on the time frame you are using, if you are reducing 
data or keeping all data in highest possible resolution,...
We have currently got the data for the last 6 month in original 
resolution resulting in

Here the timing for one graph as requested - for the last day inkluding 
SQL debugging+timing:
export RRDDEBUGSQL=1
time \
rrdtool graph \
t.png \
--imgformat=PNG \
--rigid \
--base=1000 \
--start=-1day \
--end=now \
--width=500 \
--height=200 \
--alt-autoscale-max \
--vertical-label=% \
DEF:sys=sql//mysql/host=servername/dbname=bb/username=bb/password=SECRET//RRDValue/UnixTimeStamp/value/RRDKeyID=1149319922203089791/:value:AVERAGE 
\
AREA:sys#FF0000:System \
DEF:user=sql//mysql/host=servername/dbname=bb/username=bb/password=SECRET//RRDValue/UnixTimeStamp/value/RRDKeyID=1149319922185114791/:value:AVERAGE 
\
STACK:user#FFFF00:User \
DEF:idle=sql//mysql/host=servername/dbname=bb/username=bb/password=SECRET//RRDValue/UnixTimeStamp/value/RRDKeyID=1149319922221064791/:value:AVERAGE 
\
STACK:idle#00FF00:Idle
[root at monitorb01 ~]# sh  test.sh
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/300,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1156271400 < UnixTimeStamp AND UnixTimeStamp < 1156357800  AND 
RRDKeyID=1149319922203089791
  - TIMING: 10
SQL: SELECT 300*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/300,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1156271400 < UnixTimeStamp AND UnixTimeStamp < 1156357800  AND 
RRDKeyID=1149319922185114791
  - TIMING: 0
SQL: SELECT 300*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/300,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1156271400 < UnixTimeStamp AND UnixTimeStamp < 1156357800  AND 
RRDKeyID=1149319922221064791
  - TIMING: 0
SQL: SELECT 300*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
597x268

real    0m10.321s
user    0m0.109s
sys     0m0.023s

Still time is totally dependent on how big your timespan is - this 
especially effects the first time a certain datasource is used, as then 
the DB has to fetch the data from the disk, which can take initially 
some time. But if you redraw the same graph again 2 seconds later - 
everything is still cached (SQL Debugging has been cut)!
# sh  test.sh
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/300,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1156271400 < UnixTimeStamp AND UnixTimeStamp < 1156357800  AND 
RRDKeyID=1149319922203089791
  - TIMING: 0
...
597x268

real    0m0.321s
user    0m0.105s
sys     0m0.027s

Here the same example with 6 month graphed:
# sh  test.sh
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/31276,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140698272 < UnixTimeStamp AND UnixTimeStamp < 1156336272  AND 
RRDKeyID=1149319922203089791
  - TIMING: 478
SQL: SELECT 31276*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/31276,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140698272 < UnixTimeStamp AND UnixTimeStamp < 1156336272  AND 
RRDKeyID=1149319922185114791
  - TIMING: 8
SQL: SELECT 31276*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/31276,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140698272 < UnixTimeStamp AND UnixTimeStamp < 1156336272  AND 
RRDKeyID=1149319922221064791
  - TIMING: 8
SQL: SELECT 31276*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
597x268

real    8m14.556s
user    0m0.070s
sys     0m0.020s

and a little later the same again:
# sh  test.sh
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/31276,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140698272 < UnixTimeStamp AND UnixTimeStamp < 1156336272  AND 
RRDKeyID=1149319922203089791
  - TIMING: 22
...
597x268

real    0m22.369s
user    0m0.075s
sys     0m0.017s
[root at monitorb01 ~]#

So the first is a total Cache miss, the second is much better with 1.2 
second and still using full 5 minute resolution. You can blow it up to 
50000 pixels wide and the result is:
# sh  test.sh
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/312,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140721296 < UnixTimeStamp AND UnixTimeStamp < 1156359984  AND 
RRDKeyID=1149319922203089791
  - TIMING: 0
SQL: SELECT 312*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/312,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140721296 < UnixTimeStamp AND UnixTimeStamp < 1156359984  AND 
RRDKeyID=1149319922185114791
  - TIMING: 0
SQL: SELECT 312*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
SQL: CREATE TEMPORARY TABLE tmpVals (grouped INTEGER NOT NULL,time 
INTEGER NOT NULL, value DOUBLE NOT NULL, KEY (grouped), KEY (time))
  - TIMING: 0
SQL: INSERT INTO tmpVals SELECT TRUNCATE(UnixTimeStamp/312,0) as 
grouped,UnixTimeStamp as time,value as value FROM RRDValue WHERE 
1140721296 < UnixTimeStamp AND UnixTimeStamp < 1156359984  AND 
RRDKeyID=1149319922221064791
  - TIMING: 0
SQL: SELECT 312*grouped,AVG(value) FROM tmpVals GROUP BY grouped ORDER 
BY grouped
  - TIMING: 0
SQL: DROP TEMPORARY TABLE tmpVals
  - TIMING: 0
50097x268

real    0m17.155s
user    0m15.914s
sys     0m0.445s
So most time is spent postprocessing and rendering...but having 6 month 
at 312 seconds per pixel (almost our sample-rate) is also nice... ;)

It is also totally dependent on your mysql DB server and how it is 
configured memory and index wise... The one I was using for this example 
has 2GB of memory and 500M allocated to mysql. The table in question has 
the following stats:
mysql> show table status LIKE 'RRDValue' \G
*************************** 1. row ***************************
           Name: RRDValue
         Engine: MyISAM
        Version: 10
     Row_format: Fixed
           Rows: 1189230677
 Avg_row_length: 21
    Data_length: 24973844217
Max_data_length: 5910974510923775
   Index_length: 26702842880
      Data_free: 0
 Auto_increment: NULL
    Create_time: 2006-06-05 14:14:30
    Update_time: 2006-08-23 16:56:11
     Check_time: 2006-06-05 15:49:22
      Collation: latin1_swedish_ci
       Checksum: NULL
 Create_options:
        Comment:
1 row in set (0.00 sec)

So more memory will definitely improve the situation - best would be to 
add 64GB, but that is not feasable ;(
The other approach, which has not yet been implemented at our side - but 
is already prepared for in the current rrdtool patch - is to reduce the 
data, putting the unreduced and reduced data in separate tables and then 
depending on the required resolution using the new joined table syntax 
(table1+table2+table3+...) with either the reduced one or the noreduced 
table...

This will reduce the index that is required for reading dramatically - 
my estimate is keeping 2 weeks with 5 minute resolution reduces index 
size from 26G to aprox 6G and the reduction to 4 hour averages for 
another 12 month would result again in a table of similar sizes. All 
this is much more memory friendly than the current setup with 1G rows 
and should improve performance...

But as it is usually the case - there always corners that need to be cut...
For simple solutions with a handful of data sources the file approach is 
the most efficient. But at some point the IO penalty there is to much... 
We actually started by storing data in rrd files AND in mysql in 
parallel to get the data available for other kinds of queries and then 
were looking into such a mysql-only solution, as more and more time is 
spent writing data and graphing happens very infrequently in comparison...

On the other hand, when you start having all that many data source you 
will definitely have a totally separate application for just gathering 
the data, that does all sorts of things automatically - like adding new 
data sources,... This naturally also includes a customized presentation 
layer with possibly ACLs, etc... So it is no longer "tiny" in every respect.

Our project is for a rewritten BigBrother Monitoring server that uses a 
mysql DB as its storage and also extracts lots of data from the page 
sent by the agents... And we have got several agents that provide tons 
of data for graphing... And from there came the need to have a graphing 
solution for the database... Currently we have got everything in the DB 
from webstats to fiber channel statistics...

Hope that this gives you some ideas of the performance issues that exist 
in such a scenario...

Ciao,
          Martin

P.s: You can also gain on index size by using smaller integers for 
RRDKeyID - not using bigint but mediumint should reduce the index size 
by a factor of 7/12. But as our application is planned with mysql 
multi-master replication and INSERTS on multiple masters in mind, we 
need sadly "collission free" indexes... So there is also a way for 
improvement if your application allows it...

larryjadams at comcast.net wrote:
> Martin,
>
> What is your update time for the 65k DS'?  What is the average Graph Time for a Graph with say 3 Def's?
>
> Larry
>
>  -------------- Original message ----------------------
> From: Martin Sperl <rrdtool at martin.sperl.org>
>   
>> Content-Type: text/plain; charset=ISO-8859-1; format=flowed
>> Content-Transfer-Encoding: 7bit
>> Hi!
>>
>> Here an update to the last patch, that fixes the following problems:
>>
>> * multiple tables can be used as a joined datasource for a graph (most 
>> usefull to have a separate small dynamic table and a huge static table).
>> * /derive option now gives accurate results when graphing longer 
>> timespans for MIN/MAX.
>> * workaround for mysql bug 18165 - now using <start> < timestamp and 
>> timestamp < <end> instead of BETWEEN
>> * SQL statement changes for better performance
>> * /prediction and /sigma are now synonyms and return now value, 
>> prediction and sigma in one go, which reduces strain on the database 
>> (simmilar selects are now done once and not 3 times) and is more efficient
>> * allows now for %XX escaping some of the arguments.
>>
>> This has been tested on a system with tables of more that 100M rows and 
>> 65k different datasources...
>>
>> Cheers,
>>              Martin
>>
>> P.s: do not forget to execute "autoconf", "automake" and "configure" 
>> after applying the patch to 1.2.15!
>>
>> Martin Sperl wrote:
>>     
>>> Hi!
>>>
>>> I have created a patch to rrdtool 1.2 (SVN) which allows to define a 
>>> DS from a SQL database for graphing using the libdbi library for DB 
>>> abstraction.
>>> Tobi told me to post the patch to this list, as he thinks that several 
>>> people might be interrested in it immediately.
>>> He is also thinking about adding it to rrdtool 1.3 when this patch has 
>>> been given some more testing...
>>>
>>> List of features:
>>> * db-abstration via libdbi with DB support for the following Databases 
>>> out of the box:
>>>   * mysql
>>>   * postgress
>>> * calculation of the "derivative" of a database values (so that we get 
>>> behaviour simmilar to COUNTER/DERIVE type datasources)
>>> * Prediction logic, which gives from DB data values for:
>>>   * prediction
>>>   * sigma
>>> * SQL-DS can also be used for further transformations with CDEFs
>>> ...
>>>       
>>
>> -- Attached file removed by Ecartis and put at URL below --
>> -- Type: text/plain
>> -- Size: 58k (60023 bytes)
>> -- URL : http://lists.ee.ethz.ch/p/02-rrdtool+libdbi.patch
>>
>>
>> --
>> Unsubscribe mailto:rrd-developers-request at list.ee.ethz.ch?subject=unsubscribe
>> Help        mailto:rrd-developers-request at list.ee.ethz.ch?subject=help
>> Archive     http://lists.ee.ethz.ch/rrd-developers
>> WebAdmin    http://lists.ee.ethz.ch/lsg2.cgi
>>
>>     
>
>
>   

--
Unsubscribe mailto:rrd-developers-request at list.ee.ethz.ch?subject=unsubscribe
Help        mailto:rrd-developers-request at list.ee.ethz.ch?subject=help
Archive     http://lists.ee.ethz.ch/rrd-developers
WebAdmin    http://lists.ee.ethz.ch/lsg2.cgi