[rrd-users] RRDTool + libdbi + mysql + speed
Adam Jacob Muller
rrdtool-users at adam.gs
Mon Jul 4 22:18:40 CEST 2011
Sure,
I was just being lazy not wanting to redact them, also please be aware I'm a HORRIBLE c programmer.
# cat dbi.c
#include <stdio.h>
#include <dbi/dbi.h>
#include <sys/time.h>
int main() {
dbi_conn conn;
dbi_result result;
struct timeval query_start;
struct timeval query_end;
struct timeval query_elapsed;
float real_query_elapsed;
struct timeval fetch_start;
struct timeval fetch_end;
struct timeval fetch_elapsed;
float real_fetch_elapsed=0;
int rowcount;
dbi_initialize(NULL);
conn = dbi_conn_new("mysql");
dbi_conn_set_option(conn, "host", "something");
dbi_conn_set_option(conn, "username", "something");
dbi_conn_set_option(conn, "password", "something");
dbi_conn_set_option(conn, "dbname", "something");
dbi_conn_set_option(conn, "encoding", "UTF-8");
if (dbi_conn_connect(conn) < 0) {
printf("Could not connect. Please check the option settings\n");
return;
}
gettimeofday(&query_start,NULL);
result = dbi_conn_queryf(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
gettimeofday(&query_end,NULL);
timersub(&query_end,&query_start,&query_elapsed);
real_query_elapsed=(float)query_elapsed.tv_sec;
real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
printf("dbi_conn_queryf took %.4f\n",real_query_elapsed);
gettimeofday(&fetch_start,NULL);
rowcount=0;
if (result) {
while (dbi_result_next_row(result)) {
rowcount++;
}
}
gettimeofday(&fetch_end,NULL);
timersub(&fetch_end,&fetch_start,&fetch_elapsed);
real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
printf("dbi_result_next_row took %.4f\n",real_fetch_elapsed);
printf("looped through %d rows\n",rowcount);
sleep(30);
return;
}
# cat mysql.c
#include <my_global.h>
#include <mysql.h>
int main(int argc, char **argv)
{
MYSQL *conn;
MYSQL_RES *result;
MYSQL_ROW row;
int num_fields;
int i;
struct timeval query_start;
struct timeval query_end;
struct timeval query_elapsed;
float real_query_elapsed;
struct timeval fetch_start;
struct timeval fetch_end;
struct timeval fetch_elapsed;
float real_fetch_elapsed=0;
printf("MySQL client version: %s\n", mysql_get_client_info());
conn = mysql_init(NULL);
if (conn == NULL) {
printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
exit(1);
}
if (mysql_real_connect(conn, "this", "that", "theotherthing", "redacted", 0, NULL, 0) == NULL) {
printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
exit(1);
}
gettimeofday(&query_start,NULL);
mysql_query(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
result=mysql_use_result(conn);
gettimeofday(&query_end,NULL);
timersub(&query_end,&query_start,&query_elapsed);
real_query_elapsed=(float)query_elapsed.tv_sec;
real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
printf("mysql_query took %.4f\n",real_query_elapsed);
gettimeofday(&fetch_start,NULL);
i=0;
while ((row = mysql_fetch_row(result))) {
i++;
}
gettimeofday(&fetch_end,NULL);
timersub(&fetch_end,&fetch_start,&fetch_elapsed);
real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
printf("mysql_fetch_row took %.4f\n",real_fetch_elapsed);
printf("looped through %d rows\n",i);
}
# ./dbi
dbi_conn_queryf took 0.5866
dbi_result_next_row took 15.4186
looped through 71891 rows
# ./mysql
MySQL client version: 5.1.56
mysql_query took 0.0021
mysql_fetch_row took 0.5584
looped through 71891 rows
Both queries are executed with warm caches.
I'm curious if you were also using MySQL or something else like postgres?
-Adam
On Jul 4, 2011, at 4:00 PM, Martin Sperl wrote:
> Can you please share the "tests" you are doing?
> I can not remember any such issues when I was using/developping libdbi intensively...
>
> Thanks,
> Martin
>
> On 04.07.2011, at 16:07, Adam Jacob Muller wrote:
>
>> Hello,
>>
>> The issue persists even on completely memory-based tables (run only as a
>> synthetic benchmark) its definitely not i/o-bound.
>>
>> strace does not show libdbi/rrdtool doing -anything- during this period
>> at all.
>>
>> I'd also mention that my hacked-up version of rrdtool that uses the
>> native MySQL functions completely eliminates this issue.
>>
>> -Adam
>>
>> On Mon, 2011-07-04 at 14:37 +0200, Martin Sperl wrote:
>>> It may be primarily be related to the DB - the first time it is run int
>>> needs to read the blocks from disk
>>> on the second run - e.g with your own script it may not need to do that
>>> any longer...
>>>
>>> I know that this is the biggest deficiency in the libdbi approach: the
>>> DB is usually not optimized and needs to read lots of blocks (in your
>>> example worsted case 100k blocks).
>>>
>>> If it is really being CPU-bound on the client, then I am at a loss...
>>> Does strace show anything that may be of interrest?
>>>
>>> Martin
>>>
>>> On 04.07.2011 05:57, Adam Jacob Muller wrote:
>>>> Hi Brandon,
>>>> I've tested variants of the mysql version with both _use_ and _store_ with no appreciable difference.
>>>>
>>>> the dbi version appears heavily CPU-bound, with minimal memory usage (my trivial test cases don't actually -keep- the data anywhere).
>>>>
>>>> USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
>>>> root 27941 90.4 0.1 42556 17784 pts/2 S+ 23:55 0:16 ./dbi
>>>>
>>>>
>>>> -Adam
>>>>
>>>> On Jul 3, 2011, at 10:47 PM, Brandon Phelps wrote:
>>>>
>>>>> Just a wild guess here but have you checked memory usage during the
>>>>> job? Maybe it has something to do with libdbi using mysql_store_result
>>>>> for such a large number of rows? Try switching to mysql_use_result and
>>>>> see if the problem persists?
>>>>>
>>>>> Just a thought, could be way off.
>>>>>
>>>>> On 7/3/2011 10:14 PM, Adam Jacob Muller wrote:
>>>>>> Hi,
>>>>>> I'm curious if anyone knows of any specific issues with regards to libdbi and MySQL with RRDTool. I have a specific dataset where I pull a significant number of rows from MySQL to draw graphs (think, in the 100,000's of rows). And its extremely slow.
>>>>>>
>>>>>> I've specifically isolated this not to query execution time but to libdbi.
>>>>>>
>>>>>> to completely isolate the issue, this is a small program I wrote that uses libdbi to read a months worth of samples (~2*60*24*30 = 86400 rows):
>>>>>>
>>>>>> # ./dbi
>>>>>> dbi_conn_queryf took 0.5319
>>>>>> dbi_result_next_row took 15.0172
>>>>>> looped through 71891 rows
>>>>>> #
>>>>>>
>>>>>>
>>>>>> This is the same thing, but using the native mysql c-bindings:
>>>>>> # ./mysql
>>>>>> mysql_query took 0.0021
>>>>>> mysql_fetch_row took 0.5352
>>>>>> looped through 71891 rows
>>>>>> #
>>>>>>
>>>>>>
>>>>>>
>>>>>> This probably ultimately seems like a libdbi issue, but I thought i'd bring it up here because it seems to have an extraordinary impact upon rrdtool performance and i'm curious if anyone here has seen it.
>>>>>>
>>>>>>
>>>>>> -Adam
>>>>>> _______________________________________________
>>>>>> rrd-users mailing list
>>>>>> rrd-users at lists.oetiker.ch
>>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>>> _______________________________________________
>>>>> rrd-users mailing list
>>>>> rrd-users at lists.oetiker.ch
>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>> _______________________________________________
>>>> rrd-users mailing list
>>>> rrd-users at lists.oetiker.ch
>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>
>>> _______________________________________________
>>> rrd-users mailing list
>>> rrd-users at lists.oetiker.ch
>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>
>
More information about the rrd-users
mailing list