[rrd-users] RRDTool + libdbi + mysql + speed

Martin Sperl rrdtool at martin.sperl.org
Wed Jul 6 08:25:32 CEST 2011


I will have a look as soon as I find time...

But this is probably more of a  material for the libdbi mailinglist than for the rrd list...

Martin

On 04.07.2011, at 22:18, Adam Jacob Muller wrote:

> Sure,
> I was just being lazy not wanting to redact them, also please be aware I'm a HORRIBLE c programmer.
> 
> # cat dbi.c 
> #include <stdio.h>
> #include <dbi/dbi.h>
> #include <sys/time.h>
> 
> int main() {
>    dbi_conn conn;
>    dbi_result result;
> 
>    struct timeval query_start;
>    struct timeval query_end;
>    struct timeval query_elapsed;
>    float real_query_elapsed;
> 
>    struct timeval fetch_start;
>    struct timeval fetch_end;
>    struct timeval fetch_elapsed;
>    float real_fetch_elapsed=0;
>    int rowcount;
> 
>    dbi_initialize(NULL);
>    conn = dbi_conn_new("mysql");
>    dbi_conn_set_option(conn, "host", "something");
>    dbi_conn_set_option(conn, "username", "something");
>    dbi_conn_set_option(conn, "password", "something");
>    dbi_conn_set_option(conn, "dbname", "something");
>    dbi_conn_set_option(conn, "encoding", "UTF-8");
>    if (dbi_conn_connect(conn) < 0) {
>        printf("Could not connect. Please check the option settings\n");
>        return;
>    }
>    gettimeofday(&query_start,NULL);
>    result = dbi_conn_queryf(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
>    gettimeofday(&query_end,NULL);
>    timersub(&query_end,&query_start,&query_elapsed);
>    real_query_elapsed=(float)query_elapsed.tv_sec;
>    real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
>    printf("dbi_conn_queryf took %.4f\n",real_query_elapsed);
> 
> 
>    gettimeofday(&fetch_start,NULL);
>    rowcount=0;
>    if (result) {
> 	while (dbi_result_next_row(result)) {
> 		rowcount++;
> 	}
>    }
>    gettimeofday(&fetch_end,NULL);
>    timersub(&fetch_end,&fetch_start,&fetch_elapsed);
>    real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
>    real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
>    printf("dbi_result_next_row took %.4f\n",real_fetch_elapsed);
>    printf("looped through %d rows\n",rowcount);
>    sleep(30);
>    return;
> }
> 
> # cat mysql.c 
> #include <my_global.h>
> #include <mysql.h>
> 
> int main(int argc, char **argv)
> {
>  MYSQL *conn;
>  MYSQL_RES *result;
>  MYSQL_ROW row;
>  int num_fields;
>  int i;
>    struct timeval query_start;
>    struct timeval query_end;
>    struct timeval query_elapsed;
>    float real_query_elapsed;
> 
>    struct timeval fetch_start;
>    struct timeval fetch_end;
>    struct timeval fetch_elapsed;
>    float real_fetch_elapsed=0;
> 
>  printf("MySQL client version: %s\n", mysql_get_client_info());
>  conn = mysql_init(NULL);
>  if (conn == NULL) {
>  	printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
> 	exit(1);
>  }
> 
>  if (mysql_real_connect(conn, "this", "that", "theotherthing", "redacted", 0, NULL, 0) == NULL) {
> 	printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
> 	exit(1);
>  }
> 
>  gettimeofday(&query_start,NULL);
>  mysql_query(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
>  result=mysql_use_result(conn);
>  gettimeofday(&query_end,NULL);
>  timersub(&query_end,&query_start,&query_elapsed);
>  real_query_elapsed=(float)query_elapsed.tv_sec;
>  real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
>  printf("mysql_query took %.4f\n",real_query_elapsed);
> 
>  gettimeofday(&fetch_start,NULL);
>  i=0;
>  while ((row = mysql_fetch_row(result))) {
> 	i++;
>  }
>  gettimeofday(&fetch_end,NULL);
>  timersub(&fetch_end,&fetch_start,&fetch_elapsed);
>  real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
>  real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
>  printf("mysql_fetch_row took %.4f\n",real_fetch_elapsed);
> 
>  printf("looped through %d rows\n",i);
> }
> 
> 
> 
> # ./dbi
> dbi_conn_queryf took 0.5866
> dbi_result_next_row took 15.4186
> looped through 71891 rows
> 
> # ./mysql 
> MySQL client version: 5.1.56
> mysql_query took 0.0021
> mysql_fetch_row took 0.5584
> looped through 71891 rows
> 
> 
> Both queries are executed with warm caches.
> 
> 
> I'm curious if you were also using MySQL or something else like postgres?
> 
> -Adam
> 
> 
> 
> On Jul 4, 2011, at 4:00 PM, Martin Sperl wrote:
> 
>> Can you please share the "tests" you are doing?
>> I can not remember any such issues when I was using/developping libdbi intensively...
>> 
>> Thanks,
>> 		Martin
>> 
>> On 04.07.2011, at 16:07, Adam Jacob Muller wrote:
>> 
>>> Hello,
>>> 
>>> The issue persists even on completely memory-based tables (run only as a
>>> synthetic benchmark) its definitely not i/o-bound.
>>> 
>>> strace does not show libdbi/rrdtool doing -anything- during this period
>>> at all.
>>> 
>>> I'd also mention that my hacked-up version of rrdtool that uses the
>>> native MySQL functions completely eliminates this issue.
>>> 
>>> -Adam
>>> 
>>> On Mon, 2011-07-04 at 14:37 +0200, Martin Sperl wrote:
>>>> It may be primarily be related to the DB - the first time it is run int 
>>>> needs to read the blocks from disk
>>>> on the second run - e.g with your own script it may not need to do that 
>>>> any longer...
>>>> 
>>>> I know that this is the biggest deficiency in the libdbi approach: the 
>>>> DB is usually not optimized and needs to read lots of blocks (in your 
>>>> example worsted case 100k blocks).
>>>> 
>>>> If it is really being CPU-bound on the client, then I am at a loss...
>>>> Does strace show anything that may be of interrest?
>>>> 
>>>> Martin
>>>> 
>>>> On 04.07.2011 05:57, Adam Jacob Muller wrote:
>>>>> Hi Brandon,
>>>>> I've tested variants of the mysql version with both _use_ and _store_ with no appreciable difference.
>>>>> 
>>>>> the dbi version appears heavily CPU-bound, with minimal memory usage (my trivial test cases don't actually -keep- the data anywhere).
>>>>> 
>>>>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>>>>> root     27941 90.4  0.1  42556 17784 pts/2    S+   23:55   0:16 ./dbi
>>>>> 
>>>>> 
>>>>> -Adam
>>>>> 
>>>>> On Jul 3, 2011, at 10:47 PM, Brandon Phelps wrote:
>>>>> 
>>>>>> Just a wild guess here but have you checked memory usage during the
>>>>>> job?  Maybe it has something to do with libdbi using mysql_store_result
>>>>>> for such a large number of rows?  Try switching to mysql_use_result and
>>>>>> see if the problem persists?
>>>>>> 
>>>>>> Just a thought, could be way off.
>>>>>> 
>>>>>> On 7/3/2011 10:14 PM, Adam Jacob Muller wrote:
>>>>>>> Hi,
>>>>>>> I'm curious if anyone knows of any specific issues with regards to libdbi and MySQL with RRDTool. I have a specific dataset where I pull a significant number of rows from MySQL to draw graphs (think, in the 100,000's of rows). And its extremely slow.
>>>>>>> 
>>>>>>> I've specifically isolated this not to query execution time but to libdbi.
>>>>>>> 
>>>>>>> to completely isolate the issue, this is a small program I wrote that uses libdbi to read a months worth of samples (~2*60*24*30 = 86400 rows):
>>>>>>> 
>>>>>>> # ./dbi
>>>>>>> dbi_conn_queryf took 0.5319
>>>>>>> dbi_result_next_row took 15.0172
>>>>>>> looped through 71891 rows
>>>>>>> #
>>>>>>> 
>>>>>>> 
>>>>>>> This is the same thing, but using the native mysql c-bindings:
>>>>>>> # ./mysql
>>>>>>> mysql_query took 0.0021
>>>>>>> mysql_fetch_row took 0.5352
>>>>>>> looped through 71891 rows
>>>>>>> #
>>>>>>> 
>>>>>>> 
>>>>>>> 
>>>>>>> This probably ultimately seems like a libdbi issue, but I thought i'd bring it up here because it seems to have an extraordinary impact upon rrdtool performance and i'm curious if anyone here has seen it.
>>>>>>> 
>>>>>>> 
>>>>>>> -Adam
>>>>>>> _______________________________________________
>>>>>>> rrd-users mailing list
>>>>>>> rrd-users at lists.oetiker.ch
>>>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>>>> _______________________________________________
>>>>>> rrd-users mailing list
>>>>>> rrd-users at lists.oetiker.ch
>>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>>> _______________________________________________
>>>>> rrd-users mailing list
>>>>> rrd-users at lists.oetiker.ch
>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>> 
>>>> _______________________________________________
>>>> rrd-users mailing list
>>>> rrd-users at lists.oetiker.ch
>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>> 
>> 
> 



More information about the rrd-users mailing list