[rrd-users] RRDTool + libdbi + mysql + speed

Adam Jacob Muller rrdtool-users at adam.gs
Mon Jul 4 22:18:40 CEST 2011


Sure,
I was just being lazy not wanting to redact them, also please be aware I'm a HORRIBLE c programmer.

# cat dbi.c 
#include <stdio.h>
#include <dbi/dbi.h>
#include <sys/time.h>

int main() {
    dbi_conn conn;
    dbi_result result;

    struct timeval query_start;
    struct timeval query_end;
    struct timeval query_elapsed;
    float real_query_elapsed;
    
    struct timeval fetch_start;
    struct timeval fetch_end;
    struct timeval fetch_elapsed;
    float real_fetch_elapsed=0;
    int rowcount;

    dbi_initialize(NULL);
    conn = dbi_conn_new("mysql");
    dbi_conn_set_option(conn, "host", "something");
    dbi_conn_set_option(conn, "username", "something");
    dbi_conn_set_option(conn, "password", "something");
    dbi_conn_set_option(conn, "dbname", "something");
    dbi_conn_set_option(conn, "encoding", "UTF-8");
    if (dbi_conn_connect(conn) < 0) {
        printf("Could not connect. Please check the option settings\n");
        return;
    }
    gettimeofday(&query_start,NULL);
    result = dbi_conn_queryf(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
    gettimeofday(&query_end,NULL);
    timersub(&query_end,&query_start,&query_elapsed);
    real_query_elapsed=(float)query_elapsed.tv_sec;
    real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
    printf("dbi_conn_queryf took %.4f\n",real_query_elapsed);
   
    
    gettimeofday(&fetch_start,NULL);
    rowcount=0;
    if (result) {
	while (dbi_result_next_row(result)) {
		rowcount++;
	}
    }
    gettimeofday(&fetch_end,NULL);
    timersub(&fetch_end,&fetch_start,&fetch_elapsed);
    real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
    real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
    printf("dbi_result_next_row took %.4f\n",real_fetch_elapsed);
    printf("looped through %d rows\n",rowcount);
    sleep(30);
    return;
}

# cat mysql.c 
#include <my_global.h>
#include <mysql.h>

int main(int argc, char **argv)
{
  MYSQL *conn;
  MYSQL_RES *result;
  MYSQL_ROW row;
  int num_fields;
  int i;
    struct timeval query_start;
    struct timeval query_end;
    struct timeval query_elapsed;
    float real_query_elapsed;

    struct timeval fetch_start;
    struct timeval fetch_end;
    struct timeval fetch_elapsed;
    float real_fetch_elapsed=0;

  printf("MySQL client version: %s\n", mysql_get_client_info());
  conn = mysql_init(NULL);
  if (conn == NULL) {
  	printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
	exit(1);
  }

  if (mysql_real_connect(conn, "this", "that", "theotherthing", "redacted", 0, NULL, 0) == NULL) {
	printf("Error %u: %s\n", mysql_errno(conn), mysql_error(conn));
	exit(1);
  }

  gettimeofday(&query_start,NULL);
  mysql_query(conn,"SELECT dtime as rrd_time, counter as rrd_value FROM ifInOctets WHERE id=1920 AND '2011-06-01 00:00:00' < dtime AND dtime < '2011-06-26 00:00:00' ORDER BY dtime");
  result=mysql_use_result(conn);
  gettimeofday(&query_end,NULL);
  timersub(&query_end,&query_start,&query_elapsed);
  real_query_elapsed=(float)query_elapsed.tv_sec;
  real_query_elapsed+=(float)((float)query_elapsed.tv_usec/1000000);
  printf("mysql_query took %.4f\n",real_query_elapsed);

  gettimeofday(&fetch_start,NULL);
  i=0;
  while ((row = mysql_fetch_row(result))) {
	i++;
  }
  gettimeofday(&fetch_end,NULL);
  timersub(&fetch_end,&fetch_start,&fetch_elapsed);
  real_fetch_elapsed=(float)fetch_elapsed.tv_sec;
  real_fetch_elapsed+=(float)((float)fetch_elapsed.tv_usec/1000000);
  printf("mysql_fetch_row took %.4f\n",real_fetch_elapsed);

  printf("looped through %d rows\n",i);
}



# ./dbi
dbi_conn_queryf took 0.5866
dbi_result_next_row took 15.4186
looped through 71891 rows

# ./mysql 
MySQL client version: 5.1.56
mysql_query took 0.0021
mysql_fetch_row took 0.5584
looped through 71891 rows


Both queries are executed with warm caches.


I'm curious if you were also using MySQL or something else like postgres?

-Adam



On Jul 4, 2011, at 4:00 PM, Martin Sperl wrote:

> Can you please share the "tests" you are doing?
> I can not remember any such issues when I was using/developping libdbi intensively...
> 
> Thanks,
> 		Martin
> 
> On 04.07.2011, at 16:07, Adam Jacob Muller wrote:
> 
>> Hello,
>> 
>> The issue persists even on completely memory-based tables (run only as a
>> synthetic benchmark) its definitely not i/o-bound.
>> 
>> strace does not show libdbi/rrdtool doing -anything- during this period
>> at all.
>> 
>> I'd also mention that my hacked-up version of rrdtool that uses the
>> native MySQL functions completely eliminates this issue.
>> 
>> -Adam
>> 
>> On Mon, 2011-07-04 at 14:37 +0200, Martin Sperl wrote:
>>> It may be primarily be related to the DB - the first time it is run int 
>>> needs to read the blocks from disk
>>> on the second run - e.g with your own script it may not need to do that 
>>> any longer...
>>> 
>>> I know that this is the biggest deficiency in the libdbi approach: the 
>>> DB is usually not optimized and needs to read lots of blocks (in your 
>>> example worsted case 100k blocks).
>>> 
>>> If it is really being CPU-bound on the client, then I am at a loss...
>>> Does strace show anything that may be of interrest?
>>> 
>>> Martin
>>> 
>>> On 04.07.2011 05:57, Adam Jacob Muller wrote:
>>>> Hi Brandon,
>>>> I've tested variants of the mysql version with both _use_ and _store_ with no appreciable difference.
>>>> 
>>>> the dbi version appears heavily CPU-bound, with minimal memory usage (my trivial test cases don't actually -keep- the data anywhere).
>>>> 
>>>> USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
>>>> root     27941 90.4  0.1  42556 17784 pts/2    S+   23:55   0:16 ./dbi
>>>> 
>>>> 
>>>> -Adam
>>>> 
>>>> On Jul 3, 2011, at 10:47 PM, Brandon Phelps wrote:
>>>> 
>>>>> Just a wild guess here but have you checked memory usage during the
>>>>> job?  Maybe it has something to do with libdbi using mysql_store_result
>>>>> for such a large number of rows?  Try switching to mysql_use_result and
>>>>> see if the problem persists?
>>>>> 
>>>>> Just a thought, could be way off.
>>>>> 
>>>>> On 7/3/2011 10:14 PM, Adam Jacob Muller wrote:
>>>>>> Hi,
>>>>>> I'm curious if anyone knows of any specific issues with regards to libdbi and MySQL with RRDTool. I have a specific dataset where I pull a significant number of rows from MySQL to draw graphs (think, in the 100,000's of rows). And its extremely slow.
>>>>>> 
>>>>>> I've specifically isolated this not to query execution time but to libdbi.
>>>>>> 
>>>>>> to completely isolate the issue, this is a small program I wrote that uses libdbi to read a months worth of samples (~2*60*24*30 = 86400 rows):
>>>>>> 
>>>>>> # ./dbi
>>>>>> dbi_conn_queryf took 0.5319
>>>>>> dbi_result_next_row took 15.0172
>>>>>> looped through 71891 rows
>>>>>> #
>>>>>> 
>>>>>> 
>>>>>> This is the same thing, but using the native mysql c-bindings:
>>>>>> # ./mysql
>>>>>> mysql_query took 0.0021
>>>>>> mysql_fetch_row took 0.5352
>>>>>> looped through 71891 rows
>>>>>> #
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>> This probably ultimately seems like a libdbi issue, but I thought i'd bring it up here because it seems to have an extraordinary impact upon rrdtool performance and i'm curious if anyone here has seen it.
>>>>>> 
>>>>>> 
>>>>>> -Adam
>>>>>> _______________________________________________
>>>>>> rrd-users mailing list
>>>>>> rrd-users at lists.oetiker.ch
>>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>>> _______________________________________________
>>>>> rrd-users mailing list
>>>>> rrd-users at lists.oetiker.ch
>>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>>> _______________________________________________
>>>> rrd-users mailing list
>>>> rrd-users at lists.oetiker.ch
>>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>>> 
>>> _______________________________________________
>>> rrd-users mailing list
>>> rrd-users at lists.oetiker.ch
>>> https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users
>> 
> 



More information about the rrd-users mailing list