[rrd-developers] regarding rrdcached

Tobias Oetiker tobi at oetiker.ch
Mon Aug 15 00:42:01 CEST 2011


comments?

From: rrdtool-fb5e9517 at hashmail.org
To: tobi at oetiker.ch
Date: Sun, 14 Aug 2011 19:49:59 +0200
Subject: regarding rrdcached


good day, sir.


so, i got tired of rrdcached corrupting my rrd files.
and fixing threading race conditions was not on my plan for this sunday.
this leads to cranky.

so, i have to warn you, there is angry ranting in the rest of this mail.
if you don't want to read me being cranky, just look at the patch. :P


two separate issues.
context is using rrdcached under nagiosgrapher.


1. "concurrent flushes". 'nuf said.
nagios grapher issues a flush for each loaded graph, which means short
bursts of 5 flush commands for the same rrd every time a page is loaded.
now combine that with an insanely high update rate on the same rrd.

let's say the first flush finds 10000 outstanding values to be written for the rrd.
the second flush easily arrives before the first one completes, and by then there are already new outstanding values for the same rrd.
sometimes it actually arrives before the first one has even gotten off the ground, while it is probably still pondering the ginormous values list.
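
to make the burst concrete, here is a minimal standalone sketch of what a
page load effectively does: five concurrent FLUSH commands for the same rrd,
sent over rrdcached's plain-text socket protocol. the socket path and rrd
path are made up, adjust for your setup; build with -pthread.

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/un.h>

#define SOCK_PATH "/tmp/rrdcached.sock"               /* assumed socket path */
#define RRD_FILE  "/var/lib/nagios/rrd/some_host.rrd" /* hypothetical file */

static void *flush_once(void *arg)
{
    (void)arg;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0)
        return NULL;

    struct sockaddr_un sa;
    memset(&sa, 0, sizeof(sa));
    sa.sun_family = AF_UNIX;
    strncpy(sa.sun_path, SOCK_PATH, sizeof(sa.sun_path) - 1);

    if (connect(fd, (struct sockaddr *)&sa, sizeof(sa)) == 0) {
        char buf[256];
        int len = snprintf(buf, sizeof(buf), "FLUSH %s\n", RRD_FILE);
        if (write(fd, buf, (size_t)len) == (ssize_t)len) {
            /* the daemon answers with a single status line, e.g. "0 ..." */
            ssize_t n = read(fd, buf, sizeof(buf) - 1);
            if (n > 0) {
                buf[n] = '\0';
                fputs(buf, stdout);
            }
        }
    }
    close(fd);
    return NULL;
}

int main(void)
{
    pthread_t t[5];
    for (int i = 0; i < 5; i++)
        pthread_create(&t[i], NULL, flush_once, NULL);
    for (int i = 0; i < 5; i++)
        pthread_join(t[i], NULL);
    return 0;
}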

i have seen both simple skips (a short second flush completes its few updates before the ginormous first one reaches its staleness check, so the first-started-but-slower flush gets discarded) and outright data corruption (data landing in slots where none was ever supposed to be, and data written to the wrong slots within the valid range).

the patch tries to work around this by introducing a per-cache_item rrdlock that is held during the relevant section of a flush.

the enqueue-at-tail on rrdlock failure is commented out because it leads to busy-waiting when there are no unlocked items on the queue.
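
as an aside, one way to avoid both the spinning and the requeue would be to
scan the queue for the first item whose rrdlock can actually be taken, and
to block on the queue condition only when everything queued is already being
flushed. a sketch of that idea follows; cache_queue_head, cache_lock,
queue_cond and remove_from_queue() are stand-ins for the rrd_daemon.c
internals, so treat it as a fragment, not a drop-in hunk. it assumes the
caller holds cache_lock.

/* returns a queued item whose rrdlock is held, or blocks until one exists */
static cache_item_t *pick_unlocked_item(void)
{
    cache_item_t *ci;

    for (;;) {
        /* first pass: find any queued item that is not being flushed */
        for (ci = cache_queue_head; ci != NULL; ci = ci->next)
            if (pthread_mutex_trylock(&ci->rrdlock) == 0) {
                remove_from_queue(ci); /* unlink while still under cache_lock */
                return ci;             /* caller flushes, then drops rrdlock */
            }

        /* every queued item is locked: sleep instead of spinning; a
         * finishing flush (or a fresh enqueue) has to signal queue_cond */
        pthread_cond_wait(&queue_cond, &cache_lock);
    }
}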


2. "first flush" failing on batch resume
fresh rrdcached process, fresh collect2.pl.
the first perfdata batch being loaded contains some datapoints
that already got stored, because collect2 went down in the middle of a batch.
usually no real problem: the batch gets replayed in full, and any
rrd updates that are dupes simply fail and get ignored.
obviously, rrdcached batching the updates changes the situation a bit.
usually, rrdcached would reject an update from the client on a time warp.
but if the rrdcached is new too, it doesn't know the last timestamp, so it
will accept anything. it then fails hard on the first flush, and discards
the whole flushed update batch.
that can be quite a bit of data going down the drain if cache+grapher
were down for a while.
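
the underlying failure mode is easy to reproduce without the daemon: once an
rrd has been updated at time t, any update at or before t makes rrd_update
fail hard, and that is exactly what kills the whole flushed batch. a minimal
standalone demo (link with -lrrd; the file name demo.rrd is made up):

#include <stdio.h>
#include <time.h>
#include <rrd.h>

int main(void)
{
    const char *ds[] = { "DS:val:GAUGE:600:U:U", "RRA:AVERAGE:0.5:1:100" };
    time_t now = time(NULL);

    if (rrd_create_r("demo.rrd", 300, now - 1000, 2, ds) != 0) {
        fprintf(stderr, "create: %s\n", rrd_get_error());
        return 1;
    }

    char upd1[64], upd2[64];
    snprintf(upd1, sizeof(upd1), "%ld:1", (long)(now - 600));
    snprintf(upd2, sizeof(upd2), "%ld:2", (long)(now - 900)); /* older! */

    const char *a1[] = { upd1 };
    const char *a2[] = { upd2 };

    rrd_clear_error();
    if (rrd_update_r("demo.rrd", NULL, 1, a1) != 0)
        fprintf(stderr, "update 1: %s\n", rrd_get_error());

    rrd_clear_error();
    if (rrd_update_r("demo.rrd", NULL, 1, a2) != 0)
        /* prints the "illegal attempt to update using time ..." error */
        fprintf(stderr, "update 2: %s\n", rrd_get_error());

    return 0;
}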

the patch tries to work around it by getting the last_update stamp
from the rrd file when creating the cache_item.

one change worth considering would be to move the acquisition of last_update
to _after_ the point where the ci is "guaranteed" to be "the one", and to do it under rrdlock.
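
a sketch of what that could look like, using rrd_lastupdate_r() as a
lighter-weight alternative to the full rrd_open() dance in the patch. the
seed_last_update() wrapper is hypothetical, only the librrd call is real;
link with -lrrd.

#include <stdlib.h>
#include <time.h>
#include <rrd.h>

/* returns 0 on success and stores the file's last_update stamp in *out */
static int seed_last_update(const char *file, time_t *out)
{
    time_t last_up = 0;
    unsigned long ds_count = 0;
    char **ds_names = NULL, **last_ds = NULL;

    if (rrd_lastupdate_r(file, &last_up, &ds_count, &ds_names, &last_ds) != 0)
        return -1;

    /* librrd hands back malloc'ed arrays; free everything */
    for (unsigned long i = 0; i < ds_count; i++) {
        free(ds_names[i]);
        free(last_ds[i]);
    }
    free(ds_names);
    free(last_ds);

    *out = last_up;
    return 0;
}

in handle_request_update this would run with ci->rrdlock held, right after
the tree lookup has confirmed that this ci really is "the one".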


patch is against your svn trunk.
rephrase the patch as needed, C is not my native language.


really, sundays did not use to be like this.


signing off,
   cranky internet guy

-------------- next part --------------
Index: src/rrd_daemon.c
===================================================================
--- src/rrd_daemon.c	(revision 2197)
+++ src/rrd_daemon.c	(working copy)
@@ -193,6 +193,7 @@
 #define CI_FLAGS_IN_QUEUE (1<<1)
   int flags;
   pthread_cond_t  flushed;
+  pthread_mutex_t rrdlock;
   cache_item_t *prev;
   cache_item_t *next;
 };
@@ -933,6 +934,16 @@
 
     ci = cache_queue_head;
 
+    if (pthread_mutex_trylock(&ci->rrdlock)) {
+      RRDD_LOG (LOG_ERR, "queue_thread_main: "
+            "YO DAWG, I HERD U LIEK RAECOCNDITIONS (%s)",
+            ci->file);
+      //leads to busy waiting when there are no unlocked cacheitems queued
+      //enqueue_cache_item (ci, TAIL);
+      pthread_mutex_unlock (&cache_lock);
+      continue;
+    }
+
     /* copy the relevant parts */
     file = strdup (ci->file);
     if (file == NULL)
@@ -966,6 +977,7 @@
     /* Search again in the tree.  It's possible someone issued a "FORGET"
      * while we were writing the update values. */
     pthread_mutex_lock(&cache_lock);
+    pthread_mutex_unlock(&ci->rrdlock);
     ci = (cache_item_t *) g_tree_lookup(cache_tree, file);
     if (ci)
       pthread_cond_broadcast(&ci->flushed);
@@ -1396,7 +1408,29 @@
     wipe_ci_values(ci, now);
     ci->flags = CI_FLAGS_IN_TREE;
     pthread_cond_init(&ci->flushed, NULL);
+    pthread_mutex_init(&ci->rrdlock, NULL);
 
+
+    rrd_t     rrd;
+    rrd_file_t *rrd_file;
+
+    rrd_init(&rrd);
+    rrd_file = rrd_open(ci->file, &rrd, RRD_READONLY);
+    if (rrd_file == NULL) {
+        rrd_free(&rrd);
+        RRDD_LOG (LOG_ERR, "handle_request_update: "
+            "YO DAWG, I HER ... something went wrong (%s)",
+            ci->file);
+    } else {
+        ci->last_update_stamp = rrd.live_head->last_up;
+        RRDD_LOG (LOG_ERR, "handle_request_update: "
+            "YO DAWG, I HERD U LIEK DISCARDING DATA (%s)",
+            ci->file);
+        rrd_close (rrd_file);
+        rrd_free (&rrd);
+    }
+
+
     pthread_mutex_lock(&cache_lock);
 
     /* another UPDATE might have added this entry in the meantime */

