[rrd-developers] RRDCacheD - Client rewriting path

Tue Sep 29 17:25:38 CEST 2009

Sebastian Harl wrote:
> *Personally*, I'm not sure about the benefits of being able to
> transparently hide the daemon in a local setup (i.e. keeping a
> consistent behavior in regard to file names when talking to the daemon
> or accessing the files directly). The reasons for that are two-fold:
>
>  a) RRDCacheD is (mostly) for large setups which (imho) will not benefit
>     from that transparency. So the benefits would be limited to a small
>     number of users only, while (again imho) bringing along some
>     negative aspects as well (see, e.g., (b)).
>
>  b) Introducing that transparency probably will encourage mixing direct
>     and "cached" access to RRD files. This is error prone (e.g. cached
>     updates will basically be lost after a direct access using a later
>     time stamp), which is especially bad when it happens accidentally
>     (which, in turn, is not unlikely given the many, transparent ways to
>     use or skip the daemon; e.g. think about the environment variables,
>     etc.). I could perfectly understand any user who's seriously annoyed
>     by that.
>
> Then again, I'm thinking of RRDCacheD as a network daemon and, thus, it
> should imho be optimized for that. It might not be perfect yet, but it's
> already very well suited for that purpose and I'm pretty sure that's
> where we're going to.
>
> There's a lot of "imho" in the above text, so it'd be nice to hear some
> other voices on that topic but from previous discussions I got the
> impressions that there are other people thinking similar to me.
>
> I hope this could clarify my point of view.
>   
I totally concur. We use collectd and rrdtool in production, monitoring 
thousands of machines and I'd rather not know how many rrds. We have tens 
of servers running to collect all the data and can't wait to deploy 
rrdcached. We've tried but ran into a bug with flushing (it didn't 
flush), which we still need to reproduce cleanly. We desperately want a 
fully networked daemon so we can aggregate across arbitrary monitored servers 
whose rrds end up on different rrdcached servers. Security is a 
low-priority issue because (a) the data is not all that sensitive to 
start with and (b) all this is behind a firewall, plus (c) the server ids 
used by the data collection are not easy to map back to something 
meaningful. The one feature we will need is the ability to fetch data 
from rrdcached. This is necessary to aggregate across multiple rrd 
stores and also for performance. We compute alerts from the data in rrds 
so there is a constant fetch stream going on, not just when someone is 
actively looking at graphs.

Please take the above as encouragement, not as a request. I responded to 
Sebastian's request for input so I provided it. I don't have manpower at 
the moment to contribute so I would normally refrain from stating what 
I'd like to see.
Cheers and thanks for great software!
Thorsten - CTO RightScale