[rrd-developers] rfc: later caching
Tobias Oetiker
tobi at oetiker.ch
Mon Apr 20 08:11:23 CEST 2009
Hi Kevin,
Friday kevin brintnall wrote:
> On Fri, Apr 17, 2009 at 08:38:10AM +0200, Tobias Oetiker wrote:
> > disclaimer: this is about 1.5 not 1.4 !
> >
> > [...]
> >
> > * the (big) disadvantage is that updatev does not work anymore, and
> > for larger deployments updatev is a cornerstone function in
> > driving holt winters based alerting.
>
> Tobi, I was already planning updatev support along these lines, although I
> didn't have an idea when we'd integrate it (i.e. 1.5 vs 1.4.later).
:-) since my plan is to NOT have major new features once a release
is out ...
> > * it would require rrdcached to read the header information of the
> > rrdfiles once and cache them internally so that it can calculate
> > the updates without accessing the disk, but since header
> > information is quite small, a decent sized machine could
> > easily keep hundreds of thousands of headers in the cache daemon.
>
> I think the best first approach would be:
>
> - on the first updatev, take an in-memory copy of the live_head and *_prep
> - no need to do it for update
> - CON: when the daemon starts up, there is heavy read load
yep, I guess that's the price ...
> - a copy of the header allows us to do input validation on update strings
> (i.e. correct ds_cnt)
>
> - split up process_arg() into:
> - parse update string, update in-memory pdp/cdp/*_prep
> - write the RRAs
>
> I thought we'd do it in stages:
>
> (1) Process the update string twice. Once when updatev received from the
> client (update in-memory copy). Once when we call rrd_update_r() with
> the update string.
>
> - minimal changes to the current rrdcached code/data structures
> - CPU overhead minimal, still large delayed IO benefit of rrdcached
> - easiest change that gives updatev support
>
> (2) Process the update string once. Cache the computed results, and write
> them out to the RRD after delay.
>
> - requires we re-think rrd_update_r() or create a function to pass
> pre-computed values into the RRD
>
> I'm guessing (1) will be significantly easier than (2), and provide almost
> the same functionality (if slightly less efficiently). I think staging
> it like this would make the most sense.
Do you see a lot of 're-use' of stuff from (1) in (2)? Or what is
the 'sense' in staging the process, given it is done in the safety
of trunk anyway?
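
To make the discussion concrete, here is a rough sketch of what the first
pass of stage (1) might look like: on updatev the daemon checks the update
string against an in-memory header copy (the ds_cnt validation mentioned
above), advances the cached state, and leaves the string queued for the
later rrd_update_r() call. Everything in it is an assumption made for
illustration: cached_header_t, validate_update() and cache_updatev() are
hypothetical names, not existing rrdtool or rrdcached code, and only plain
numeric timestamps are handled (no "N", no templates).

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down stand-in for the cached header data.
 * The real header structures in rrdtool carry far more state. */
typedef struct {
    unsigned long ds_cnt;      /* number of data sources in the RRD */
    long          last_update; /* timestamp of the last accepted update */
    int           pending;     /* update strings queued for the delayed write */
} cached_header_t;

/* Check "<timestamp>:<v1>:<v2>..." against the cached header.
 * Returns  0 if the string is acceptable,
 *         -1 if the timestamp is missing or malformed,
 *         -2 if the timestamp is not newer than the last update,
 *         -3 if the number of value fields differs from ds_cnt. */
int validate_update(const cached_header_t *hdr, const char *update,
                    long *ts_out)
{
    char *end;
    long ts;
    unsigned long fields = 0;
    const char *p;

    assert(hdr != NULL && update != NULL && ts_out != NULL);

    ts = strtol(update, &end, 10);
    if (end == update || *end != ':')
        return -1;
    if (ts <= hdr->last_update)
        return -2;

    /* every ':' after the timestamp introduces one value field */
    for (p = end; *p != '\0'; p++)
        if (*p == ':')
            fields++;
    if (fields != hdr->ds_cnt)
        return -3;

    *ts_out = ts;
    return 0;
}

/* Stage (1), first pass: validate against the in-memory copy, advance
 * it, and count the string as queued. The second pass (the actual
 * write through rrd_update_r()) would happen at flush time and is not
 * shown here. */
int cache_updatev(cached_header_t *hdr, const char *update)
{
    long ts;
    int rc = validate_update(hdr, update, &ts);

    if (rc != 0)
        return rc;
    hdr->last_update = ts;
    hdr->pending++;
    return 0;
}
```

In this sketch a rejected update leaves the cached header untouched, so a
bad string from one client would not disturb the state used for later
updatev calls. Updating the pdp/cdp/*_prep areas is left out entirely.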
cheers
tobi
--
Tobi Oetiker, OETIKER+PARTNER AG, Aarweg 15 CH-4600 Olten, Switzerland
http://it.oetiker.ch tobi at oetiker.ch ++41 62 775 9902 / sb: -9900