[rrd-users] Slow collection runtimes occurring regularly
Steve Shipway
s.shipway at auckland.ac.nz
Mon May 2 00:21:36 CEST 2011
Looks like your collections are being done via MRTG, going by the structure.
You can't specify when consolidations are done, but on the whole it
shouldn't make such a difference. We don't experience anything like this
pattern on our MRTG/RRD servers.
In order to spread things out over time, there are a number of things you
can do. Using RRDtool 1.4.x (possibly the trunk version) allows you to use
rrdcached, which gives a noticeable (~20%?) performance saving; you should
also tune your use of the Forks: directive in MRTG to make sure you're
running an appropriate number of parallel collection processes. Adding more
memory to the server might also help if you need to increase the number of
forks (our machines tend to be memory-bound rather than CPU-bound, but we
use many data-collection plugins).
If you don't use MRTG in daemon mode then it is less efficient; RRDtool 1.3
and 1.4 can use memory-mapping and other nice things to improve performance,
and MRTG can cache the config files, but this works better in daemon mode.
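For anyone updating RRD files from their own scripts rather than through MRTG, here is a minimal sketch of what routing an update through rrdcached looks like at the `rrdtool update` level. It only builds the command line and does not run it. The helper name, the socket path and the example file name are assumptions made for illustration; use the address your own rrdcached instance listens on.

```python
import shlex


def rrdcached_update_argv(rrd_path, values, daemon="unix:/var/run/rrdcached.sock"):
    """Build an `rrdtool update` command line that goes through rrdcached.

    The default socket path is hypothetical; substitute the address your
    rrdcached instance was started with.
    """
    # "N" asks rrdtool to timestamp the sample with the current time.
    sample = "N:" + ":".join(str(v) for v in values)
    return ["rrdtool", "update", rrd_path, "--daemon", daemon, sample]


if __name__ == "__main__":
    # Hypothetical file name and counter values, for illustration only.
    argv = rrdcached_update_argv("/rrd/example.rrd", [1000, 2000])
    print(shlex.join(argv))
```

With `--daemon` set, rrdtool hands the update to the caching daemon, which batches writes to disk instead of touching the file on every sample.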
Steve
_____
Steve Shipway
ITS Unix Services Design Lead
University of Auckland, New Zealand
Floor 1, 58 Symonds Street, Auckland
Phone: +64 (0)9 3737599 ext 86487
DDI: +64 (0)9 924 6487
Mobile: +64 (0)21 753 189
Email: s.shipway at auckland.ac.nz
From: rrd-users-bounces+s.shipway=auckland.ac.nz at lists.oetiker.ch
[mailto:rrd-users-bounces+s.shipway=auckland.ac.nz at lists.oetiker.ch] On
Behalf Of Joshua Keroes
Sent: Sunday, 1 May 2011 5:53 p.m.
To: rrd-users at lists.oetiker.ch
Subject: [rrd-users] Slow collection runtimes occurring regularly
Our collectors run long at regular intervals: in particular every two hours,
and to a lesser extent every hour and half hour. Here's a graph showing how
long each collection cycle lasts on one of the collection machines:
http://i.imgur.com/xaZJ5.png - note the regular spikes.
Most RRDs consolidate every 30 minutes, 2 hours, and 24 hours; see the
bottom for a sample `rrd info`. Our current theory is that the RRD
consolidations are causing these long runtimes. If that's the case, is there
a way to evenly stagger the consolidations over time so we can better
distribute RRD update load?
Thanks,
Joshua
filename = "/rrd/router/cr01.ptleorte.integra.net/tengigabitethernet134.rrd"
rrd_version = "0003"
step = 300
last_update = 1304228713
ds[ds0].type = "COUNTER"
ds[ds0].minimal_heartbeat = 600
ds[ds0].min = 0.0000000000e+00
ds[ds0].max = 1.2500000000e+09
ds[ds0].last_ds = "1596044569532963"
ds[ds0].value = 4.0248335433e+08
ds[ds0].unknown_sec = 0
ds[ds1].type = "COUNTER"
ds[ds1].minimal_heartbeat = 600
ds[ds1].min = 0.0000000000e+00
ds[ds1].max = 1.2500000000e+09
ds[ds1].last_ds = "3460406816844600"
ds[ds1].value = 8.9596753966e+08
ds[ds1].unknown_sec = 0
rra[0].cf = "AVERAGE"
rra[0].rows = 600
rra[0].pdp_per_row = 1
rra[0].xff = 5.0000000000e-01
rra[0].cdp_prep[0].value = NaN
rra[0].cdp_prep[0].unknown_datapoints = 0
rra[0].cdp_prep[1].value = NaN
rra[0].cdp_prep[1].unknown_datapoints = 0
rra[1].cf = "AVERAGE"
rra[1].rows = 600
rra[1].pdp_per_row = 6
rra[1].xff = 5.0000000000e-01
rra[1].cdp_prep[0].value = 9.4104250250e+07
rra[1].cdp_prep[0].unknown_datapoints = 0
rra[1].cdp_prep[1].value = 2.0174889583e+08
rra[1].cdp_prep[1].unknown_datapoints = 0
rra[2].cf = "AVERAGE"
rra[2].rows = 600
rra[2].pdp_per_row = 24
rra[2].xff = 5.0000000000e-01
rra[2].cdp_prep[0].value = 6.5449761744e+08
rra[2].cdp_prep[0].unknown_datapoints = 0
rra[2].cdp_prep[1].value = 1.4734297081e+09
rra[2].cdp_prep[1].unknown_datapoints = 0
rra[3].cf = "AVERAGE"
rra[3].rows = 732
rra[3].pdp_per_row = 288
rra[3].xff = 5.0000000000e-01
rra[3].cdp_prep[0].value = 2.2692529674e+09
rra[3].cdp_prep[0].unknown_datapoints = 3
rra[3].cdp_prep[1].value = 4.7002069004e+09
rra[3].cdp_prep[1].unknown_datapoints = 3
rra[4].cf = "MAX"
rra[4].rows = 600
rra[4].pdp_per_row = 1
rra[4].xff = 5.0000000000e-01
rra[4].cdp_prep[0].value = NaN
rra[4].cdp_prep[0].unknown_datapoints = 0
rra[4].cdp_prep[1].value = NaN
rra[4].cdp_prep[1].unknown_datapoints = 0
rra[5].cf = "MAX"
rra[5].rows = 600
rra[5].pdp_per_row = 6
rra[5].xff = 5.0000000000e-01
rra[5].cdp_prep[0].value = 3.2405792329e+07
rra[5].cdp_prep[0].unknown_datapoints = 0
rra[5].cdp_prep[1].value = 6.9813629778e+07
rra[5].cdp_prep[1].unknown_datapoints = 0
rra[6].cf = "MAX"
rra[6].rows = 600
rra[6].pdp_per_row = 24
rra[6].xff = 5.0000000000e-01
rra[6].cdp_prep[0].value = 3.4089842030e+07
rra[6].cdp_prep[0].unknown_datapoints = 0
rra[6].cdp_prep[1].value = 7.6745619740e+07
rra[6].cdp_prep[1].unknown_datapoints = 0
rra[7].cf = "MAX"
rra[7].rows = 732
rra[7].pdp_per_row = 288
rra[7].xff = 5.0000000000e-01
rra[7].cdp_prep[0].value = 4.4271024386e+07
rra[7].cdp_prep[0].unknown_datapoints = 3
rra[7].cdp_prep[1].value = 8.8648080465e+07
rra[7].cdp_prep[1].unknown_datapoints = 3
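The consolidation intervals mentioned in the question follow from the dump above: each RRA completes a row every step * pdp_per_row seconds, so 300 * 6 = 1800 s (30 minutes), 300 * 24 = 7200 s (2 hours) and 300 * 288 = 86400 s (24 hours). The sketch below lists which consolidations fall due at each 5-minute update over a two-hour window. It assumes that row boundaries sit where the timestamp is an exact multiple of step * pdp_per_row, which would make every RRD with this layout consolidate at the same moments; the function name and the start time are chosen for illustration.

```python
# Values copied from the `rrd info` output above.
STEP = 300                     # seconds per primary data point (PDP)
PDP_PER_ROW = [1, 6, 24, 288]  # distinct RRA resolutions (AVERAGE and MAX share them)


def consolidations_due(timestamp, step=STEP, pdp_per_row=PDP_PER_ROW):
    """Return the pdp_per_row values whose consolidated row completes at `timestamp`.

    Assumption: a row completes when the timestamp is an exact multiple
    of step * pdp_per_row. RRAs with pdp_per_row == 1 are skipped because
    they store every PDP without consolidating several together.
    """
    return [n for n in pdp_per_row if n > 1 and timestamp % (step * n) == 0]


if __name__ == "__main__":
    start = 1304208000  # 2011-05-01 00:00:00 UTC, a multiple of 86400
    for t in range(start, start + 2 * 3600 + 1, STEP):
        due = consolidations_due(t)
        if due:
            # Print the offset from the start and the intervals (in seconds) that fire.
            print(t - start, [n * STEP for n in due])
```

Under that assumption the 30-minute and 2-hour rows always coincide on the 2-hour mark, which is where the extra consolidation work in this layout would cluster.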