<html><body><div style="color:#000; background-color:#fff; font-family:arial, helvetica, sans-serif;font-size:10pt"><div><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span>Hi Mikel,</span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span>I've personally never found a good reason to store more than one datasource per RRD datafile, and I run very large rrdtool data servers (multi-millions per server, across many servers).</span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style:
normal;"><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span>There are far too many edge cases and latency issues, and too much join overhead, in trying to consolidate datasources into a single datafile. Yes, rrdtool itself is more efficient with an insert like that, but: 1) what if the datapoints are collected at different times? 2) what if they have different steps? 3) what if you want to add a datasource? 4) what if you simply have too many datasources to order and consolidate from a queue into the datafile? There is also non-trivial complexity and overhead in indexing into an rrd datafile for specific datasources.</span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span><br></span></div><div style="color:
rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent;"><span>Linux is extremely efficient at block updates, caching, opens/closes, etc. rrdtool on a low-end (4 CPU) server with limited memory can easily store 160 thousand datasources per minute; on a better server, a <span style="font-style: italic; font-weight: bold;">whole lot</span> more than that.</span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span>'Distributed Cluster' isn't a good reason not to send all your time-series data to one server or a small set of servers. The latency and request time incurred in having to fetch data from those servers is usually not worth the
trade-off.</span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><span><br></span></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;">Graphs of many hundreds of datasources, computed for multi-day or multi-week time ranges in the result set, are generated in tens of milliseconds, not seconds. rrdtool is quite capable of producing hundreds of on-demand graphs per second from one server.</div><div><br></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;">I suggest you write a little test script that writes rrd data out to individual rrd datafiles to see how quick your servers are at it. There is some OS tuning and rrdtool RRA sizing that will help;
in particular, don't keep hourly or daily rollups, because the server has to hold onto those blocks to make the consolidation quick and not incur a read from disk.</div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><br></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;">rrdtool scales rather simply (and without rrdcached, as I don't use that either).</div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><br></div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;">HTH</div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif;
background-color: transparent; font-style: normal;">-Ryan</div><div style="color: rgb(0, 0, 0); font-size: 13px; font-family: arial, helvetica, sans-serif; background-color: transparent; font-style: normal;"><br></div> <div style="font-family: arial, helvetica, sans-serif; font-size: 10pt;"> <div style="font-family: 'times new roman', 'new york', times, serif; font-size: 12pt;"> <div dir="ltr"> <hr size="1"> <font size="2" face="Arial"> <b><span style="font-weight:bold;">From:</span></b> mikel &lt;infoeuskadi@gmail.com&gt;<br> <b><span style="font-weight: bold;">To:</span></b> rrd-users@lists.oetiker.ch <br> <b><span style="font-weight: bold;">Sent:</span></b> Saturday, April 20, 2013 4:48 AM<br> <b><span style="font-weight: bold;">Subject:</span></b> Re: [rrd-users] [unsure] max DS per rrd file<br> </font> </div> <div class="y_msg_container"><br><br>Thanks for your fast reply again.<br><br>>Maybe I don't understand what you say here. Some
metrics, or all metrics<br>are <br>>queried? Both statements cannot be true at the same time?<br><br>Yes it is a tricky case. Apologies I was not clear enough.<br><br>In most cases all metrics are queried at the same time, because we want to<br>know what value they had at a given time. And classify them.<br><br>Very randomly we would query for just one metric.<br><br>>Anyway, if you query only once in a while, maybe you should think about <br>>reducing the number of RRAs in each RRD, and just let it consolidate at <br>>graph time. Yes, this will mean you will have to wait longer for your graph <br>>to be made, but you save processing time at every update.<br><br>This is interesting I did not think about that. Thanks for the hint.<br><br>Thanks for your help again.<br>m<br><br><br><br>--<br>View this message in context: <a href="http://rrd-mailinglists.937164.n2.nabble.com/max-DS-per-rrd-file-tp7580966p7580971.html"
target="_blank">http://rrd-mailinglists.937164.n2.nabble.com/max-DS-per-rrd-file-tp7580966p7580971.html</a><br>Sent from the RRDtool Users Mailinglist mailing list archive at Nabble.com.<br><br>_______________________________________________<br>rrd-users mailing list<br><a ymailto="mailto:rrd-users@lists.oetiker.ch" href="mailto:rrd-users@lists.oetiker.ch">rrd-users@lists.oetiker.ch</a><br><a href="https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users" target="_blank">https://lists.oetiker.ch/cgi-bin/listinfo/rrd-users</a><br><br><br></div> </div> </div> </div></body></html>