<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
  <head>
    <meta content="text/html; charset=ISO-8859-1"
      http-equiv="Content-Type">
  </head>
  <body bgcolor="#ffffff" text="#000000">
    Sadly interesting...<br>
    As a separate data point, we're running over 100 rrdcached servers,
    each handling &gt;30k tree nodes and receiving about 3k updates/sec.
    Data is cached for ~1 hour, so files are written at ~20 updates/sec.
    Uptime is measured in months without problems, and we have never seen
    corruption (knock on wood). We're running 1.4 trunk revision r2092
    (randomly picked) on Ubuntu 8.04 (it used to run on CentOS 5.2, I
    believe). We're not seeing any memory leak; the daemons run stable at
    800-900MB virtual / 500-600MB rss. We're using TCP sockets and doing
    updates, fetches and flushes.
    The command line we use is:
    <pre>/usr/bin/rrdcached -w 3600 -z 3600 -f 7200 -t 2 -a 128 -b /rrds/hosts -B -j /rrds/journal -p /var/run/rrdcached/rrdcached.pid -l 10.x.x.x:xxxx</pre>
    I'm not writing this to contradict you, I'm just wondering what
    could be different in your set-up that causes the problems. (Oh,
    that reminds me that the -a 128 made a huge difference for us around
    memory allocation performance.)<br>
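    For readers comparing set-ups, here is a rough annotated version of
    that command line as a Python sketch. The flag descriptions paraphrase
    the rrdcached manual page; the names RRDCACHED_ARGS and build_command
    are made up for illustration and are not part of rrdtool.<br>
<pre>
```python
# Each entry pairs the argument(s) from the command line above with a
# paraphrase of its meaning from the rrdcached manual page.
RRDCACHED_ARGS = [
    (["/usr/bin/rrdcached"], "daemon binary"),
    (["-w", "3600"], "write cached values to disk once they are 3600 s old"),
    (["-z", "3600"], "spread writes with a random delay of up to 3600 s"),
    (["-f", "7200"], "every 7200 s, scan the cache for old values and flush them"),
    (["-t", "2"], "number of threads writing RRD files"),
    (["-a", "128"], "allocate value pointers in chunks of 128"),
    (["-b", "/rrds/hosts"], "base directory for relative file names"),
    (["-B"], "only permit writes below the base directory"),
    (["-j", "/rrds/journal"], "directory for the update journal"),
    (["-p", "/var/run/rrdcached/rrdcached.pid"], "pid file"),
    (["-l", "10.x.x.x:xxxx"], "listen address (elided in the original mail)"),
]


def build_command(args=RRDCACHED_ARGS):
    """Flatten the annotated list back into an argv list."""
    argv = []
    for tokens, _meaning in args:
        argv.extend(tokens)
    return argv


if __name__ == "__main__":
    for tokens, meaning in RRDCACHED_ARGS:
        print(f"{' '.join(tokens):45s} {meaning}")
```
</pre>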
    Good luck!<br>
    TvE<br>
    <br>
    On 10/21/2010 6:50 PM, Steve Shipway wrote:
    <blockquote
cite="mid:28E447343A85354483BCF7C3E9D5EAA5149A37AB@uxcn10-1.UoA.auckland.ac.nz"
      type="cite">
      <style>
<!--
 /* Font Definitions */
 @font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
@font-face
        {font-family:Tahoma;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Verdana;
        panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
        {font-family:Webdings;
        panose-1:5 3 1 2 1 5 9 6 7 3;}
@font-face
        {font-family:"Arial Narrow";
        panose-1:2 11 6 6 2 2 2 3 2 4;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0cm;
        margin-bottom:.0001pt;
        font-size:12.0pt;
        font-family:"Times New Roman","serif";}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
        {mso-style-priority:99;
        color:purple;
        text-decoration:underline;}
span.EmailStyle17
        {mso-style-type:personal-reply;
        font-family:"Calibri","sans-serif";
        color:#1F497D;}
.MsoChpDefault
        {mso-style-type:export-only;}
@page WordSection1
        {size:612.0pt 792.0pt;
        margin:72.0pt 72.0pt 72.0pt 72.0pt;}
div.WordSection1
        {page:WordSection1;}
-->
</style>
      <div class="WordSection1">
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">The corrupted file ends up the correct size;
            however, the entire file is filled with zeroes. (Fortunately,
            we archive our RRD files nightly, so I can go back and
            retrieve the last uncorrupted version plus the corrupted
            version.)<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">The system is not (normally) memory or
            process-constrained;
            there is in fact nothing to speak of running apart from
            apache and the
            rrdcached daemon.&nbsp; The rrdinfo response is &#8216;not an RRD
            file&#8217;,
            since it doesn&#8217;t have the RRD header.<o:p></o:p></span></p>
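        <p class="MsoNormal">For anyone wanting to scan a directory for
          files damaged in this way, a minimal Python sketch follows. It
          assumes that a healthy RRD file begins with the three bytes
          "RRD" (the header cookie) and treats a file containing only zero
          bytes as the failure described above. The function name
          classify_rrd and its return labels are hypothetical, not part of
          rrdtool.</p>
<pre>
```python
import os


def classify_rrd(path, chunk_size=65536):
    """Classify a file as 'has-rrd-cookie', 'all-zero', 'empty' or 'other-data'."""
    if os.path.getsize(path) == 0:
        return "empty"
    with open(path, "rb") as handle:
        # Assumption: a valid RRD file starts with the cookie bytes "RRD".
        if handle.read(3) == b"RRD":
            return "has-rrd-cookie"
        handle.seek(0)
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return "all-zero"      # reached end of file, saw only zeroes
            if chunk.count(0) != len(chunk):
                return "other-data"    # some non-zero byte, but no cookie
```
</pre>
        <p class="MsoNormal">Only "has-rrd-cookie" suggests a usable
          header; it says nothing about whether the rest of the file is
          intact.</p>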
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">It has run fine for a whole week at these rates
            before the
            problem hit; so that&#8217;s why I think it might be a leak in the
            RRD
            functions (which would of course not show up in a non-daemon
            situation).&nbsp;
            We use the remote update, info and (occasionally) create via
            the TCP socket;
            plus the info, last, flush and fetch via the UNIX socket.<o:p></o:p></span></p>
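        <p class="MsoNormal">As a rough illustration of what travels
          over those sockets, here is a Python sketch of the line-based
          rrdcached text protocol: each command is one text line, and each
          reply starts with a status line whose leading integer is either
          the number of lines that follow or a negative error code. The
          helper names format_update and read_response are hypothetical,
          and the file name in the usage note is only an example.</p>
<pre>
```python
def format_update(path, timestamp, values):
    """Build one command line of the form 'UPDATE FILE TS:V1:V2...'."""
    fields = ":".join(str(value) for value in values)
    return f"UPDATE {path} {timestamp}:{fields}\n"


def read_response(stream):
    """Read one reply from a binary file-like object.

    Returns (status, message, extra_lines). A negative status means the
    daemon reported an error; a positive status is the number of extra
    lines that follow the status line.
    """
    first = stream.readline().decode("ascii").rstrip("\n")
    code, _, message = first.partition(" ")
    status = int(code)
    extra = [
        stream.readline().decode("ascii").rstrip("\n")
        for _ in range(max(status, 0))
    ]
    return status, message, extra
```
</pre>
        <p class="MsoNormal">With a real daemon, the stream could come
          from socket.create_connection((host, port)).makefile("rwb"):
          write format_update("example.rrd", 1287700000, [1, 2]) encoded
          as ASCII, flush, then call read_response on the same
          stream.</p>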
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">The build is the absolute latest, r2136.<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">The memory usage of the rrdcached process is
            definitely increasing; however, that may also be due to the
            number of items in the queue.&nbsp; It is currently at 768m
            virtual, 560m physical (17% usage), which seems somewhat high
            to me, even for 20,000+ RRD files.&nbsp; Eventually it will
            hit address-space limits (this is a 32bit RHEL5 box with 4G
            physical memory).<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">Unfortunately I don&#8217;t have any of the nice
            developer tools
            for tracking memory leaks&#8230;<o:p></o:p></span></p>
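        <p class="MsoNormal">Without dedicated tools, one low-tech way
          to see whether the process keeps growing is to sample its
          virtual and resident sizes at intervals. The Python sketch below
          assumes a Linux system where /proc/PID/status reports VmSize and
          VmRSS in kB; the function names are hypothetical.</p>
<pre>
```python
def parse_vm_fields(status_text):
    """Extract VmSize and VmRSS (in kB) from the text of /proc/PID/status."""
    wanted = {"VmSize", "VmRSS"}
    result = {}
    for line in status_text.splitlines():
        key, _, rest = line.partition(":")
        if key in wanted:
            result[key] = int(rest.split()[0])  # first token is the kB value
    return result


def sample(pid):
    """Read the current sizes for one process id (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        return parse_vm_fields(handle.read())
```
</pre>
        <p class="MsoNormal">Calling sample() on the rrdcached pid every
          few minutes and logging the result alongside the queue length
          would show whether the growth follows the queue or continues
          regardless.</p>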
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">Steve<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <div class="MsoNormal" style="text-align: center;"
          align="center"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);" lang="EN-US">
            <hr width="100%" align="center" size="2">
          </span></div>
        <p class="MsoNormal"><b><span style="font-size: 11pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(31, 73, 125);">Steve Shipway<o:p></o:p></span></b></p>
        <p class="MsoNormal"><span style="font-size: 10pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">ITS Unix Services Design Lead<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 10pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">University of Auckland, New Zealand<o:p></o:p></span></p>
        <p class="MsoNormal"><span style="font-size: 10pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);">Floor 1, 58 Symonds Street, Auckland<o:p></o:p></span></p>
        <p class="MsoNormal"><i><span style="font-size: 10pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(89, 89, 89);">Phone: +64 (0)9 3737599 ext 86487<o:p></o:p></span></i></p>
        <p class="MsoNormal"><i><span style="font-size: 10pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(89, 89, 89);">DDI: +64 (0)9 924 6487<o:p></o:p></span></i></p>
        <p class="MsoNormal"><i><span style="font-size: 10pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(89, 89, 89);">Mobile: +64 (0)21 753 189<o:p></o:p></span></i></p>
        <p class="MsoNormal"><i><span style="font-size: 10pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(89, 89, 89);">Email: <a moz-do-not-send="true"
                href="mailto:s.shipway@auckland.ac.nz"><span
                  style="color: rgb(89, 89, 89);">s.shipway@auckland.ac.nz</span></a><o:p></o:p></span></i></p>
        <p class="MsoNormal"><span style="font-size: 18pt; font-family:
            Webdings; color: green;" lang="EN-GB">P</span><span
            style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: blue;"
            lang="EN-GB"> </span><span style="font-size: 10pt;
            font-family: &quot;Arial
            Narrow&quot;,&quot;sans-serif&quot;; color: green;"
            lang="EN-GB">Please consider the environment before printing
            this e-mail</span><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: blue;"
            lang="EN-GB"> </span><span style="font-size: 7.5pt;
            font-family: &quot;Verdana&quot;,&quot;sans-serif&quot;;
            color: navy;" lang="EN-GB"><o:p></o:p></span></p>
        <p class="MsoNormal"><i><span style="font-size: 10pt;
              font-family: &quot;Calibri&quot;,&quot;sans-serif&quot;;
              color: rgb(31, 73, 125);"><o:p>&nbsp;</o:p></span></i></p>
        <p class="MsoNormal"><span style="font-size: 11pt; font-family:
            &quot;Calibri&quot;,&quot;sans-serif&quot;; color: rgb(31,
            73, 125);"><o:p>&nbsp;</o:p></span></p>
        <div style="border-width: medium medium medium 1.5pt;
          border-style: none none none solid; border-color:
          -moz-use-text-color -moz-use-text-color -moz-use-text-color
          blue; padding: 0cm 0cm 0cm 4pt;">
          <div>
            <div style="border-right: medium none; border-width: 1pt
              medium medium; border-style: solid none none;
              border-color: rgb(181, 196, 223) -moz-use-text-color
              -moz-use-text-color; padding: 3pt 0cm 0cm;">
              <p class="MsoNormal"><b><span style="font-size: 10pt;
                    font-family:
                    &quot;Tahoma&quot;,&quot;sans-serif&quot;;"
                    lang="EN-US">From:</span></b><span style="font-size:
                  10pt; font-family:
                  &quot;Tahoma&quot;,&quot;sans-serif&quot;;"
                  lang="EN-US"> kevin brintnall
                  [<a class="moz-txt-link-freetext" href="mailto:kbrint@rufus.net">mailto:kbrint@rufus.net</a>] <br>
                  <b>Sent:</b> Friday, 22 October 2010 1:40 p.m.<br>
                  <b>To:</b> Steve Shipway<br>
                  <b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:rrd-developers@lists.oetiker.ch">rrd-developers@lists.oetiker.ch</a>;
                  <a class="moz-txt-link-abbreviated" href="mailto:rrd-users@lists.oetiker.ch">rrd-users@lists.oetiker.ch</a><br>
                  <b>Subject:</b> Re: [rrd-developers] rrdcached use
                  corrupting RRD files (trunk)<o:p></o:p></span></p>
            </div>
          </div>
          <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          <p class="MsoNormal">Sebastian,<o:p></o:p></p>
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal">I don't think the problem is specific
              to rrdcached; it uses
              normal librrd API. &nbsp;This problem likely affects any RRD
              access in a memory
              constrained system.<o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal">Is there a lack of memory (or address
              space if 32-bit) on
              the system? &nbsp;Or is it running up against per-process
              limits?<o:p></o:p></p>
          </div>
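          <p class="MsoNormal">One quick way to look at per-process
            limits is sketched below in Python, using the standard
            resource module (Unix only). It reports the limits of the
            process that runs it, so it would need to be started from the
            same user and environment as rrdcached; the helper names are
            hypothetical. On Linux, the daemon's own limits can also be
            read from /proc/PID/limits.</p>
<pre>
```python
import resource


def describe(value):
    """Render one limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {label: (soft, hard)} for limits relevant to mmap and threads."""
    names = {
        "address space (bytes)": resource.RLIMIT_AS,
        "processes/threads": resource.RLIMIT_NPROC,
        "stack (bytes)": resource.RLIMIT_STACK,
    }
    report = {}
    for label, which in names.items():
        soft, hard = resource.getrlimit(which)
        report[label] = (describe(soft), describe(hard))
    return report


if __name__ == "__main__":
    for label, (soft, hard) in current_limits().items():
        print(f"{label}: soft={soft} hard={hard}")
```
</pre>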
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal">How does the file end up? &nbsp;Is it the
              right size?
              &nbsp;What errors do you get (e.g. when you run "rrdtool info")?
              &nbsp;What architecture are you running on? &nbsp;The behaviour
              of mmap() under failure
              conditions is likely to be OS-specific.<o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal">What revision of trunk?<o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal">Let us know what you find re: memory
              leak.<o:p></o:p></p>
          </div>
          <div>
            <p class="MsoNormal"><o:p>&nbsp;</o:p></p>
          </div>
          <div>
            <p class="MsoNormal" style="margin-bottom: 12pt;">-kb<o:p></o:p></p>
            <div>
              <p class="MsoNormal">On Thu, Oct 21, 2010 at 5:07 PM,
                Steve Shipway &lt;<a moz-do-not-send="true"
                  href="mailto:s.shipway@auckland.ac.nz">s.shipway@auckland.ac.nz</a>&gt;
                wrote:<o:p></o:p></p>
              <div>
                <div>
                  <p class="MsoNormal" style="">I&#8217;ve
                    had this happen too often now for it to be a fluke.&nbsp;
                    OK, so I&#8217;m
                    using the trunk version of rrdtool 1.4, but (as far
                    as I know) there is nothing
                    in there to modify the update code.&nbsp; We have a high
                    update frequency
                    &#8211; approx. 20,000 MRTG targets at 5min intervals,
                    which equates to about
                    70 updates per second, and it took about a week for
                    the problem to first hit.<o:p></o:p></p>
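                  <p class="MsoNormal" style="">The arithmetic behind
                    that rate, as a small Python sketch (the function
                    names are only for illustration): 20,000 targets
                    updated once per 5-minute interval.</p>
<pre>
```python
def updates_per_second(targets, interval_seconds):
    """Average update rate if every target is updated once per interval."""
    return targets / interval_seconds


def updates_per_week(targets, interval_seconds):
    """Total updates over seven days at the same polling interval."""
    intervals = (7 * 24 * 3600) // interval_seconds
    return targets * intervals


rate = updates_per_second(20_000, 5 * 60)
print(f"{rate:.1f} updates/sec")            # 66.7 updates/sec
print(updates_per_week(20_000, 5 * 60))     # 40320000
```
</pre>
                  <p class="MsoNormal" style="">So "about 70 per second"
                    is a slight round-up, and roughly 40 million updates
                    passed through the daemon in the week before the
                    first failure.</p>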
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">It
                    seems that something is happening on update,
                    possibly involving memory
                    allocation failure, that results in a corrupted
                    file.<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">I
                    have some processes that may be reading the file
                    without using the rrdcached,
                    but all updates are certainly going this way (no
                    data collection is run on this
                    server any more, it all comes over TCP)<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">Selected
                    error logs show:<o:p></o:p></p>
                  <p class="MsoNormal" style="">listen_thread_main:
                    pthread_create failed.<o:p></o:p></p>
                  <p class="MsoNormal" style="">queue_thread_main:
                    rrd_update_r (/u01/rrdtool/maildelivery-mx1.rrd)
                    failed with status -1.
                    (mmaping file '/u01/rrdtool/maildelivery-mx1.rrd':
                    Cannot allocate memory)<o:p></o:p></p>
                  <p class="MsoNormal" style=""><i>&nbsp;
                      &nbsp;(restarted rrdcached here)</i><o:p></o:p></p>
                  <p class="MsoNormal" style="">replaying
                    from journal:
                    /u01/rrdtool/journal/rrd.journal.1285603416.766523<o:p></o:p></p>
                  <p class="MsoNormal" style="">Replayed
                    61011 entries (0 failures)<o:p></o:p></p>
                  <p class="MsoNormal" style="">replaying
                    from journal:
                    /u01/rrdtool/journal/rrd.journal.1285607016.766153<o:p></o:p></p>
                  <p class="MsoNormal" style="">Malformed
                    journal entry at line 31024<o:p></o:p></p>
                  <p class="MsoNormal" style="">Replayed
                    31023 entries (1 failures)<o:p></o:p></p>
                  <p class="MsoNormal" style="">journal
                    processing complete<o:p></o:p></p>
                  <p class="MsoNormal" style="">queue_thread_main:
                    rrd_update_r (/u01/rrdtool/maildelivery-mx1.rrd)
                    failed with status -1.
                    ('/u01/rrdtool/maildelivery-mx1.rrd' is not an RRD
                    file)<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">Although
                    there was only one journal failure, there were in
                    fact several RRD files
                    corrupted (I suspect the ones which were open at the
                    time of the memory
                    failure?) and even more with the rrd_update_r memory
                    allocation failure.<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">It
                    seems that the memory ran out (memory leak?) and
                    somewhere in the rrd_update_r
                    something was half-done.&nbsp; The resultant corrupted
                    RRD file doesn&#8217;t
                    even load in rrdtool, seems the header is corrupt &#8211;
                    I don&#8217;t (yet)
                    understand enough of the mmap code to work out what
                    could be causing
                    this.&nbsp; I&#8217;m also trying to track the memory usage of
                    the rrdcached
                    process to see if it is indeed growing due to a
                    leak.<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">I
                    think there are two bugs here &#8211; first, the memory
                    leak causing the
                    failure, and second, something in the code is not
                    correctly handling a memory
                    allocation failure and corrupts the RRD file as a
                    result.<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">Has
                    anyone else experienced this?&nbsp; And, more to the
                    point, any RRD developers
                    who understand the MMAP update code want to take a
                    look or give some pointers?<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                  <p class="MsoNormal" style="">Steve<o:p></o:p></p>
                  <p class="MsoNormal" style="">&nbsp;<o:p></o:p></p>
                </div>
              </div>
              <p class="MsoNormal" style="margin-bottom: 12pt;"><br>
                _______________________________________________<br>
                rrd-developers mailing list<br>
                <a moz-do-not-send="true"
                  href="mailto:rrd-developers@lists.oetiker.ch">rrd-developers@lists.oetiker.ch</a><br>
                <a moz-do-not-send="true"
                  href="https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers"
                  target="_blank">https://lists.oetiker.ch/cgi-bin/listinfo/rrd-developers</a><o:p></o:p></p>
            </div>
            <p class="MsoNormal" style="margin-bottom: 12pt;"><br>
              <br clear="all">
              <br>
              -- <br>
              &nbsp;kevin brintnall =~ /<a moz-do-not-send="true"
                href="http://kbrint@rufus.net/">kbrint@rufus.net/</a><o:p></o:p></p>
          </div>
        </div>
      </div>
    </blockquote>
  </body>
</html>