<html><head><title>Re: [smokeping-users] Severe lag when restarting Smokeping and webpage timing out.</title>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
</head>
<body>
<span style=" font-family:'Courier New'; font-size: 9pt;">[Tue Mar 04 15:15:36 2014] [warn] [client 192.168.1.66] mod_fcgid: read data timeout in 40 seconds, referer: </span><a style=" font-family:'Courier New'; font-size: 9pt;" href="http://pipeline/">http://pipeline/</a><br>
<br>
<span style=" font-family:'Courier New'; font-size: 9pt;">Looks like the Smokeping CGI timed out while reading data. <br>
Is this box I/O bound?<br>
What does top show when you try to load a web page from SP? [Load averages in particular.]<br>
<br>
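If it helps to put a number on that, here is a rough sketch that compares the load averages with the CPU count. The helper names and the 1.0 threshold are made up for illustration, and os.getloadavg() is Unix only. On Linux, a load well above the CPU count while the CPUs sit mostly idle usually points at processes stuck waiting on I/O.<br>
</span><pre>
```python
import os


def load_per_cpu(load_avg, cpu_count):
    """Return the load average divided by the number of CPUs (at least 1)."""
    return load_avg / max(cpu_count, 1)


def describe_load(load_avg, cpu_count, busy_ratio=1.0):
    """Label a load average as 'busy' or 'ok' relative to the CPU count."""
    # busy_ratio is an arbitrary illustrative threshold, not a Smokeping setting.
    return "busy" if load_per_cpu(load_avg, cpu_count) > busy_ratio else "ok"


if __name__ == "__main__":
    one, five, fifteen = os.getloadavg()  # same three numbers top prints
    cpus = os.cpu_count() or 1
    for label, value in (("1m", one), ("5m", five), ("15m", fifteen)):
        print(f"{label}: {value:.2f} on {cpus} CPUs ({describe_load(value, cpus)})")
```
</pre><span style=" font-family:'Courier New'; font-size: 9pt;">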
In any case, you need to figure out why the CGI is failing to read the data in the allowed time of 40 secs.<br>
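Raising the default timeout might help if the box is I/O bound but not totally buried. [I believe the 40 seconds is the default of mod_fcgid's FcgidIOTimeout directive, so that would be the setting to raise in the Apache config.]<br>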
<br>
However, if the box is seriously overloaded I/O-wise, then waiting longer won't really solve your problem; it will just push the box further under water.<br>
[And this all gets back to: how many RRDs are there, and how big are they? See the database section of your config. Are there slaves? If so, how many?]<br>
<br>
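To answer the "how many and how big" question, something like the following could be pointed at the Smokeping data directory. It is only a sketch: rrd_inventory is a name invented for this example, and the directory to pass in is whatever datadir is set to in your config.<br>
</span><pre>
```python
import os
import sys


def rrd_inventory(data_dir):
    """Walk data_dir and return (file_count, total_bytes) for all *.rrd files."""
    count = 0
    total = 0
    for root, _dirs, files in os.walk(data_dir):
        for name in files:
            if name.endswith(".rrd"):
                count += 1
                total += os.path.getsize(os.path.join(root, name))
    return count, total


if __name__ == "__main__":
    # Assumption: the first argument is the Smokeping data directory.
    data_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    count, total = rrd_inventory(data_dir)
    print(f"{count} RRD files, {total / (1024 * 1024):.1f} MiB in total")
```
</pre><span style=" font-family:'Courier New'; font-size: 9pt;">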
Finally:<br>
>Is fping being run as soon as the cgi script is executed from the webserver?<br>
<br>
You appear to misunderstand how SP works. The daemon runs fping, logs the results, and writes them to the RRDs. The CGI only reads data from the RRDs and generates graphs for the HTTP output; it never runs fping itself.<br>
<br>
It appears from the SP debug log that writing the data went fine. [At least for the small subset of targets.]<br>
However, the CGI appears to fail or time out when reading the RRDs to generate the graphs. [Or when reading something, in any case.]<br>
<br>
Is SELinux or AppArmor running? If so, stop it or switch it to permissive mode (complain mode for AppArmor) and see if that helps.<br>
<br>
<br>
-Greg<br>
<br>
</span><table>
<tr>
<td width=2 bgcolor= #0000ff><br>
</td>
<td width=1098><span style=" font-family:'courier new'; font-size: 9pt;">Forgot to add the smoke.log:<br>
<br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://pastebin.com/20UbvJVx">http://pastebin.com/20UbvJVx</a><br>
<br>
<span style=" font-family:'courier new'; font-size: 9pt;">At the bottom of the log you can see that I also tried timing fping (the same command that smokeping was running), and it looks like it took 19.3 seconds to run for a small number of machines. Would that cause it to time out? Is fping being run as soon as the cgi script is executed from the webserver?<br>
<br>
<br>
<br>
On Tue, Mar 4, 2014 at 4:10 PM, Brett Bronson <</span><a style=" font-family:'courier new'; font-size: 9pt;" href="mailto:brett.bronson@bigblockla.com">brett.bronson@bigblockla.com</a><span style=" font-family:'courier new'; font-size: 9pt;">> wrote:<br>
Here is the apache error log that is listing smokeping:<br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://pastebin.com/Knm1Cmw1">http://pastebin.com/Knm1Cmw1</a><br>
<br>
<span style=" font-family:'courier new'; font-size: 9pt;">As for debug mode, here's my output:<br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://pastebin.com/8txnhnkv">http://pastebin.com/8txnhnkv</a><br>
<br>
<span style=" font-family:'courier new'; font-size: 9pt;">The host names do resolve; here's an example:<br>
<span style=" color: #444444;">[04:07 PM]superuser@pipeline[/opt/smokeping/bin] > time fping larender001a<br>
larender001a is alive<br>
<br>
real 0m0.014s<br>
user 0m0.000s<br>
sys 0m0.000s<br>
<br>
<br>
<br>
<span style=" color: #000000;">On Tue, Mar 4, 2014 at 3:32 PM, Brett Bronson <</span></span></span><a style=" font-family:'courier new'; font-size: 9pt;" href="mailto:brett.bronson@bigblockla.com">brett.bronson@bigblockla.com</a><span style=" font-family:'courier new'; font-size: 9pt;">> wrote:<br>
Also, it looks like the version I have running is actually the latest; I had assumed it would report the version as 2.6.9. Sorry.<br>
<br>
<br>
On Tue, Mar 4, 2014 at 3:29 PM, Brett Bronson <</span><a style=" font-family:'courier new'; font-size: 9pt;" href="mailto:brett.bronson@bigblockla.com">brett.bronson@bigblockla.com</a><span style=" font-family:'courier new'; font-size: 9pt;">> wrote:<br>
Okay, it looks like I was actually using an older version of smokeping. I've removed it and installed the latest version on the site and my config is as follows: <br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://pastebin.com/ZsLE8uCp">http://pastebin.com/ZsLE8uCp</a><br>
<br>
<span style=" font-family:'courier new'; font-size: 9pt;">Before, I was able to get smokeping to work fine up until I added the section:<br>
<br>
+ nodes<br>
menu = Render Node Latency<br>
title = Render Node Latency (ICMP Pings)<br>
<br>
++ larender001a<br>
host = larender001a<br>
++ larender001b<br>
host = larender001b<br>
++ larender001c<br>
host = larender001c<br>
++ larender001d<br>
host = larender001d<br>
<br>
++ larender002a<br>
host = larender002a<br>
++ larender002b<br>
host = larender002b<br>
++ larender002c<br>
host = larender002c<br>
++ larender002d<br>
host = larender002d<br>
<br>
<br>
<br>
Now that I look at the logs, it looks like it's still using the old version....<br>
[ ... ]<br>
Tue Mar 4 15:03:05 2014 - FPing: probing 5 targets with step 300 s and offset 116 s.<br>
Tue Mar 4 15:16:01 2014 - Smokeping version 2.006009 successfully launched.<br>
Tue Mar 4 15:16:01 2014 - Not entering multiprocess mode for just a single probe.<br>
Tue Mar 4 15:16:01 2014 - FPing: probing 13 targets with step 300 s and offset 163 s.<br>
Tue Mar 4 15:25:59 2014 - Smokeping version 2.006009 successfully launched.<br>
Tue Mar 4 15:25:59 2014 - Not entering multiprocess mode for just a single probe.<br>
Tue Mar 4 15:25:59 2014 - FPing: probing 13 targets with step 300 s and offset 159 s.<br>
<br>
Before, I used sudo apt-get install smokeping to install, but I later removed it using sudo apt-get remove smokeping; however, it looks like it didn't remove the old version? Any idea how I could resolve this so that it loads up the newer version?<br>
<br>
<br>
<br>
<br>
<br>
On Tue, Mar 4, 2014 at 2:28 PM, Gregory Sloop <</span><a style=" font-family:'courier new'; font-size: 9pt;" href="mailto:gregs@sloop.net">gregs@sloop.net</a><span style=" font-family:'courier new'; font-size: 9pt;">> wrote:<br>
I don't see a database section, so I assume it's somewhere else. [Nothing looks obviously wrong - but that was just a quick glance.]<br>
<br>
But when you first start SP after adding a bunch of targets, it has to allocate and create an RRD for each of the new targets. <br>
[Also, are there slaves? If so, it will create X * 60 new RRDs, where X is the number of slave SP instances you have, in addition to the master's RRDs.]<br>
<br>
I wouldn't think that would take 10m, but I can't see how much data you're stuffing in each RRD, or if you have slaves, which might help explain it.<br>
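<br>
To make the arithmetic concrete, here is a tiny sketch of the count described above. The function name is invented for this example, and the one-RRD-per-target-per-slave rule is just the assumption stated in this message.<br>
</span><pre>
```python
def expected_rrd_files(targets, slaves=0):
    """One RRD per target on the master, plus one per target for each slave."""
    return targets * (1 + slaves)


if __name__ == "__main__":
    print(expected_rrd_files(60))     # 60 targets, no slaves
    print(expected_rrd_files(60, 2))  # 60 targets, two slaves
```
</pre><span style=" font-family:'courier new'; font-size: 9pt;">
With 60 targets that is 60 files for the master alone, and 180 in total if there were two slaves.<br>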
<br>
As to why web pages won't work, I'm not sure. Have you looked at the apache logs to see what they say? Or run SP in debug mode? [smokeping --debug, IIRC]<br>
<br>
-Greg<br>
<br>
</span><table>
<tr>
<td width=3 bgcolor= #0000ff><br>
</td>
<td width=1021><span style=" font-family:'courier new'; font-size: 9pt;">Hello,<br>
<br>
I recently updated my smokeping Target configuration to include about 60 of our machines in our render farm and noticed that restarting the smokeping service took about 10 minutes, and now our webpage will not load.<br>
<br>
Any ideas?<br>
<br>
My config:<br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://pastebin.com/ibNmGhAF">http://pastebin.com/ibNmGhAF</a><br>
<br>
<br>
<span style=" font-family:'courier new'; font-size: 9pt;">-- <br>
Brett Bronson<br>
Big Block | Pipeline TD<br>
</span><a style=" font-family:'courier new'; font-size: 9pt;" href="http://www.bigblockla.com">http://www.bigblockla.com</a><br>
<span style=" font-family:'courier new'; font-size: 9pt;">[m] </span><a style=" font-family:'courier new'; font-size: 9pt;" href="tel:805-338-6520">805-338-6520</a></td>
</tr>
</table>
<br><br>
</td>
</tr>
</table>
<br><br>
<span style=" font-family:'arial'; color: #c0c0c0;"><i>-- <br>
Gregory Sloop, Principal: Sloop Network & Computer Consulting<br>
Voice: 503.251.0452 x82<br>
EMail: </i></span><a style=" font-family:'arial';" href="mailto:gregs@sloop.net">gregs@sloop.net</a><br>
<a style=" font-family:'arial';" href="http://www.sloop.net">http://www.sloop.net</a><br>
<span style=" font-family:'arial'; color: #c0c0c0;"><i>---</body></html>