<html><head><title>Re: [smokeping-users] Smokeping performance and scalability on high resolution probes</title>
<META http-equiv=Content-Type content="text/html; charset=iso-8859-1">
</head>
<body>
<span style=" font-family:'Courier New'; font-size: 9pt;">A step time of five seconds is pretty short. [An understatement]<br>
<br>
I can't imagine you need that level of granularity. And if you keep any substantial history your RRD database will be quite large.<br>
<br>
I use a 60 second step for production, and I've never felt I needed higher resolution. If I've got some problem that's occurring at intervals of less than one minute, I'm probably going to be using Wireshark or some other real-time tool to capture and evaluate it. Obviously you'll have to decide for yourself what level of granularity you need, but IMO a 5 second poll is serious overkill.<br>
<br>
I'm not even sure that SP can handle a sub-60-second step. I seem to recall tinkering with a 30 second poll and it wigged out, but that was a long time ago and my recollection is fuzzy.<br>
<br>
I'd guess that if you can live with a >=60 sec poll, most of the issues you're complaining about go away.<br>
<br>
---<br>
That said:<br>
There are things you can do to change fping's behavior; see the "-T", "-t", "-n" and "-p" options.<br>
By sending pings faster and closer together, not waiting as long for replies, etc., you can pump more of them out.<br>
This can obviously cause *lots* of other problems and hurt the accuracy and usefulness of fping and Smokeping, but it may be appropriate in some cases. [I assume you're trying to measure very high-speed links, which would explain the <60 second polls; in that case, running fping in a very high-performance mode may make sense. If you do this to a DSL connection, for example, expect completely bogus/wrong/misleading results.]<br>
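<br>
To put rough numbers on that trade-off, here is a back-of-the-envelope sketch in Python. This is my own simplified model, not anything fping or Smokeping actually computes: it assumes a round can't finish faster than the spacing between pings to one target, or the spacing between packets to any target, allows, plus one timeout spent waiting for the last reply. The function names, parameter names and the two gap values are made up for illustration; the 30 targets, 5 pings and 1.5 sec timeout come from your message.<br>
<br>
```python
def estimate_round_seconds(targets, pings, per_target_gap_s, per_packet_gap_s, timeout_s):
    """Rough lower bound on one probe round (simplified, hypothetical model)."""
    # Time forced by spacing successive pings to the same target.
    per_target_time = (pings - 1) * per_target_gap_s
    # Time forced by spacing every packet sent to any target.
    all_packets_time = (targets * pings - 1) * per_packet_gap_s
    # The slower constraint dominates; then wait up to one timeout for the last reply.
    return max(per_target_time, all_packets_time) + timeout_s


def fits_in_step(round_seconds, step_seconds):
    """True if the estimated round finishes within one step."""
    return round_seconds <= step_seconds


if __name__ == "__main__":
    # Example gap values only; they are assumptions, not anyone's real settings.
    est = estimate_round_seconds(targets=30, pings=5, per_target_gap_s=1.0,
                                 per_packet_gap_s=0.01, timeout_s=1.5)
    for step in (5, 60):
        print(f"step={step}s estimated_round={est:.2f}s fits={fits_in_step(est, step)}")
```
<br>
Under those assumed gaps the estimate doesn't fit in a 5 second step but fits a 60 second step with lots of room to spare, which is the point: you either shrink the gaps and the timeout (and accept the accuracy problems) or lengthen the step.<br>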
<br>
I think there are similar things you can do with TCPPing, but I can't recall the details.<br>
<br>
---<br>
Given your DB settings, you're keeping ~53 hours of full-res data, but only 3 days of 60 sec data and 6 days of 12 min data. Those seem like really odd choices for data retention to me... Smokeping is really not intended, IMO, to be a real-time, very-short-granularity tool. Its strength is moderate granularity and long-term data retention that lets you see trends and long-term patterns. So I think you're using the tool in ways it was never intended to be used, and you're having problems as a result.<br>
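<br>
Here's the arithmetic behind those retention numbers as a quick Python sketch. I'm assuming the 5 second step you describe in your text (the config you pasted says step = 3, which would shrink everything proportionally); the function name is just for illustration. Each row of your Database section keeps "total" rows, and each row covers "steps" times the step length.<br>
<br>
```python
def retention_seconds(step_s, steps_per_row, rows):
    """Total time span covered by one archive line (consfn mrhb steps total)."""
    return step_s * steps_per_row * rows


STEP_S = 5  # assumed step in seconds, taken from the text of the question
ARCHIVES = [(1, 38400), (12, 4320), (144, 720)]  # (steps, total) from the pasted config

if __name__ == "__main__":
    for steps, rows in ARCHIVES:
        resolution = STEP_S * steps
        hours = retention_seconds(STEP_S, steps, rows) / 3600
        print(f"{resolution:>4d} s resolution kept for {hours:.1f} h ({hours / 24:.1f} days)")
```
<br>
That works out to roughly 53 hours at 5 sec, 3 days at 60 sec and 6 days at 12 min resolution.<br>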
<br>
It would be like using MRTG with a 1 second poll.<br>
<br>
HTH<br>
<br>
-Greg<br>
<br>
</span><table>
<tr>
<td width=3 bgcolor= #0000ff><br>
</td>
<td width=1021><br><br>
<span style=" font-family:'courier new'; font-size: 9pt;">Hi,<br>
<br>
I'm currently trying to configure a Smokeping server to diagnose our network.<br>
<br>
I have managed to make it work, but <span style=" font-size: 12pt;">I'm having a couple of issues:<br>
<br>
<span style=" font-size: 9pt;">1) How can I reduce the number of pings without getting this error when Smokeping generates the graph (in this example, I have set the config to 10 pings):<br>
<br>
"ERROR: No DS called 'ping11' in &lt;rrd file&gt;"<br>
<br>
2) The other problem is with the performance of tcpping or fping. The probe is working, but it takes too long to complete. As my step is 5 seconds, I need the probe to be as quick as possible.<br>
Sure, if I can reduce the number of pings, I will gain some time. The concurrentprobes variable is on (tried off too). <br>
<br>
I currently have 30 targets running with FPing, which completes in 5 sec (the limit for my 5 sec interval).<br>
If I switch to TCPPing, it takes 42 sec.<br>
<br>
My config: <br>
<br>
*** Database ***<br>
step = 3<br>
pings = 5<br>
<br>
# consfn mrhb steps total<br>
<br>
AVERAGE 0.5 1 38400<br>
AVERAGE 0.5 12 4320<br>
MIN 0.5 12 4320<br>
MAX 0.5 12 4320<br>
AVERAGE 0.5 144 720<br>
MAX 0.5 144 720<br>
MIN 0.5 144 720<br>
<br>
*** Presentation ***<br>
<br>
template = /opt/smokeping/etc/basepage.html<br>
<br>
+ overview<br>
<br>
width = 600<br>
height = 50<br>
range = 1h<br>
<br>
+ detail<br>
<br>
width = 600<br>
height = 200<br>
unison_tolerance = 2<br>
<br>
"Last 3 Hours" 3h<br>
"Last 30 Hours" 30h<br>
"Last 10 Days" 10d<br>
"Last 400 Days" 400d<br>
<br>
*** Probes ***<br>
<br>
+ TCPPing<br>
<br>
binary = /usr/bin/tcpping<br>
forks = 500<br>
offset = random<br>
step = 3<br>
timeout = 2<br>
pings = 5<br>
port = 22<br>
<br>
+ FPing<br>
binary = /usr/sbin/fping<br>
blazemode = true<br>
hostinterval = 0.001<br>
mininterval = 0.001<br>
offset = random<br>
packetsize = 12<br>
pings = 5<br>
step = 3<br>
timeout = 1.5<br>
<br>
The machine : <br>
model name : Intel(R) Xeon(R) CPU X5660 @ 2.80GHz<br>
MemTotal: 1922436 kB<br>
<br>
The OS :<br>
CentOS release 6.5 (Final)<br>
Linux 2.6.32-431.11.2.el6.x86_64 #1 SMP Tue Mar 25 19:59:55 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux<br>
<br>
The App:<br>
SmokePing-2.6.9<br>
fping: Version 3.10<br>
tcpping v1.7 Richard van den Berg<br>
RRDtool 1.3.8<br>
___________________<br>
<i>Louis</i></td>
</tr>
</table>
<br><br>
</body></html>