[smokeping-users] Re: Scalability
Dan Tucny
dan at tucny.com
Sat Jun 22 03:29:19 MEST 2002
See my reply to the 'question on probe timing' thread about this;
however, I'll go into some more detail specific to your problems here...
The theoretical limit with default fping settings would be 600 hosts in 5
mins.
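The 600 figure can be reproduced with a little arithmetic. This is only a sketch: it assumes 20 pings per host (as in the -C 20 tests in this message), one packet leaving every 25ms, and a 5 minute window. The function and parameter names are made up for illustration.

```python
def max_hosts(window_s=300.0, pings_per_host=20, interval_ms=25.0):
    """Upper bound on hosts per window if one packet is sent every interval_ms."""
    # Each host costs pings_per_host packets, each taking one interval slot.
    seconds_per_host = pings_per_host * interval_ms / 1000.0
    return int(window_s / seconds_per_host)

# 20 pings x 25 ms = 0.5 s of sending time per host
print(max_hosts())
```

Passing a different interval_ms shows how the bound scales with the per-packet gap.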
Failures shouldn't affect timing, as there is a 500ms timeout on pings,
so with the 1 second between pings to the same host that gives a whole
half second of spare time between the packet being flagged MIA and the
host next being sampled...
In reality I'd work with closer to 500 hosts in 5 mins as the limit for
fping, based on some very simple tests I have done, i.e.
fping -C 20 -q -s <40 reachable, 40 unreachable hosts>
80 targets
40 alive
40 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
800 ICMP Echo Replies received
53 other ICMP received
0.38 ms (min round trip time)
0.45 ms (avg round trip time)
1.94 ms (max round trip time)
48.886 sec (elapsed real time)
fping -C 20 -q -s <80 reachable hosts>
80 targets
80 alive
0 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
1600 ICMP Echo Replies received
0 other ICMP received
0.35 ms (min round trip time)
0.42 ms (avg round trip time)
1.95 ms (max round trip time)
48.856 sec (elapsed real time)
fping -C 20 -q -s <80 unreachable hosts>
80 targets
0 alive
80 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
0 ICMP Echo Replies received
89 other ICMP received
0.00 ms (min round trip time)
0.00 ms (avg round trip time)
0.00 ms (max round trip time)
48.951 sec (elapsed real time)
To increase the sample rate, I've also tried fping with, for example, -i
12.5 to reduce the default per-packet wait time from 25ms to 12.5ms,
which should result in a theoretical limit of 1200 hosts in 5 minutes;
however, from the results I've obtained here, it looks closer to 750
hosts...
fping -C 20 -q -s -i 12.5 <40 reachable, 40 unreachable hosts>
80 targets
40 alive
40 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
800 ICMP Echo Replies received
36 other ICMP received
0.35 ms (min round trip time)
0.42 ms (avg round trip time)
1.65 ms (max round trip time)
32.863 sec (elapsed real time)
fping -C 20 -q -s -i 12.5 <80 reachable hosts>
80 targets
80 alive
0 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
1600 ICMP Echo Replies received
0 other ICMP received
0.36 ms (min round trip time)
0.41 ms (avg round trip time)
2.05 ms (max round trip time)
32.866 sec (elapsed real time)
fping -C 20 -q -s -i 12.5 <80 unreachable hosts>
80 targets
0 alive
80 unreachable
0 unknown addresses
0 timeouts (waiting for response)
1600 ICMP Echos sent
0 ICMP Echo Replies received
62 other ICMP received
0.00 ms (min round trip time)
0.00 ms (avg round trip time)
0.00 ms (max round trip time)
32.958 sec (elapsed real time)
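The practical limits quoted in this message can be checked by scaling the measured runs up to a 5 minute window. The sketch below assumes run time grows linearly with the number of hosts, which these 80-host tests do not prove; the elapsed times are copied from the fping -s summaries above, and the names are illustrative only.

```python
WINDOW_S = 300.0  # one 5 minute window

# (label, hosts, elapsed seconds) taken from the fping -s summaries above
runs = [
    ("default, mixed", 80, 48.886),
    ("default, all reachable", 80, 48.856),
    ("default, all unreachable", 80, 48.951),
    ("-i 12.5, mixed", 80, 32.863),
    ("-i 12.5, all reachable", 80, 32.866),
    ("-i 12.5, all unreachable", 80, 32.958),
]

def hosts_per_window(hosts, elapsed_s, window_s=WINDOW_S):
    """Linear extrapolation: hosts that fit in window_s at the measured rate."""
    return int(hosts * window_s / elapsed_s)

for label, hosts, elapsed in runs:
    print(f"{label:26s} {hosts_per_window(hosts, elapsed):4d} hosts")
```

Under that assumption the default runs come out at roughly 490 hosts and the -i 12.5 runs at roughly 730, which is where the rounded figures of 500 and 750 come from.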
The debug output you have below is due to fping always reporting errors,
even when run with -q; this shouldn't affect the runtime of fping itself,
though.
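If those error lines get in the way when reading fping output by script, they can be skipped. The following is a hedged sketch, not Smokeping's own parsing code: it assumes per-host result lines of the form "host : rtt rtt - rtt" (with "-" for a lost ping) and error lines starting with "ICMP ", as the fragments in the truss output below suggest. The sample lines and the function name are invented for illustration.

```python
def parse_fping_lines(lines):
    """Return {host: [rtt_ms or None, ...]}, ignoring ICMP error chatter."""
    results = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("ICMP "):
            continue  # e.g. "ICMP Time Exceeded from ... for ICMP Echo sent to ..."
        host, sep, samples = line.partition(" : ")
        if not sep:
            continue  # not a "host : rtt rtt ..." line
        # "-" marks a lost ping; keep it as None so loss can still be counted
        results[host.strip()] = [
            None if s == "-" else float(s) for s in samples.split()
        ]
    return results

# Made-up sample input in the assumed format
sample = [
    "ICMP Time Exceeded from 10.55.0.11 for ICMP Echo sent to 172.31.56.2",
    "192.0.2.10 : 0.38 0.45 - 0.41",
    "172.31.56.2 : - - - -",
]
print(parse_fping_lines(sample))
```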
This is of course purely looking at fping; the time taken for Smokeping
to process these results also has to be taken into consideration, though
I don't have any timings for that...
I hope this is helpful to you...
Dan
-----Original Message-----
From: smokeping-users-bounce at list.ee.ethz.ch
[mailto:smokeping-users-bounce at list.ee.ethz.ch] On Behalf Of Marc Powell
Sent: 20 June 2002 01:30
To: Tobias Oetiker
Cc: Smokeping
Subject: [smokeping-users] Re: Scalability
Sure thing. Here is what I have done: I created a test smokeping binary
that points to my original config file with 546 hosts on this particular
data collector. I ran it with -debug and -nodaemon (I think debug
implies nodaemon, but I wanted to cover all bases).
# [smokep at dc2 ~/bin]date ; ./smokeping.test -debug -nodaemon ; date
Wed Jun 19 19:20:10 CDT 2002
### fping seems to report in 1 miliseconds
Launched successfully
FPing: probing 546 targets
Wed Jun 19 19:28:11 CDT 2002
This 8 minute duration seems to be fairly consistent, at least right now
;)
Here's a snippet of truss about 4 minutes into the run:
[smokep at dc2 ~]$ date
Wed Jun 19 19:23:49 CDT 2002
[smokep at dc2 ~]$ truss -fea -p 14837
14837: psargs: /usr/local/bin/perl -w ./smokeping.test -debug -nodaemon
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 69
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, " I C M P T i m e E x".., 5120) = 74
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, " I C M P T i m e E x".., 5120) = 75
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 18
14837: read(7, " f r o m ", 5120) = 6
14837: read(7, " 1 0 . 5 5 . 0 . 1 1", 5120) = 10
14837: read(7, " f o r I C M P E c".., 5120) = 23
14837: read(7, " 1 7 2 . 3 1 . 5 6 . 2", 5120) = 11
14837: read(7, "\n", 5120) = 1
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, " I C M P T i m e E x".., 5120) = 74
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, " I C M P T i m e E x".., 5120) = 75
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
14837: read(7, " I C M P T i m e E x".., 5120) = 69
14837: read(7, " I C M P T i m e E x".., 5120) = 70
14837: read(7, 0x004E380C, 5120) (sleeping...)
^C[smokep at dc2 ~]$ date
Wed Jun 19 19:24:44 CDT 2002
If there is anything else that I can provide that would be of
assistance, please don't hesitate to let me know.
Thanks,
Marc
-----Original Message-----
From: Tobias Oetiker [mailto:oetiker at ee.ethz.ch]
Sent: Wed 6/19/2002 5:26 PM
To: Marc Powell
Cc: Smokeping
Subject: Re: [smokeping-users] Re: Scalability
Yesterday Marc Powell wrote:
> The only major problem I am having is that I see gaps in the graphs
> (10-15 minutes) for those regions with relatively high numbers of hosts
> down (20-30). We're monitoring schools so it's the off season here in
> the US and the routers fluctuate depending on what maintenance is going
> on at the schools, whether the janitor has spilt his coffee in the
> router, etc... I am attributing the gaps to the slower response time for
> ICMP UNREACHABLE's from fping, which lengthens the overall time it takes
> before smokeping spawns the next run to 10-15 minutes or longer. Since
> smokeping appears to wait until the fping process terminates before
> writing to the RRDs or spawning the next fping process, the gaps are
> appearing for all hosts in a region. To minimize the number of hosts
> affected by this problem, I have just implemented unique configurations
> for each alphabetical grouping per region so that I can spawn a
> smokeping daemon for each grouping as opposed to each region (i.e. 5
> smokeping processes per data collector). As a result, I have a feature
> request or two to make things easier:
try running smokeping by hand, at least in theory it should ping
ALL the hosts in your config in parallel. The time it will wait for
a 'lost' packet is about 1 second at most, so this means in theory an
fping run is over in 20 seconds regardless of the number of
machines involved. Now there is a small gap between each icmp
packet sent out from fping, so there is an impact per machine, but
it should not at all depend on how long the machine takes to answer
... this after all is the whole motivation behind fping ...
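The two effects described here (about 1 second per round of pings, plus a small gap per packet) can be combined into a rough lower bound on run time. This is a simplified round-robin model and an assumption, not a description of fping's actual scheduler: it takes 20 pings per host, the 25ms per-packet gap and the 1 second per-host period mentioned elsewhere in this thread, and ignores all other overhead. The names are illustrative.

```python
def min_run_time_s(hosts, pings=20, interval_ms=25.0, period_ms=1000.0):
    """Lower bound on one fping run under a simple round-robin model."""
    # A round over all hosts cannot be shorter than the per-host period,
    # nor shorter than the time needed to send one packet to every host.
    round_ms = max(hosts * interval_ms, period_ms)
    return pings * round_ms / 1000.0

for n in (10, 80, 546):
    print(n, min_run_time_s(n))
```

With few hosts the bound stays at 20 seconds regardless of their number; once hosts times the gap exceeds the period, the per-packet gap dominates and the bound grows with every host added. For 546 hosts it is already 273 seconds, before any real-world overhead.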
> 1) Add a pidfile directive to either complement or replace
> piddir. Currently, it is necessary to either create a directory for each
> pid file specifically or remove the pidfile before starting the next
> smokeping process.
running multiple smokeping processes is not the solution ... if
fping has a bug, we will fix fping ...
> 2) The ability to INCLUDE external files within a config file.
> This should help cut down on the number of unique files I'm having to
> create.
this is already there ... check the documentation on
ISG::ParseConfig
cheers
tobi
--
______ __ _
/_ __/_ / / (_) Oetiker, OETIKER+PARTNER AG, Gallusstrasse 25
/ // _ \/ _ \/ / CH-4600 Olten, phoneto:+41(0)62 213 9909
/_/ \.__/_.__/_/ tobi at oetiker.ch http://google.com/search?q=tobi
--
Unsubscribe
mailto:smokeping-users-request at list.ee.ethz.ch?subject=unsubscribe
Help mailto:smokeping-users-request at list.ee.ethz.ch?subject=help
Archive http://www.ee.ethz.ch/~slist/smokeping-users
WebAdmin http://www.ee.ethz.ch/~slist/lsg2.cgi