[smokeping-users] Re: Scalability
Marc Powell
mpowell at ena.com
Wed Jun 19 20:43:45 MEST 2002
All,
Well, I just wanted to give an update on my progress as some of you
might find it interesting or at least mildly entertaining (look at the
fool! Look at him!). I am successfully (more or less) monitoring 1494
hosts utilizing 4 Sun Ultra5's as data collectors which rsync the
resulting RRD files every 3 hours to a Sun E420r for presentation
purposes (and possibly importing into a DB in the future). The impact of
smokeping on the operation of the machines is absolutely negligible
(thanks Tobi!). Each data collector has its own config and the central
host has a master config that is the accumulation of all the dc configs.
Each data collector is responsible for a particular region (currently
4). In the master config, the tree is laid out in the following manner:
<state>
    <region1>
        <a-e>
            <hosts that start with a..e>
        <f-j>
        <k-o>
        etc...
    <region2>
        <a-e>
        etc...
With this layout, the highest number of hosts in a particular view is
about 175, which displays in about 15 seconds.
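For anyone wanting to reproduce the alphabetical split, here is a minimal sketch of bucketing hostnames by first letter. It is only an illustration, not how the configs were actually generated: the ranges after k-o are assumed (the layout above elides them), and the function names and sample hostnames are made up.

```python
from collections import defaultdict

# Letter ranges matching the <a-e>, <f-j>, <k-o> branches in the layout.
# "p-t" and "u-z" are assumptions; the original elides them with "etc...".
RANGES = ["a-e", "f-j", "k-o", "p-t", "u-z"]


def bucket_for(hostname):
    """Return the range label for a hostname, or "other" if it does not
    start with a letter a..z (for example an IP address)."""
    first = hostname[:1].lower()
    for label in RANGES:
        low, high = label.split("-")
        if first and low <= first <= high:
            return label
    return "other"


def group_hosts(hostnames):
    """Map each range label to the sorted list of hostnames that fall in it."""
    groups = defaultdict(list)
    for name in sorted(hostnames):
        groups[bucket_for(name)].append(name)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical hostnames, for illustration only.
    hosts = ["adams-elem", "franklin-hs", "lincoln-ms", "Zion-elem", "10.0.0.1"]
    for label, members in group_hosts(hosts).items():
        print(label, members)
```

Each resulting group would then become one branch under a region in the master config.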
The only major problem I am having is that I see gaps in the graphs
(10-15 minutes) for those regions with relatively high numbers of hosts
down (20-30). We're monitoring schools so it's the off season here in
the US and the routers fluctuate depending on what maintenance is going
on at the schools, whether the janitor has spilt his coffee in the
router, etc... I attribute the gaps to the slower response time for
ICMP unreachables in fping, which lengthens the overall time it takes
before smokeping spawns the next run to 10-15 minutes or longer. Since
smokeping appears to wait until the fping process terminates before
writing to the RRDs or spawning the next fping process, the gaps
appear for all hosts in a region. To minimize the number of hosts
affected by this problem, I have just implemented unique configurations
for each alphabetical grouping per region so that I can spawn a
smokeping daemon for each grouping as opposed to each region (i.e. 5
smokeping processes per data collector). As a result, I have a feature
request or two to make things easier:
1) Add a pidfile directive to either complement or replace
piddir. Currently, it is necessary to either create a directory for each
pid file specifically or remove the pidfile before starting the next
smokeping process.
2) The ability to INCLUDE external files within a config file.
This should help cut down on the number of unique files I'm having to
create.
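Until something like request 2 exists, one stopgap is to expand includes in a small preprocessing step before handing the result to smokeping. The following is a rough sketch under stated assumptions: the "@include" directive name is hypothetical and chosen only for this example, and the function is not part of smokeping.

```python
import os

# "@include" is a hypothetical directive name used only in this sketch.
INCLUDE_PREFIX = "@include "


def expand_includes(path, _seen=None):
    """Return the lines of the file at `path`, with every include line
    replaced by the (recursively expanded) lines of the named file."""
    seen = _seen if _seen is not None else set()
    real = os.path.realpath(path)
    if real in seen:
        raise ValueError(f"include loop at {path}")
    # Track only the current include chain, so the same file may be
    # included from two different places without being flagged as a loop.
    seen = seen | {real}

    out = []
    with open(path) as handle:
        for line in handle:
            stripped = line.strip()
            if stripped.startswith(INCLUDE_PREFIX):
                target = stripped[len(INCLUDE_PREFIX):].strip()
                # Relative paths are resolved against the including file.
                target = os.path.join(os.path.dirname(path), target)
                out.extend(expand_includes(target, seen))
            else:
                out.append(line.rstrip("\n"))
    return out
```

With this, the settings shared by every grouping could live in one file, and each per-grouping config would only carry its own host list plus an include line.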
Well, that's it. I certainly welcome any and all feedback on ways to
improve on what I've done, what I'm totally doing wrong or anything
else...
--
Marc
-----Original Message-----
From: Tobias Oetiker [mailto:oetiker at ee.ethz.ch]
Sent: Sunday, June 16, 2002 10:31 AM
To: Marc Powell
Cc: Smokeping
Subject: Re: [smokeping-users] Scalability
Today Marc Powell wrote:
> Tobi et al,
>
> I am really interested in scaling up my use of Smokeping
> significantly (on the order of 1500 hosts) and wanted to get
> the list's opinion on how well this might work. My chief concern
> is how well the smokeping.cgi will behave with that many hosts to
> display. I would have a number of sub-menus and the largest
> submenu might have maybe 300 hosts in it. I know that there are
> also likely to be concerns about data collection but I plan to
> implement a number of external data collectors and rsync the rrd
> files back to the central reporting station on a regular basis.
> I've been quite successful doing this for a Cricket installation
> and don't see any major roadblocks to this. Are there any other
> issues that I might have to contend with that could be classified
> as show-stoppers? Any other thoughts?
Hi Marc,
The polling part should work fine as long as you use fping. As for
the CGI, creating 300 graphs will take on the order of 1 minute,
so I would suggest you order your tree so that not too
many hosts sit on the same branch; then it should be no
problem. Smokeping is certainly much more efficient at
generating graphs than Cricket, as all the graphs get generated by
the process which displays the web page, and not through additional
processes which get launched upon display of the graph (maybe
Cricket has fixed this in the meantime).
tobi
>
> TIA,
>
> Marc
>
>
> --
> Unsubscribe mailto:smokeping-users-request at list.ee.ethz.ch?subject=unsubscribe
> Help mailto:smokeping-users-request at list.ee.ethz.ch?subject=help
> Archive http://www.ee.ethz.ch/~slist/smokeping-users
> WebAdmin http://www.ee.ethz.ch/~slist/lsg2.cgi
>
--
______ __ _
/_ __/_ / / (_) Oetiker, ETZ J97, ETH, 8092 Zurich, Switzerland
/ // _ \/ _ \/ / phoneto:+41(0)1-632-5286 faxto:+41(0)1-632-1517
/_/ \.__/_.__/_/ oetiker at ee.ethz.ch http://google.com/search?q=tobi