[mrtg] Parsing HTML GUI
McDonald, Dan
Dan.McDonald at austinenergy.com
Tue Sep 2 14:32:54 CEST 2008
On Mon, 2008-09-01 at 18:11 +0100, Mick wrote:
> On Wednesday 27 August 2008, McDonald, Dan wrote:
> > >(First message to the list!)
> > >
> > >My modem (a 2WIRE 1800HG used in fully-bridged mode) does not offer telnet
> > > or SNMP access. I can only get to it via http/s. Unfortunately, I have
> > > no knowledge of perl, to be able to hack a script for this purpose.
> >
> > I wrote up something like that a long time ago. I used LWP.pm as the main
> > parsing engine. Unfortunately, that was several jobs ago, so I don't
> > believe I have the source.
>
> Thank you Dan, I have hacked something that looks as if it works, although it
> probably is a terrible kludge.
Actually, it looks pretty good. Now we just need to turn the goo your
modem spits out into something useful:
> > But LWP is a fairly simple module to use in perl, so I would recommend that
> > you take this as a great opportunity to be introduced to perl.
>
> . . . talking about a crash course in Perl! :p
But you won't be nearly as afraid of perl next time...
> > >Happy to post the HTML page with the stats if needed.
> >
> > Not until you get to the point that you need help with the regex.
>
> Hmm, I probably need a crash course in regex too. ;-)
> The data is all in tables in the html page. The titles are shown as:
> ========================================
> <td></td>
> <td class="columnheaderborder">Rate</td>
> <td class="columnheaderborder">Max1</td>
> <td class="columnheaderborder">Max2</td>
> <td class="columnheaderborder">Max3</td>
> <td class="columnheaderborder">Mgn1</td>
> <td class="columnheaderborder">Mgn2</td>
> <td class="columnheaderborder">Attn</td>
> <td class="columnheaderborder">Pwr</td>
> <td class="columnheaderborder">CRCs</td>
> <td class="columnheaderborder">FECs</td>
> </tr>
> <tr>
> ========================================
> but the values I am interested in are further down the page like so:
> ========================================
> <tr>
> <td nowrap="nowrap">+000 days 13:48:59</td>
> <td nowrap="nowrap">1</td>
> <td></td>
> <td nowrap="nowrap">7616</td>
> <td nowrap="nowrap">7616</td>
> <td nowrap="nowrap">7040</td>
> <td nowrap="nowrap">7040</td>
> <td nowrap="nowrap">6.0</td>
> <td nowrap="nowrap">3.0</td>
> <td nowrap="nowrap">38.0</td>
> <td nowrap="nowrap">20.2</td>
> <td nowrap="nowrap">5694</td>
> <td nowrap="nowrap">583127</td>
> <td></td>
> </tr>
> ========================================
# OK, let's convert everything into an array, split on </td> boundaries
my @data = split(m{</td>}, $req->content);
# Then see if we can parse this into something somewhat usable
my ($index, @header, %content);
$index = 0;
foreach my $datum (@data) {
    if ($datum =~ /columnheaderborder/) {
        my ($head) = ($datum =~ />(.+?)$/);
        push @header, $head;
    }
    if ($datum =~ /nowrap="nowrap"/) {
        my ($value) = ($datum =~ />(.+?)$/);
        $content{$header[$index]} = $value;
        $index++;
    }
}
# assuming there weren't any extraneous headers,
# we should now have a hash indexed by the headers:
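One caveat, going by the sample you posted: I count ten named headers but twelve nowrap cells in the data row (the uptime and a "1" come first), so matching the two lists by position from the left would file the uptime under Rate. Here is a rough standalone sketch that runs against a copy of your excerpt instead of the live page. It assumes the named columns line up with the last ten values in the row; that is only my guess from the numbers, so check it against the real page before trusting the graphs.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample cells copied from the excerpts quoted above (not a live fetch).
my $html = <<'END_HTML';
<td></td>
<td class="columnheaderborder">Rate</td>
<td class="columnheaderborder">Max1</td>
<td class="columnheaderborder">Max2</td>
<td class="columnheaderborder">Max3</td>
<td class="columnheaderborder">Mgn1</td>
<td class="columnheaderborder">Mgn2</td>
<td class="columnheaderborder">Attn</td>
<td class="columnheaderborder">Pwr</td>
<td class="columnheaderborder">CRCs</td>
<td class="columnheaderborder">FECs</td>
</tr>
<tr>
<td nowrap="nowrap">+000 days 13:48:59</td>
<td nowrap="nowrap">1</td>
<td></td>
<td nowrap="nowrap">7616</td>
<td nowrap="nowrap">7616</td>
<td nowrap="nowrap">7040</td>
<td nowrap="nowrap">7040</td>
<td nowrap="nowrap">6.0</td>
<td nowrap="nowrap">3.0</td>
<td nowrap="nowrap">38.0</td>
<td nowrap="nowrap">20.2</td>
<td nowrap="nowrap">5694</td>
<td nowrap="nowrap">583127</td>
<td></td>
</tr>
END_HTML

# Collect header names and cell values into two separate lists.
my (@header, @values);
foreach my $datum (split m{</td>}, $html) {
    if ($datum =~ /columnheaderborder/) {
        my ($head) = ($datum =~ />(.+?)$/);
        push @header, $head if defined $head;
    }
    elsif ($datum =~ /nowrap="nowrap"/) {
        my ($value) = ($datum =~ />(.+?)$/);
        push @values, $value if defined $value;
    }
}

# ASSUMPTION: the headers belong to the LAST scalar(@header) values,
# i.e. the leading uptime and "1" cells have no header of their own.
my $offset = @values - @header;
die "fewer values than headers\n" if $offset < 0;

my %content;
@content{@header} = @values[$offset .. $#values];

print "$content{Max1}\n$content{Max2}\n";
```

If the real page turns out to have more header cells than the excerpt shows, the offset will come out differently, which is why it is computed and not hard-coded.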
> I would like to capture and graph:
>
> Max1 Vs Max2
print "$content{'Max1'}\n$content{'Max2'}\n\n\n";
> Mgn1 Vs Mgn2
print "$content{'Mgn1'}\n$content{'Mgn2'}\n\n\n";
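In case the trailing newlines look odd: mrtg reads four lines from an external script, namely the first value, the second value, an uptime string and a target name. The prints above leave the last two blank. A small sketch of filling them in follows; mrtg_lines is a helper name I made up, 'modem' is a placeholder target name, and the values are just the ones from your sample.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the four lines mrtg reads from an external script:
# first value, second value, uptime string, target name.
# (mrtg_lines is a hypothetical helper, not part of mrtg or LWP.)
sub mrtg_lines {
    my ($first, $second, $uptime, $name) = @_;
    return join("\n", $first, $second, $uptime, $name) . "\n";
}

# Sample values from the page excerpt; 'modem' is a placeholder name.
print mrtg_lines('7616', '7040', '+000 days 13:48:59', 'modem');
```

Each pair you want to graph (Max1 vs Max2, Mgn1 vs Mgn2) would need its own four-line output, so one way to do it is to have the script take an argument saying which pair to print.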
--
Daniel J McDonald, CCIE #2495, CISSP #78281, CNX
Austin Energy
http://www.austinenergy.com