Thread TableParser: Reading data out of tables (9 answers)
Opened by TomBombadil at 2007-05-30 18:52

TomBombadil
 2007-06-06 10:45
#77074
User since
2007-05-30
4 posts
User
With your input I have now typed up the following code, including a part for the output. However, it does not create any txt files for me, and the script does not seem to stop at all when I run it :-( Could the solution lie in connecting the $ts and $te blocks with the OUTPUTFILE block? Hmm...

Code:
#!C:\perl\bin\perl.exe -w

# Purpose: Script for parsing html, especially information in tables
# Created by: Tom Bombadil, June 6, 2007
# Version: 0.2

print "Content-type: text/html\n\n";

use CGI::Carp qw(fatalsToBrowser);
use strict;
use HTML::TableExtract;

my $table;                                                 # table of interest
my $html_file = "http://www.securityfocus.com/bid";        # url of web site
my $te;                                                    # table extract
my $ts;                                                    # table search
my $row;                                                   # row of table of interest
my @securityfocus;                                         # array


@securityfocus=("Bugtraq ID: \n","Class: \n","CVE: \n","Remote: \n","Local: \n",
"Published: \n","Updated: \n","Credit: \n","Vulnerable: \n","Not Vulnerable: \n");
open(OUTPUTFILE,">bid.txt");
print OUTPUTFILE @securityfocus;
close(OUTPUTFILE);

open(OUTPUTFILE,"bid.txt");
while (<OUTPUTFILE>)
{
chomp;
print " $_ \n";
}
close(OUTPUTFILE);

# Depth represents how deeply a table resides in other tables. The depth of a top-level
# table in the document is 0. A table within a top-level table has a depth of 1, and so
# on. Each depth can be thought of as a layer; tables sharing the same depth are on the
# same layer. Within each of these layers, Count represents the order in which a table
# was seen at that depth, starting with 0. Providing both a depth and a count will
# uniquely specify a table within a document -> the table of interest is on the second
# level (depth = 1), the first one (count = 0).

for(1..30000) {
 my $table = $html_file."/".$_;
 $te = HTML::TableExtract->new( depth => 1, count => 0 );
 $te->parse_file($table);
}

foreach $ts ($te->tables) {
  print "Table found at ", join(',', $ts->coords), ":\n";
  foreach $row ($ts->rows) {
      print "   ", join(',', @$row), "\n";
   }
}
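
The depth/count rule described in the comment above can be tried out offline. This is only a minimal sketch: the HTML snippet in `$html` is made up, and `rows_at` is a hypothetical helper, not part of the script above. The only HTML::TableExtract calls used are `new`, `parse`, `eof`, `tables` and `rows`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Made-up sample document: one top-level table (depth 0, count 0)
# that contains two nested tables (depth 1, count 0 and depth 1, count 1).
my $html = <<'HTML';
<table>
  <tr>
    <td><table><tr><td>Bugtraq ID:</td><td>1</td></tr></table></td>
    <td><table><tr><td>Class:</td><td>Unknown</td></tr></table></td>
  </tr>
</table>
HTML

# Hypothetical helper: return the rows (array refs) of the table at
# ($depth, $count), or an empty list if the document has no such table.
sub rows_at {
    my ($doc, $depth, $count) = @_;
    my $te = HTML::TableExtract->new( depth => $depth, count => $count );
    $te->parse($doc);      # parse a string, not a file
    $te->eof;              # tell the parser that the document is complete
    my ($ts) = $te->tables;
    return $ts ? $ts->rows : ();
}

# Print the rows of both nested tables on the second layer (depth 1).
for my $count (0, 1) {
    for my $row ( rows_at($html, 1, $count) ) {
        print "depth 1, count $count: ",
              join(',', map { defined $_ ? $_ : '' } @$row), "\n";
    }
}
```

Asking for a depth/count pair that does not occur in the document (for example depth 2, count 0 here) simply yields no rows, which makes it easy to probe a page whose layout is not known in advance.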
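
One possible reading of the symptoms, offered as a guess rather than a diagnosis: `parse_file` expects a local file name or file handle rather than a URL, `$te` is created anew on every pass of the `for` loop so only the object from the last pass ever reaches the `foreach`, and bid.txt is already closed before any table data exists. The sketch below fetches, parses and writes inside a single loop. The helper `collect_bids`, the code ref `$fetch`, the canned page in `%pages`, the small id range and the output layout are all assumptions made for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical helper: for every id, fetch the page through the code ref
# $fetch, extract the table at depth 1 / count 0 and write its rows to
# $out_file. Returns the number of pages that contained such a table.
sub collect_bids {
    my ($base_url, $ids, $fetch, $out_file) = @_;
    open(my $out, '>', $out_file) or die "Cannot open $out_file: $!";
    my $found = 0;
    for my $id (@$ids) {
        my $html = $fetch->("$base_url/$id");
        next unless defined $html;            # page could not be fetched
        my $te = HTML::TableExtract->new( depth => 1, count => 0 );
        $te->parse($html);                    # parse the fetched string
        $te->eof;
        for my $ts ($te->tables) {            # used before $te is replaced
            $found++;
            print {$out} "== $base_url/$id ==\n";
            for my $row ($ts->rows) {
                print {$out} join(',', map { defined $_ ? $_ : '' } @$row), "\n";
            }
        }
    }
    close($out) or die "Cannot close $out_file: $!";
    return $found;
}

# Stand-in fetcher with one made-up page so that the sketch runs offline.
# For real use one could pass \&LWP::Simple::get (after "use LWP::Simple;").
my %pages = (
    'http://www.securityfocus.com/bid/1' =>
        '<table><tr><td><table><tr><td>Bugtraq ID:</td><td>1</td></tr>'
      . '</table></td></tr></table>',
);
my $fake_fetch = sub { return $pages{ $_[0] } };

my $n = collect_bids('http://www.securityfocus.com/bid', [1 .. 3],
                     $fake_fetch, 'bid.txt');
print "$n page(s) with a matching table written to bid.txt\n";
```

Trying a small range such as 1..3 first keeps the run short; with 1..30000 and a real fetcher the script would issue 30,000 requests before it finishes.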
