Thread Regex and the n-th occurrence of a string in HTML (2 answers)
Opened by raphael_s at 2010-12-20 12:00

murphy
 2010-12-20 16:30
#143748
I would solve this with XPath. For example, like this:
Code (perl):
use 5.012;
use warnings;

use Data::Dumper;
use XML::LibXML;

my $xml = XML::LibXML->load_html(string => <<EOF);
<html>
<head>
  <title>Test</title>
</head>
<body>
  <table>
    <tr>
      <td class="title">Infotext 1</td>
      <td class="title">Infotext 3</td>
      <td class="title">Infotext 4</td>
    </tr>
    <tr>
      <td>Wert 1</td>
      <td>Wert 3</td>
      <td>Wert 4</td>
    </tr>
  </table>
</body>
</html>
EOF

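# Map each title cell's text to the text of the cell directly below it.
my %assoc;
for my $key ($xml->findnodes('//td[@class = "title"]')) {
    # 1-based column index of the title cell within its row
    my $pos =
        $key->findvalue('count(./preceding-sibling::td) + 1');
    # Cell at the same column index in the next row
    my ($val) =
        $key->findnodes("../following-sibling::tr[1]/td[position() = $pos]");

    # Skip title cells that have no cell below them
    next unless defined $val;
    $assoc{$key->textContent} = $val->textContent;
}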

print Dumper \%assoc;
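
If you only need the value that belongs to the n-th title cell (as the thread title suggests), the same idea can be wrapped in a small function. This is only a sketch: the name value_below_nth_title and the shortened test table are my own assumptions and not part of the code above. It also assumes that the parser does not insert a tbody element and that the values are in the row directly after the titles.

```perl
use 5.012;
use warnings;

use XML::LibXML;

# Hypothetical helper (illustration only): returns the text of the cell
# below the n-th <td class="title">, or nothing if no such cell exists.
sub value_below_nth_title {
    my ($doc, $n) = @_;

    # Only accept positive integers, because $n is interpolated into XPath
    return unless defined $n && $n =~ /\A[1-9][0-9]*\z/;

    # The parentheses make [$n] index the whole node set in document order
    my ($key) = $doc->findnodes(qq{(//td[\@class = "title"])[$n]})
        or return;

    # 1-based column index of the title cell within its row
    my $pos = $key->findvalue('count(./preceding-sibling::td) + 1');

    # Cell at the same column index in the next row
    my ($val) =
        $key->findnodes("../following-sibling::tr[1]/td[position() = $pos]")
        or return;

    return $val->textContent;
}

# Shortened test document (assumed, modelled on the table above)
my $doc = XML::LibXML->load_html(string => <<'EOF');
<html>
<body>
  <table>
    <tr>
      <td class="title">Infotext 1</td>
      <td class="title">Infotext 3</td>
    </tr>
    <tr>
      <td>Wert 1</td>
      <td>Wert 3</td>
    </tr>
  </table>
</body>
</html>
EOF

say value_below_nth_title($doc, 2) // '(not found)';
```

Without the parentheses, //td[@class = "title"][2] would select every title cell that is the second match among its own siblings, which only happens to be the same thing as long as all titles are in one row.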
When C++ is your hammer, every problem looks like your thumb.
