Thread use utf8 and the behavior of string operations (65 answers)
Opened by rosti at 2011-08-03 19:16

pq
 2011-08-02 20:39
#150982
As I said, use utf8 only affects string literals that are written directly in the script source.
A website, however, (also) deals with strings that come from outside.
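
To make the distinction concrete without CGI, here is a minimal sketch. It assumes the script file itself is saved as UTF-8; the variable names are only illustrative, and the "external" value is simulated with byte escapes rather than read from a real request.

```perl
use strict;
use warnings;
use utf8;                  # literals in this source file are decoded as UTF-8
use Encode qw(decode);

# A literal written in the script: with "use utf8" this is one character.
my $literal = "ä";

# Simulated outside data (e.g. a URL-unescaped query value): two raw bytes.
# "use utf8" has no influence on these.
my $external = "\xC3\xA4";

printf "literal:  length %d\n", length $literal;     # 1
printf "external: length %d\n", length $external;    # 2

# Only an explicit decode turns the bytes into the same character string.
my $decoded = decode('UTF-8', $external);
printf "decoded:  length %d, equals literal: %s\n",
    length $decoded, ($decoded eq $literal ? "yes" : "no");
```

The literal and the decoded value compare equal; the undecoded value does not, because Perl treats it as two separate characters.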

Compare:
Without decode, broken string in $x:
Code:
$ perl -wE'
use utf8;
use CGI;
use Devel::Peek;
use Encode;
my $cgi = CGI->new;

my $test = $cgi->param("test");

Dump $test;
my $x = substr($test, 0, 1);
Dump $x;' "test=%C3%A4"
SV = PVMG(0x81a6fe8) at 0x81ada78
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK)
  IV = 0
  NV = 0
  PV = 0x8309588 "\303\244"\0
  CUR = 2
  LEN = 4
SV = PV(0x82a8758) at 0x82aa8c0
  REFCNT = 1
  FLAGS = (POK,pPOK)
  PV = 0x8193620 "\303"\0
  CUR = 1
  LEN = 4


With decode, correct string in $x:
Code:
$ perl -wE'
use utf8;
use CGI;
use Devel::Peek;
use Encode;
my $cgi = CGI->new;

my $test = decode_utf8 $cgi->param("test"); # <--- decode

Dump $test;
my $x = substr($test, 0, 1);
Dump $x;' "test=%C3%A4"
SV = PV(0x8872868) at 0x8777a80
  REFCNT = 1
  FLAGS = (PADMY,POK,pPOK,UTF8)
  PV = 0x8834e80 "\303\244"\0 [UTF8 "\x{e4}"]
  CUR = 2
  LEN = 4
SV = PV(0x8872858) at 0x8841818
  REFCNT = 1
  FLAGS = (POK,pPOK,UTF8)
  PV = 0x8834e90 "\303\244"\0 [UTF8 "\x{e4}"]
  CUR = 2
  LEN = 4
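
In a real script you would probably not decode each parameter by hand. A rough sketch of decoding everything once at the input boundary could look like this. decode_params is a hypothetical helper, not part of CGI or Encode, and the input hash stands in for whatever the request object returns. Using FB_CROAK so that malformed input dies is a design choice of this sketch, not something from the examples above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: decode every incoming value exactly once.
sub decode_params {
    my (%raw) = @_;
    my %decoded;
    for my $name (keys %raw) {
        # Work on a copy: with a CHECK flag, decode may modify its source.
        my $bytes = $raw{$name};
        # FB_CROAK makes malformed UTF-8 die instead of passing silently.
        $decoded{$name} = decode('UTF-8', $bytes, Encode::FB_CROAK());
    }
    return %decoded;
}

# Same bytes as in the "test=%C3%A4" examples, after URL unescaping.
my %params = decode_params(test => "\xC3\xA4");

# substr now works on characters, so the first character is a whole "ä".
my $first = substr($params{test}, 0, 1);
printf "first character: U+%04X, length %d\n", ord($first), length $params{test};
```

After this step, all string operations in the rest of the program (substr, length, regexes) see characters rather than bytes.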

Last edited: 2011-08-02 20:40:52 +0200 (CEST)
