Unicode (ä, ö, ü in German) Problem with File::Fi

TeddyC
 2003-09-01 18:11
I'm new to Perl and can't pinpoint my problem exactly. I've searched but can't find the right solution. I hope this is the right place to post.

Env: Win2000 (en) + ActivePerl 5.8 + Komodo

The trouble begins with German special characters.
I want to copy a whole directory to another location. A prototype with the source and destination paths written directly in the program works without problems:
1) define $srcdir in the Perl program
            $dstdir in the Perl program
like this:
Code:
use strict;
use warnings;
use File::Find;

# The directory d:\perl\source\test\öa and the file d:\perl\source\test\öa\ü.txt exist under \test\
my $srcdir = "d:\\perl\\source\\test";
my $dstdir = "d:\\test2";
finddepth(\&check_and_copy, $srcdir);

# sub check_and_copy (not shown) checks the timestamp of each file/dir and copies only the updated ones to $dstdir
Then I put srcdir and dstdir in config.xml and read them with XML::Simple. The values come back in UTF-8, but nothing is copied!
2) define $srcdir in xml
            $dstdir in xml

Komodo shows that the first $_ in "finddepth" is "d:\perl\source\test\öa" (ö in hex).
Something is not right: with a depth-first search it should be "d:\perl\source\test\öa\ü.txt" (just as in 1)).

Then I ran the program under DOS, and it shows a warning: "Can't cd to <d:\perl\source\test/> öa ..." (the ö is not displayed correctly here)

--Is there something wrong with File::Find and UTF-8, or with UTF-8 under Win2k?
--Or does it depend on something else?
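
One possible explanation, offered as an assumption rather than a confirmed diagnosis: the path read from config.xml is a decoded character string, while the file system calls behind File::Find on this setup expect bytes in the Windows ANSI code page. A minimal sketch of re-encoding the path before handing it to finddepth follows. The code page cp1252 and the variable name $from_xml are assumptions for illustration, and the XML value is simulated so the snippet does not need XML::Simple.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumption: the XML parser returns a decoded character string.
# Simulated here by decoding the UTF-8 bytes of the path ending in "öa".
my $from_xml = decode('UTF-8', "d:\\perl\\source\\test\\\xC3\xB6a");

# Assumption: the file system calls on this machine expect cp1252 bytes.
my $srcdir = encode('cp1252', $from_xml);

# Show the last two bytes of the path: 0xF6 (ö in cp1252) and 0x61 (a).
print unpack('H*', substr($srcdir, -2)), "\n";   # f661

# The byte string could then be passed on unchanged, e.g.:
# finddepth(\&check_and_copy, $srcdir);
```

If the machine uses a different ANSI code page, the encoding name in the encode call would have to change accordingly.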


I ran some tests, but I don't fully understand the results.

3) define $srcdir in the Perl program
            $dstdir in xml

# The files are copied, but the characters show up in ASCII format.

4) define $srcdir in the Perl program and test
Code:
use Encode;

my $srcdir = "d:\\perl\\source\\test";
if (Encode::is_utf8($srcdir)) {
    print " src is_utf8\n";
} else {
    print " src is_NOT_utf8\n";
}
# I got "src is_NOT_utf8". I heard Perl uses UTF-8 internally, but it seems it's not that simple.
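
A small sketch of why the test above reports "is_NOT_utf8": Encode::is_utf8 checks an internal flag on the string, and a plain ASCII literal does not carry it, whereas a string produced by decoding non-ASCII UTF-8 bytes does. The variable names here are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode is_utf8);

my $literal = "d:\\perl\\source\\test";       # plain ASCII literal
my $decoded = decode('UTF-8', "\xC3\xB6a");   # "öa" decoded from UTF-8 bytes

print is_utf8($literal) ? "literal: flagged\n" : "literal: not flagged\n";
print is_utf8($decoded) ? "decoded: flagged\n" : "decoded: not flagged\n";

# The decoded string is two characters long, although it came from three bytes.
print length($decoded), "\n";
```

The flag describes how Perl stores the string internally. It does not tell you which encoding the original bytes were in.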

5) define $srcdir with "ö" in the Perl program and use utf8

Code:
use utf8;
my $srcdir = "d:\\perl\\source\\test\\öa";
# I got "Malformed UTF-8 character (unexpected non-continuation byte 0x61, immediately after start byte 0xf6)"
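
The byte values in that message hint at the cause, though this is an assumption because the file's real encoding is not shown: 0xF6 is "ö" in Latin-1/cp1252, so the source file was probably saved in that encoding while "use utf8" tells Perl to read it as UTF-8. A minimal sketch that reproduces the mismatch with Encode:

```perl
use strict;
use warnings;
use Encode qw(decode);

my $bytes = "\xF6a";   # "öa" as a Latin-1/cp1252 editor would store it

# Strict UTF-8 decoding rejects these bytes. FB_CROAK makes decode die
# on invalid input; it may also modify its argument, so work on a copy.
my $copy    = $bytes;
my $as_utf8 = eval { decode('UTF-8', $copy, Encode::FB_CROAK()) };
print defined $as_utf8 ? "valid UTF-8\n" : "not valid UTF-8\n";

# Decoding the same bytes as cp1252 gives the intended two characters.
my $as_cp1252 = decode('cp1252', $bytes);
printf "U+%04X U+%04X\n", map { ord } split //, $as_cp1252;
```

Under that assumption, either saving the source file as UTF-8 or dropping "use utf8" would make the declaration match the file.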

--What is the encoding of a Perl program in Komodo?
--How can I get the encoding name of a string?
--How can I get the right output for "ä, ö, ü" in DOS? (I get the correct characters in Komodo with "binmode (STDOUT,"utf8");")

Thanks for your attention!
snadra
 2003-09-01 18:44
Hello,

I don't think I can help you with your problem, since I am not a Windows user.
I just wanted to recommend the Perlmonks website to you. The URL is: http://www.perlmonks.org
There you can find an English-speaking board with highly professional users. That does not mean that I don't want you to post here anymore, but you might get better answers at Perlmonks, because most of the users of this site are German.
But of course you are welcome to post here as well...

Cheers
snadra
http://hamburg.pm.org
every 2nd Wednesday of the month
--
#!/usr/bin/perl -w
$l=join('',map chr,(116,110,105,114,112))if$^T;
!!$$?@_=qw(Jhfg Aabgure Prey Hnpxre):$l=1;
for(@_){eval reverse($l)."'"._(_(_($_))).' \''}
sub _{$_=~y+a-z+n-za-m+and pop}
Crian
 2003-09-01 19:50
I think German users are quite a good choice to help with the described problem. I haven't noticed any problems with File::Find and umlauts until now, but I will run a test too, because your environment is exactly like mine :)

I will post here again if I find anything to report about your problem.
s--Pevna-;s.([a-z]).chr((ord($1)-84)%26+97).gee; s^([A-Z])^chr((ord($1)-52)%26+65)^gee;print;

use strict; use warnings; Link to my Perl page
Crian
 2003-09-01 19:54
[quote=TeddyC,01.09.2003, 16:11]finddepth(/&check_and_copy, $srcdir);[/quote]
I have noticed that you are using the wrong character in front of your subroutine reference: you have to use a backslash instead of a slash ... my testing is under way.

Edit: Typo
Crian
 2003-09-01 20:00
Perhaps that was the point?

Here is my test and result:

Code:
#!/usr/bin/perl
use diagnostics;
use strict;
use warnings;

use File::Find;

# there are directory c:\temp\test\öa
# and file c:\temp\test\öa\ü.txt


my $srcdir = "c:/temp/test/öa";
my $dstdir = "c:/temp/test2";

my @Files;
finddepth(\&make_something, $srcdir);

sub make_something {
    push @Files, $File::Find::name;
}

use Data::Dumper;
print Dumper \@Files;


Code:
C:\Daten\perl\forum>filefind_und_umlaute.pl
$VAR1 = [
'c:/temp/test/÷a/õ.txt',
'c:/temp/test/÷a'
];


in a DOS box.
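
The "÷" in that output is consistent with cp1252 file name bytes being displayed by a console that uses code page 850: byte 0xF6 is "ö" in cp1252 and "÷" in cp850. A minimal sketch of re-encoding a name for display follows. Both code pages are assumptions about this particular setup (the console's active code page can be checked with chcp).

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumptions: names come back as cp1252 bytes, the console shows cp850.
my $name = "c:/temp/test/\xF6a";          # path ending in "öa" as cp1252 bytes

my $chars   = decode('cp1252', $name);    # bytes -> characters
my $console = encode('cp850', $chars);    # characters -> console bytes

print unpack('H*', substr($name, -2)), "\n";      # f661
print unpack('H*', substr($console, -2)), "\n";   # 9461
```

Printing $console instead of $name should then show "ö" in such a console, while the original bytes stay the right ones for file operations.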
TeddyC
 2003-09-01 22:15
Hello, and thanks for your attention!
I should say that I can speak German too, but I am not sure I can explain this technical problem well in it.
I think the German Perl programmers have more experience dealing with this kind of topic. :)
I tried www.perl.de first, but unfortunately...

Crian, you are right, that was my typo in the post. I have corrected it, thanks!
What you tried is exactly one part of it. I think the "ö" and "ü" are displayed normally because of Komodo, but the encoding is surely different when the program is interpreted, and it is probably not UTF-8, since with the UTF-8 string from the XML I got yet other strange characters. And you get a different order in your array.
You can read the XML with:
Code:
# Typed by hand, no guarantee that it is typo-free, but your Komodo will check it :)

use XML::Simple; # Easy API to maintain XML (esp config files)
my $config = XMLin("config.xml"); # Hash

# config.xml:
<?xml version="1.0" encoding="UTF-8" ?>
<config>
 <srcdir>yourSrcDir</srcdir>
 <dstdir>yourDstDir</dstdir>
</config>

I tried debugging with Find.pm in Komodo, but there is too much unfamiliar code in it for me. It is hard work, but it is the only way I can think of besides posting in a forum. I hope it is the right way.
Crian
 2003-09-02 15:08
What exactly is the next step after my attempts? To cd into the directories and copy the files (instead of just printing them)?
kabel
 2003-09-02 15:39
FYI: here is the Perlmonks thread.

Let's see what happens over there ;) :)
-- stefan
Crian
 2003-09-02 15:41
Well, if the question is asked here, it can just as well be answered here. It is just not quite clear to me yet what the question is.
snadra
 2003-09-02 15:49
Well, I am not sure about the question either. I thought that Crian had solved your problem, or didn't he?
What do you mean by 'debugging Komodo'?
I just upvoted your thread on Perlmonks, because I thought it was well written for a beginner. On the other hand, I thought Crian had solved it...
Please describe the remaining problem(s) more precisely.