Thread Charset problem: utf8 to HTML encoding (28 answers)
Opened by Guest at 2005-07-20 23:46

GwenDragon
 2005-07-21 14:53
#56532
perldoc HTML::Entities

NAME
   HTML::Entities - Encode or decode strings with HTML entities

SYNOPSIS
    use HTML::Entities;

    $a = "V&aring;re norske tegn b&oslash;r &#230res";
    decode_entities($a);
    encode_entities($a, "\200-\377");

   For example, this:

    $input = "vis-à-vis Beyoncé's naïve\npapier-mâché résumé";
    print encode_entities($input), "\n"

   Prints this out:

    vis-&agrave;-vis Beyonc&eacute;'s na&iuml;ve
    papier-m&acirc;ch&eacute; r&eacute;sum&eacute;

DESCRIPTION
   This module deals with encoding and decoding of strings with HTML
   character entities. The module provides the following functions:

   decode_entities( $string )
       This routine replaces HTML entities found in the $string with the
       corresponding ISO-8859-1 character and, where possible (under perl
       5.8 or later), with the corresponding Unicode character.
       Unrecognized entities are left alone.

       This routine is exported by default.
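
       As a minimal sketch of the behaviour described above (the sample
       string and the entity name "nosuchentity" are made up for
       illustration), decoding in scalar context could look like this:

```perl
use strict;
use warnings;
use HTML::Entities qw(decode_entities);

# Sample input; "&nosuchentity;" is a made-up name used only for illustration.
my $html = "caf&eacute; &amp; cr&egrave;me &nosuchentity;";

# Scalar context: a decoded copy is returned and $html is left untouched.
my $text = decode_entities($html);

# Show non-ASCII characters as escapes so the output does not depend on
# the terminal encoding.
(my $shown = $text) =~ s/([^\x00-\x7F])/sprintf("\\x{%X}", ord $1)/ge;
print "$shown\n";
```

       The known entities become single characters, while the
       unrecognized one should pass through unchanged.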

   encode_entities( $string )
   encode_entities( $string, $unsafe_chars )
       This routine replaces unsafe characters in $string with their entity
       representation. A second argument can be given to specify which
       characters to consider unsafe (i.e., which to escape). The default
       set of characters to encode are control chars, high-bit chars, and
       the "<", "&", ">", and '"' characters. But this, for example, would
       encode *just* the "<", "&", ">", and '"' characters:

         $escaped = encode_entities($input, '<>&"');

       This routine is exported by default.
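
       To contrast the default set with a restricted one, here is a
       short sketch; the input string is an assumption chosen only to
       contain one high-bit character and two markup characters:

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

my $input = "caf\xE9 <b>";   # \xE9 is e-acute in ISO-8859-1

# Default unsafe set: the high-bit character and the angle brackets are escaped.
my $default = encode_entities($input);

# Restricted set: only the markup-significant characters are escaped.
my $minimal = encode_entities($input, '<>&"');

print "$default\n";   # caf&eacute; &lt;b&gt;
```

       With the restricted set, the e-acute stays a raw character,
       which is usually what you want when the page itself is served
       as UTF-8 or ISO-8859-1.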

   encode_entities_numeric( $string )
   encode_entities_numeric( $string, $unsafe_chars )
       This routine works just like encode_entities, except that the
       replacement entities are always "&#x*hexnum*;" and never
       "&*entname*;". For example, "encode_entities("r\xF4le")" returns
       "r&ocirc;le", but "encode_entities_numeric("r\xF4le")" returns
       "r&#xF4;le".

       This routine is *not* exported by default. But you can always export
       it with "use HTML::Entities qw(encode_entities_numeric);" or even
       "use HTML::Entities qw(:DEFAULT encode_entities_numeric);"

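       The difference between the named and the numeric variant can be
       sketched as follows, reusing the "r\xF4le" example from the text
       (the variable names are arbitrary):

```perl
use strict;
use warnings;
# encode_entities_numeric must be requested explicitly; :DEFAULT keeps
# the routines that are exported by default.
use HTML::Entities qw(:DEFAULT encode_entities_numeric);

my $word = "r\xF4le";

my $named   = encode_entities($word);          # uses the entity name
my $numeric = encode_entities_numeric($word);  # uses a hex character reference

print "$named\n";     # r&ocirc;le
print "$numeric\n";   # r&#xF4;le
```
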
   All these routines modify the string passed as the first argument, if
   called in a void context. In scalar and array contexts, the encoded or
   decoded string is returned (without changing the input string).
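
   The context rule above is easy to trip over, so here is a small
   sketch of both call styles (the strings are illustrative only):

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Void context: the argument itself is rewritten.
my $in_place = "a < b";
encode_entities($in_place);

# Scalar context: the argument is left alone and an encoded copy comes back.
my $original = "a < b";
my $copy     = encode_entities($original);

print "$in_place\n";   # a &lt; b
print "$original\n";   # a < b
print "$copy\n";       # a &lt; b
```

   If a string ends up encoded twice (for example "&amp;lt;"), an
   accidental void-context call earlier in the code is one possible
   cause worth checking.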

   If you prefer not to import these routines into your namespace, you can
   call them as:

     use HTML::Entities ();
     $decoded = HTML::Entities::decode($a);
     $encoded = HTML::Entities::encode($a);
     $encoded = HTML::Entities::encode_numeric($a);

   The module can also export the %char2entity and the %entity2char hashes,
   which contain the mapping from all characters to the corresponding
   entities (and vice versa, respectively).
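
   A brief sketch of looking things up in the two hashes directly; it
   assumes both are requested by name in the import list, since they
   are not exported by default:

```perl
use strict;
use warnings;
use HTML::Entities qw(%char2entity %entity2char);

# Character to entity: values include the leading "&" and trailing ";".
my $lt_entity = $char2entity{'<'};

# Entity name to character: keys for these basic entities are bare names.
my $amp_char = $entity2char{amp};

print "$lt_entity\n";   # &lt;
print "$amp_char\n";    # &
```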

COPYRIGHT
   Copyright 1995-2003 Gisle Aas. All rights reserved.

   This library is free software; you can redistribute it and/or modify it
   under the same terms as Perl itself.
