perlcn

PERLCN(1)	       Perl Programmers Reference Guide		    PERLCN(1)



NAME
       perlcn - Perl guide in Simplified Chinese

DESCRIPTION
       Welcome to the world of Perl!

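       Starting with version 5.8.0, Perl has comprehensive Unicode support,
       and with it support for many encodings beyond those of the Latin-based
       languages; CJK (Chinese, Japanese, Korean) is among them. Unicode is an
       international standard that attempts to cover all of the world's
       characters: the Western world, the Eastern world, and everything in
       between (Greek, Syriac, Arabic, Hebrew, Indic scripts, Native American
       scripts, and so on). It also accommodates multiple operating systems
       and platforms (such as PC and Macintosh).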

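       Perl itself operates in Unicode. This means that string data inside
       Perl can be represented in Unicode, and that Perl's functions and
       operators (such as regular expression matching) can operate on Unicode
       as well. For input and output, to handle data stored in encodings that
       predate Unicode, Perl provides the Encode module, which lets you easily
       read and write data in legacy encodings.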
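
       As a minimal sketch of what this looks like in a program, the following
       uses the decode and encode functions of the Encode module. The byte
       string and the variable names are illustrative assumptions: the four
       bytes are the EUC-CN form of the two-character word "中文".

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Four EUC-CN bytes for the two characters "中文" (illustrative data).
my $euc_bytes = "\xD6\xD0\xCE\xC4";

# decode() turns legacy bytes into a Perl character string.
my $text = decode('euc-cn', $euc_bytes);

# encode() turns the character string back into bytes, here as UTF-8.
my $utf8_bytes = encode('UTF-8', $text);

# Byte length in EUC-CN, character length, byte length in UTF-8.
printf "%d %d %d\n", length($euc_bytes), length($text), length($utf8_bytes);
# prints "4 2 6"
```

       Keeping the data as decoded characters inside the program and encoding
       only at the output boundary means length, index, and regular
       expressions all count characters rather than bytes.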

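       The Encode extension module supports the following Simplified Chinese
       encodings ('gb2312' means 'euc-cn'):

	   euc-cn      Unix extended character set, commonly known as Guobiao (national standard) code
	   gb2312-raw  Raw (low-bit) GB2312 character table
	   gb12345     Raw Traditional Chinese encoding used in mainland China
	   iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
	   cp936       Code page 936; can also be specified as 'GBK' (extended Guobiao code)
	   hz	       7-bit escaped GB2312 encoding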

       For example, to convert a file encoded in EUC-CN to Unicode, simply
       type the following command:

	   perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8

       Perl also comes with "piconv", a character conversion utility written
       entirely in Perl, which is used as follows:

	   piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
	   piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn
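
       The same conversion can be sketched inside a Perl program with
       encoding layers on file handles. The helper name convert_file and its
       argument order are hypothetical, introduced only for illustration; the
       ":encoding(...)" layer itself is standard PerlIO.

```perl
use strict;
use warnings;

# Hypothetical helper: copy $in_path to $out_path, decoding from the
# encoding $from on input and encoding to $to on output.
sub convert_file {
    my ($in_path, $from, $out_path, $to) = @_;

    open my $in, "<:encoding($from)", $in_path
        or die "Cannot read $in_path: $!";
    open my $out, ">:encoding($to)", $out_path
        or die "Cannot write $out_path: $!";

    # Each line is decoded to characters when read and encoded when written.
    while (my $line = <$in>) {
        print {$out} $line;
    }

    close $out or die "Cannot close $out_path: $!";
    close $in;
    return;
}
```

       Calling convert_file('file.euc-cn', 'euc-cn', 'file.utf8', 'UTF-8')
       would then correspond to the first piconv command above.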

       In addition, with the encoding module you can easily write code that
       works in units of characters, as shown below:

	   #!/usr/bin/env perl
	   # Enable euc-cn string parsing; standard input and output are set to euc-cn
	   use encoding 'euc-cn', STDIN => 'euc-cn', STDOUT => 'euc-cn';
	   print length("骆驼");	    #  2 (double quotes denote characters)
	   print length('骆驼');	    #  4 (single quotes denote bytes)
	   print index("谆谆教诲", "蛔唤"); # -1 (does not contain this substring)
	   print index('谆谆教诲', '蛔唤'); #  1 (starts at the second byte)

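       In the last line of the example, the second byte of the first "谆"
       combines with the first byte of the second "谆" to form the EUC-CN
       character "蛔"; the second byte of the second "谆" in turn combines with
       the first byte of "教" to form "唤". Matching by characters rather than
       bytes solves this problem, which used to be common when processing
       EUC-CN text.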
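
       The false match can also be reproduced without the encoding pragma, as
       a rough sketch using explicit byte escapes and the decode function of
       the Encode module. The variable names are illustrative; the byte
       values are the EUC-CN forms of the strings in the example above.

```perl
use strict;
use warnings;
use Encode qw(decode);

# EUC-CN bytes of "谆谆教诲" and of "蛔唤".
my $haystack_bytes = "\xD7\xBB\xD7\xBB\xBD\xCC\xBB\xE5";
my $needle_bytes   = "\xBB\xD7\xBB\xBD";

# Byte-wise search: the needle lines up across character boundaries.
my $byte_pos = index($haystack_bytes, $needle_bytes);

# Character-wise search: decode both strings first, then search.
my $char_pos = index(
    decode('euc-cn', $haystack_bytes),
    decode('euc-cn', $needle_bytes),
);

print "$byte_pos $char_pos\n";   # prints "1 -1"
```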

       Additional Chinese encodings

       If you need more Chinese encodings, you can download the
       Encode::HanExtra module from CPAN (<http://www.cpan.org/>). It
       currently provides the following encoding:

	   gb18030     Extended Guobiao code, including Traditional Chinese

       In addition, the Encode::HanConvert module provides two encodings for
       converting between Simplified and Traditional Chinese:

	   big5-simp   Converts between Big5 Traditional Chinese and Unicode Simplified Chinese
	   gbk-trad    Converts between GBK Simplified Chinese and Unicode Traditional Chinese

       To convert between GBK and Big5, see the b2g.pl and g2b.pl programs
       included with that module, or use the following in your own program:

	   use Encode::HanConvert;
	   $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
	   $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5

       Further information

       Please refer to the extensive documentation that comes with Perl
       (unfortunately all written in English) to learn more about Perl and
       how to use Unicode. There are, however, plenty of external resources:

       Perl resource sites


       <http://www.perl.com/>
	   The Perl home page (maintained by O'Reilly)

       <http://www.cpan.org/>
	   Comprehensive Perl Archive Network

       <http://lists.perl.org/>
	   List of Perl mailing lists

       ѧϰ Perl µÄÍøÖ·


       <http://www.oreilly.com.cn/html/perl.html>
	   ¼òÌåÖÐÎİæµÄÅ·À³Àñ Perl Êé½å

       Perl user groups


       <http://www.pm.org/groups/asia.shtml#China>
	   List of Perl Mongers groups in China

       Unicode-related sites


       <http://www.unicode.org/>
	   The Unicode Consortium (the body that defines the Unicode standard)

       <http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>
	   UTF-8 and Unicode FAQ for Unix/Linux

SEE ALSO
       Encode, Encode::CN, encoding, perluniintro, perlunicode

AUTHORS
       Jarkko Hietaniemi <jhi@iki.fi>

       Autrijus Tang (唐宗汉) <autrijus@autrijus.org>



perl v5.8.8			  2006-01-07			    PERLCN(1)