Annotation of embedaddon/php/ext/mbstring/README_PHP3-i18n-ja, revision 1.1.1.2

1.1       misho       1: ==========================================
                      2:   README for I18N Package
                      3: ==========================================
                      4: 
                      5: o Name and location of package
                      6: 
                      7: Name:           php-3.0.18-i18n-ja-2
                      8: Location:       http://www.happysize.co.jp/techie/php-ja-jp/
                      9:                 ftp://ftp.happysize.co.jp/php-ja-jp/
                     10:                 http://php.vdomains.org/
                     11:                 ftp://ftp.vdomains.org/pub/php-ja-jp/
                     12:                 http://php.jpnnet.com/
                     13: 
                     14: Currently, this I18N version of PHP only adds Japanese support to base
                     15: PHP.  It allows you to use Japanese in scripts and to convert between
                     16: various Japanese encodings.  ASCII-only scripts continue to work when
                     17: the i18n option is enabled.  (Note: the executable is a bit larger due
                     18: to the Unicode table.)  The basic design approach is to allow other
                     19: languages to be added in the future.  Developers are encouraged to join
                     20: us!
                     21: 
                     22: For more information on Japanese encodings, please refer to the 
                     23: section "Additional Notes."
                     24: 
                     25: 
                     26: o What is this package?
                     27: 
                     28: This package allows you to handle multiple Japanese encodings (SJIS, EUC,
                     29: UTF-8, JIS) in PHP.  If you find any bugs in this package, please report
                     30: them to the appropriate mailing list.  For now, the PHP-jp mailing list 
                     31: is the best place for this.
                     32: 
                     33: PHP-jp ML       mailto:PHP-jp@sidecar.ics.es.osaka-u.ac.jp
                     34:                 http://sidecar.ics.es.osaka-u.ac.jp/php-jp/
                     35:                 (discussions are in Japanese)
                     36: 
                     37: 
                     38: o Who should use this
                     39: 
                     40: Due to lack of documentation, it's not intended for beginners.  If
                     41: something goes wrong, be prepared to fix it on your own.
                     42: 
                     43: 
                     44: o Warranty and Copyright
                     45: 
                     46: There is no warranty with this package.  Use it at your own risk.
                     47: 
                     48: Please refer to the source code for the copyrights.  In general, each
                     49: program's copyright is owned by the programmer.  Unless you obey the
                     50: copyright holders' restrictions, you are not allowed to use it in any
                     51: form.
                     52: 
                     53: 
                     54: o Redistribution
                     55: 
                     56: As described in the source code, this package and its components may
                     57: be redistributed, subject to certain restrictions.
                     58: 
                     59: Because this package is still in beta, please redistribute it as an
                     60: entire package and not in the form of a patch.  We would prefer to
                     61: have this package distributed as one single package (not a patch of
                     62: a patch of a patch), so please avoid releasing any patches to this
                     63: package.
                     64: 
                     65: 
                     66: o Who made this
                     67: 
                     68: A team of volunteers, PHP3 Internationalization, has been contributing
                     69: their free time producing it.  Although we are not related to the core
                     70: PHP programmers, we are hoping to have our modifications merged into the
                     71: core distribution in the near future.  Thus, we did not call this a
                     72: "Japanese Patch" (or distribution).  Our final goal is to have a truly
                     73: i18nized PHP!
                     74: 
                     75: If you are interested in this project, please drop us a line.
                     76: 
                     77: Contact Address:
                     78:         phpj-dev@kage.net
                     79:         (Discussions are in Japanese, but feel free to write us in English)
                     80: 
                     81: Webpage (English and Japanese):
                     82:         http://php.jpnnet.com/
                     83: 
                     84: Project Outline (Japanese):
                     85:         http://www.happysize.co.jp/techie/php-ja-jp/spec.htm
                     86: 
                     87: Developers:
                     88:         Hironori Sato <satoh@jpnnet.com>
                     89:         Shigeru Kanemoto <sgk@happysize.co.jp>
                     90:         Tsukada Takuya <tsukada@fminn.nagano.nagano.jp>
                     91:         U. Kenkichi <kenkichi@axes.co.jp>
                     92:         Tateyama  <tateyan@amy.hi-ho.ne.jp>
                     93:         Other gracious contributors
                     94: 
                     95: 
                     96: o Future plans
                     97: 
                     98: - fulfilling what's written in the project outline
                     99: - support for languages other than Japanese
                    100: - make the character conversion a library (?)
                    101: - more testing
                    102: 
                    103: 
                    104: o Special Thanks to
                    105: 
                    106: PHP Japanese webpage maintainer, Hirokawa-san
                    107:         http://www.cityfujisawa.ne.jp/%7Elouis/apps/phpfi/
                    108: PHP-JP ML's Yamamoto-san
                    109:         http://sidecar.ics.es.osaka-u.ac.jp/php-jp/
                    110: Previous jp-patch developers
                    111: 
                    112: 
                    113: 
                    114: ==========================================
                    115:   Advantages of using I18N package
                    116: ==========================================
                    117: 
                    118: - allows you to use various character encodings for script files and 
                    119:   http output
                    120: - detects the character encoding of POST/GET/COOKIE input
                    121: - proper mail output using JIS for the body and MIME/Base64/JIS for the subject
                    122: - sets the proper charset if the http output's Content-Type is text/html
                    123: - stable character encoding conversion
                    124: - multibyte regex
                    125: 
                    126: 
                    127: 
                    128: ==========================================
                    129:   Installation
                    130: ==========================================
                    131: 
                    132: o Summary
                    133: 
                    134: Add the --enable-i18n option when running configure.  Add any other
                    135: options appropriate for your own setup as well.
                    136: 
                    137: Don't forget to copy php3.ini-dist to the desired location
                    138: (e.g. /usr/local/lib/php3.ini).
                    139: 
                    140: If you have already installed PHP3, copy all the entries in php3.ini-dist
                    141: which start with "i18n.xxxx" to php3.ini.
                    142: 
                    143: 
                    144: o configure option
                    145:     --enable-i18n
                    146:       include i18n features
                    147: 
                    148:     --enable-mbregex
                    149:       include multibyte regex library
                    150:       (mbregex functions do not work unless i18n is enabled)
                    151: 
                    152: 
                    153: o creating cgi version
                    154: 
                    155:     % tar xvzf php-3.0.18-i18n-ja-2.tar.gz
                    156:     % cd php-3.0.18-i18n-ja-2
                    157:     % ./configure --enable-i18n --enable-mbregex
                    158:     % make
                    159: 
                    160: 
                    161: o creating Apache version (regular module)
                    162: 
                    163:     % tar xvzf php-3.0.18-i18n-ja-2.tar.gz
                    164:     % tar xvzf apache_1.3.x.tar.gz
                    165:     % cd apache_1.3.x
                    166:     % ./configure
                    167:     % cd ../php-3.0.18-i18n-ja-2
                    168:     % ./configure --with-apache=../apache_1.3.x --enable-i18n --enable-mbregex
                    169:     % make
                    170:     % make install
                    171:     % cd ../apache_1.3.x
                    172:     % ./configure --activate-module=src/modules/php3/libphp3.a
                    173:     % make
                    174:     % make install
                    175: 
                    176: 
                    177: o creating Apache DSO version
                    178: 
                    179:     create DSO capable Apache first
                    180:     % tar xvzf apache_1.3.x.tar.gz
                    181:     % cd apache_1.3.x
                    182:     % ./configure --enable-shared=max
                    183:     % make
                    184:     % make install
                    185: 
                    186:     now create php3
                    187:     % cd php-3.0.18-i18n-ja-2
                    188:     % ./configure --with-apxs=/usr/local/apache/bin/apxs --enable-i18n \
                    189:         --enable-mbregex
                    190:     % make
                    191:     % make install
                    192: 
                    193: 
                    194: ==========================================
                    195:   Additional Notes
                    196: ==========================================
                    197: 
                    198: o Multibyte regex library
                    199: 
                    200: Since beta4, we have included the multibyte (mb) regex library which comes with
                    201: Ruby.  With this addition, you can now use regex in EUC, SJIS and UTF-8
                    202: encoding.  To avoid any conflicts with HSREGEX included with Apache,
                    203: each function name has been changed.  Therefore, mb regex functions are
                    204: named differently from the original ereg functions in PHP.  The character
                    205: encoding used in mb regex is set by i18n.internal_encoding.
                    206: 
                    207: 
                    208: o Binary Output
                    209: 
                    210: If the http output encoding is set to anything other than 'pass', output
                    211: is converted automatically from the internal encoding to the http output
                    212: encoding.  Thus, if you output raw binary data, it may be corrupted.
                    213: In that case, set http_output to 'pass'.
                    214: 
                    215: ex.
                    216:         <?
                    217:             i18n_http_output("pass");
                    218:             ...
                    219:             echo $the_binary_data_string;
                    220:         ?>
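
To see why automatic conversion can damage binary data, here is a minimal
sketch.  Python is used only to make the bytes visible; it is not part of
this package.  The function send() and its parameters are hypothetical and
merely imitate an output stage that converts from the internal encoding
unless the output encoding is 'pass'.

```python
def send(data, internal="euc_jp", http_output="shift_jis"):
    """Imitate an output stage that transcodes unless http_output is 'pass'."""
    if http_output == "pass":
        return data  # bytes go out untouched
    text = data.decode(internal, errors="replace")
    return text.encode(http_output, errors="replace")

# Four arbitrary "binary" bytes that happen to look like EUC-JP text.
blob = b"\xc6\xfc\xcb\xdc"

converted = send(blob)                      # rewritten as Shift_JIS bytes
untouched = send(blob, http_output="pass")  # identical to the input

print(blob.hex(" "), "->", converted.hex(" "))
print(untouched == blob)
```

Any byte sequence that is meaningful in the internal encoding is rewritten
on the way out, which is harmless for text and destructive for images or
other binary payloads.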
                    221: 
                    222: 
                    223: o Content-Type
                    224: 
                    225: Depending on the setting of http_output, PHP will output the proper charset.
                    226: ex. Content-Type: text/html; charset="..."
                    227: 
                    228: Be aware of the following:
                    229: 
                    230: - If you set the Content-Type header using the header() function, that
                    231:   will override the automatic addition of charset.
                    232: - Be cautious when you call i18n_http_output(): if any output was
                    233:   made prior to the call, the header may already have been sent to the
                    234:   client.
                    235: 
                    236: 
                    237: o In the event of trouble
                    238: 
                    239: If you find any bugs or run into trouble, please contact us at the above address.
                    240: It may help us track down the problem if you send us the script as well.
                    241: 
                    242: If you encounter any memory related error such as a segmentation violation,
                    243: add --enable-debug when you run configure.  This will give you more
                    244: detailed information on where the error occurred.  The error is written
                    245: to the server log, or to the regular http output in CGI mode.
                    246: 
                    247: 
                    248: o About Japanese encodings
                    249: 
                    250: For historical reasons, there are multiple character encodings used
                    251: for Japanese.  The most common encodings are: SJIS, EUC, JIS, and UTF-8.
                    252: Here is a (very) brief description of each:
                    253: 
                    254: EUC
                    255:   commonly used in UNIX environments
                    256:   8bit-8bit combo
                    257:   multibyte bytes are always >=0x80
                    258: 
                    259: SJIS
                    260:   commonly used on Macs and PCs
                    261:   similar to EUC
                    262:   mostly 8bit-8bit (some 8bit-7bit)
                    263:   mostly >=0x80
                    264:   halfwidth katakana (size of ASCII) are single bytes >=0x80
                    265: 
                    266: JIS
                    267:   commonly used in 7bit environments (nntp and smtp)
                    268:   Japanese text starts with an escape sequence: \033 (ESC) plus a few more characters
                    269: 
                    270: UTF-8
                    271:   variable-length 8bit encoding of Unicode (one or more bytes per character)
                    272:   covers many of the world's languages
                    273:   see http://www.unicode.org/ for more detail
                    274: 
                    275: Because all these character encodings are in use, PHP needs to translate
                    276: between them on the fly.  Also, the addition of the mb regex
                    277: library allows you to handle mb strings without fear of getting an mb char
                    278: chopped in half.
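
As a concrete illustration of how much the encodings differ, the sketch
below prints the bytes of the same two characters in each one.  Python is
used only as a convenient way to show the bytes and is not part of this
package; the codec names on the right of CODECS are Python's, and
encoded_bytes() is a hypothetical helper.

```python
# Map the names used in this README to Python's codec names.
CODECS = {
    "EUC-JP": "euc_jp",
    "SJIS": "shift_jis",
    "JIS": "iso2022_jp",
    "UTF-8": "utf-8",
}

def encoded_bytes(text):
    """Return {README encoding name: raw bytes} for the given text."""
    return {name: text.encode(codec) for name, codec in CODECS.items()}

for name, raw in encoded_bytes("日本").items():
    print(f"{name:7}", raw.hex(" "))
```

Note that the SJIS form of the second character ends in 7Bh, which is "{"
in ASCII, and that the JIS form consists entirely of 7bit bytes wrapped in
escape sequences.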
                    279: 
                    280: Since Japanese is not the only language with multiple encodings, we
                    281: encourage other developers to modify our code to suit their needs.  We
                    282: definitely need people to work with Korean, Chinese (both traditional
                    283: and simplified), and Russian.  Let us know if you are interested in
                    284: this project!
                    285: 
                    286: 
                    287: 
                    288: ==========================================
                    289:   php3.ini setting
                    290: ==========================================
                    291: 
                    292: The following init options will allow you to change the default settings.
                    293: Define these settings in the global section of php3.ini.
                    294: 
                    295: All keywords are case-insensitive.
                    296: 
                    297: o Encoding naming
                    298: 
                    299:     For each encoding, there are three names: standardized, alias, MIME
                    300: 
                    301:     - UTF-8
                    302:          standard: UTF-8
                    303:          alias: N/A 
                    304:          mime: UTF-8
                    305: 
                    306:     - ASCII
                    307:          standard: ASCII
                    308:          alias: N/A
                    309:          mime: US-ASCII
                    310: 
                    311:     - Japanese EUC
                    312:          standard: EUC-JP
                    313:          alias: EUC, EUC_JP, eucJP, x-euc-jp
                    314:          mime: EUC-JP
                    315: 
                    316:     - Shift JIS
                    317:          standard: SJIS
                    318:          alias: x-sjis, MS_Kanji
                    319:          mime: Shift_JIS
                    320: 
                    321:     - JIS
                    322:          standard: JIS
                    323:          alias: N/A 
                    324:          mime: ISO-2022-JP
                    325: 
                    326:     - Quoted-Printable
                    327:          standard: Quoted-Printable
                    328:          alias: qprint
                    329:          mime: N/A
                    330: 
                    331:     - BASE64
                    332:          standard: BASE64
                    333:          alias: N/A
                    334:          mime: N/A
                    335: 
                    336:     - no conversion
                    337:          standard: pass
                    338:          alias: none
                    339:          mime: N/A
                    340: 
                    341:     - auto encoding detection
                    342:          standard: auto
                    343:          alias: unknown
                    344:          mime: N/A
                    345: 
                    346:     * N/A - Not Applicable
                    347: 
                    348: o i18n.http_output - default http output encoding
                    349: 
                    350:     i18n.http_output = EUC-JP|SJIS|JIS|UTF-8|pass
                    351:         EUC-JP : EUC
                    352:         SJIS: SJIS
                    353:         JIS : JIS
                    354:         UTF-8: UTF-8
                    355:         pass: no conversion
                    356: 
                    357:     The default is pass (internal encoding is used)
                    358:     It can be re-configured on the fly using i18n_http_output().
                    359: 
                    360: 
                    361: o i18n.internal_encoding - internal encoding
                    362: 
                    363:     i18n.internal_encoding = EUC-JP|SJIS|UTF-8
                    364:         EUC-JP : EUC
                    365:         SJIS: SJIS
                    366:         UTF-8: UTF-8
                    367: 
                    368:     The default is EUC-JP.  
                    369: 
                    370:     The PHP parser is designed around ISO-8859-1.  Other
                    371:     encodings must satisfy the following conditions in order
                    372:     to be used:
                    373:        - per byte encoding
                    374:        - single byte characters in the range 00h-7fh, compatible
                    375:          with ASCII
                    376:        - multibyte characters that do not use 00h-7fh
                    377:     In the case of Japanese, EUC-JP and UTF-8 are the only encodings that
                    378:     meet these criteria.
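
The third condition can be spot-checked on a sample string.  The sketch
below is only an illustration in Python (not part of this package):
multibyte_avoids_ascii() is a hypothetical helper, and a passing result
for one sample is evidence, not proof, that an encoding qualifies.

```python
def multibyte_avoids_ascii(text, codec):
    """True if no non-ASCII character in text encodes to a byte in 00h-7Fh."""
    for ch in text:
        if ord(ch) < 0x80:
            continue  # single byte ASCII is allowed to use 00h-7Fh
        if any(b < 0x80 for b in ch.encode(codec)):
            return False
    return True

sample = "日本語"
results = {
    codec: multibyte_avoids_ascii(sample, codec)
    for codec in ("euc_jp", "utf-8", "shift_jis", "iso2022_jp")
}
print(results)
```

SJIS fails because some second bytes fall in the ASCII range, and
ISO-2022-JP fails because it is built entirely from 7bit bytes.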
                    379: 
                    380:     If i18n.internal_encoding and i18n.http_output differ, conversion
                    381:     takes place at the time of output.  If you convert any data within
                    382:     PHP scripts to URL encoding, BASE64 or Quoted-Printable, the encoding
                    383:     stays as defined in i18n.internal_encoding.  Thus, if you would
                    384:     prefer to encode in compliance with i18n.http_output, you need
                    385:     to convert the encoding manually.
                    386: 
                    387:     ex. $str = urlencode( i18n_convert($str, i18n_http_output()) );
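
The reason the manual conversion matters is that URL encoding works on
bytes, so the same character yields different URLs in different
encodings.  A small Python illustration (not part of this package;
url_encode() is a hypothetical helper):

```python
from urllib.parse import quote

def url_encode(text, codec):
    """Percent-encode text after converting it to the given encoding."""
    return quote(text.encode(codec))

as_internal = url_encode("日", "euc_jp")    # what you get without converting
as_output = url_encode("日", "shift_jis")   # what an SJIS page would expect

print(as_internal, as_output)
```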
                    388: 
                    389:     Encodings which use escape sequences, such as ISO-2022-** and HZ,
                    390:     cannot be used as the internal encoding.  If used, they
                    391:     result in the following errors:
                    392:        - the parser pukes funky errors
                    393:        - magic_quotes_*** breaks the encoding (SJIS may have a similar problem)
                    394:        - string manipulation and regex will malfunction
                    395: 
                    396: 
                    397: o i18n.script_encoding - script encoding
                    398: 
                    399:     i18n.script_encoding = auto|EUC-JP|SJIS|JIS|UTF-8
                    400:         auto: automatic
                    401:         EUC-JP : EUC
                    402:         SJIS: SJIS
                    403:         JIS : JIS
                    404:         UTF-8: UTF-8
                    405: 
                    406:     The default is auto.
                    407:     The script's encoding is converted to i18n.internal_encoding before
                    408:     entering the script parser.
                    409: 
                    410:     Be aware that auto detection may fail under some conditions.
1.1.1.2 ! misho     411:     For best auto detection, add a multibyte character at the beginning
1.1       misho     412:     of the script.
                    413: 
                    414: 
                    415: o i18n.http_input - handling of http input (GET/POST/COOKIE)
                    416: 
                    417:     i18n.http_input = pass|auto
                    418:         auto: auto conversion
                    419:         pass: no conversion
                    420: 
                    421:     The default is auto.
                    422:     If set to pass, no conversion will take place.
                    423:     If set to auto, it will automatically detect the encoding.  If
                    424:     detection is successful, it will convert the input to the internal
                    425:     encoding.  If not, it will assume the encoding defined in
                    426:     i18n.http_input_default.
                    427: 
                    428: o i18n.http_input_default - default http input encoding
                    429: 
                    430:     i18n.http_input_default = pass|EUC-JP|SJIS|JIS|UTF-8
                    431:         pass: no conversion
                    432:         EUC-JP : EUC
                    433:         SJIS: SJIS
                    434:         JIS : JIS
                    435:         UTF-8: UTF-8
                    436: 
                    437:     The default is pass.
                    438:     This option is only effective as long as i18n.http_input is set to
                    439:     auto.  If the auto detection fails, this encoding is used as an
                    440:     assumption to convert the http input to the internal encoding.
                    441:     If set to pass, no conversion will take place.
                    442: 
                    443: o sample settings
                    444: 
                    445:     1) For most flexibility, we recommend the following settings.
                    446:          i18n.http_output = SJIS
                    447:          i18n.internal_encoding = EUC-JP
                    448:          i18n.script_encoding = auto
                    449:          i18n.http_input = auto
                    450:          i18n.http_input_default = SJIS
                    451: 
                    452:     2) To avoid unexpected encoding problems, try these:
                    453: 
                    454:          i18n.http_output = pass
                    455:          i18n.internal_encoding = EUC-JP
                    456:          i18n.script_encoding = pass
                    457:          i18n.http_input = pass
                    458:          i18n.http_input_default = pass
                    459: 
                    460: 
                    461: 
                    462: ==========================================
                    463:   PHP functions
                    464: ==========================================
                    465: 
                    466: The following describes the additional PHP functions.
                    467: 
                    468: All keywords are case-insensitive.
                    469: 
                    470: o i18n_http_output(encoding)
                    471: o encoding = i18n_http_output()
                    472: 
                    473:     This will set the http output encoding.  Any output following this
                    474:     call will be converted to this encoding.  If no argument is given,
                    475:     the current http output encoding setting is returned.
                    476: 
                    477:     encodings
                    478:         EUC-JP : EUC
                    479:         SJIS: SJIS
                    480:         JIS : JIS
                    481:         UTF-8: UTF-8
                    482:         pass: no conversion
                    483: 
                    484:     NONE is not allowed
                    485: 
                    486: 
                    487: o encoding = i18n_internal_encoding()
                    488: 
                    489:     Returns the current internal encoding as a string.
                    490: 
                    491:     internal encoding
                    492:         EUC-JP : EUC
                    493:         SJIS: SJIS
                    494:         UTF-8: UTF-8
                    495: 
                    496: 
                    497: o encoding = i18n_http_input()
                    498: 
                    499:     Returns http input encoding.
                    500: 
                    501:     encodings
                    502:         EUC-JP : EUC
                    503:         SJIS: SJIS
                    504:         JIS : JIS
                    505:         UTF-8: UTF-8
                    506:         pass: no conversion (only if i18n.http_input is set to pass)
                    507: 
                    508: 
                    509: o string = i18n_convert(string, encoding)
                    510:   string = i18n_convert(string, encoding, pre-conversion-encoding)
                    511: 
                    512:     Returns converted string in desired encoding.  If
                    513:     pre-conversion-encoding is not defined, the given 
                    514:     string is assumed to be in internal encoding.
                    515: 
                    516:     encoding
                    517:         EUC-JP : EUC
                    518:         SJIS: SJIS
                    519:         JIS : JIS
                    520:         UTF-8: UTF-8
                    521:         pass: no conversion
                    522: 
                    523:     pre-conversion-encoding
                    524:         EUC-JP : EUC
                    525:         SJIS: SJIS
                    526:         JIS : JIS
                    527:         UTF-8: UTF-8
                    528:         pass: no conversion
                    529:         auto: auto detection
                    530: 
                    531: 
                    532: o encoding = i18n_discover_encoding(string)
                    533: 
                    534:     Encoding of the given string is returned (as a string).
                    535: 
                    536:     encoding
                    537:         EUC-JP : EUC
                    538:         SJIS: SJIS
                    539:         JIS : JIS
                    540:         UTF-8: UTF-8
                    541:         ASCII: ASCII (only 09h, 0Ah, 0Dh, 20h-7Eh)
                    542:         pass: unable to determine (text is too short to determine)
                    543:         unknown: unknown or possible error
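
How the package detects encodings is not described here.  Purely as an
illustration of the idea, the Python sketch below guesses by trial
decoding; it is an assumption, not this package's algorithm, and
guess_encoding() and its candidate order are hypothetical.  It never
returns "pass", since it does not model the "text is too short" case.

```python
# Bytes this README counts as ASCII: 09h, 0Ah, 0Dh, 20h-7Eh.
ASCII_BYTES = {0x09, 0x0A, 0x0D} | set(range(0x20, 0x7F))

NAMES = {"iso2022_jp": "JIS", "utf-8": "UTF-8",
         "euc_jp": "EUC-JP", "shift_jis": "SJIS"}

def guess_encoding(data):
    """Guess the encoding of a byte string by trial decoding."""
    if all(b in ASCII_BYTES for b in data):
        return "ASCII"
    # An ESC byte suggests JIS; otherwise try the 8bit encodings in turn.
    if 0x1B in data:
        candidates = ["iso2022_jp"]
    else:
        candidates = ["utf-8", "euc_jp", "shift_jis"]
    for codec in candidates:
        try:
            data.decode(codec)
        except UnicodeDecodeError:
            continue
        return NAMES[codec]
    return "unknown"

for codec in ("euc_jp", "shift_jis", "iso2022_jp", "utf-8"):
    print(codec, "->", guess_encoding("日本".encode(codec)))
```

Trial decoding is ambiguous for short inputs, since many byte sequences
are valid in more than one encoding, which is one reason detection can
fail.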
                    544: 
                    545: 
                    546: o int = mbstrlen(string)
                    547: o int = mbstrlen(string, encoding)
                    548: 
                    549:     Returns character length of a given string.  If no encoding is defined,
                    550:     the encoding of string is assumed to be the internal encoding.
                    551: 
                    552:     encoding
                    553:         EUC-JP : EUC
                    554:         SJIS: SJIS
                    555:         JIS : JIS
                    556:         UTF-8: UTF-8
                    557:         auto: automatic
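
The difference between character length and byte length can be seen in
this small Python illustration (not part of this package; char_length()
is a hypothetical stand-in for what a character-counting function
reports):

```python
def char_length(data, codec):
    """Number of characters in the byte string `data` under `codec`."""
    return len(data.decode(codec))

euc = "日本語".encode("euc_jp")
utf8 = "日本語".encode("utf-8")

print(len(euc), char_length(euc, "euc_jp"))    # bytes vs characters
print(len(utf8), char_length(utf8, "utf-8"))
```

The same three characters occupy a different number of bytes in each
encoding, so the encoding must be known to count characters correctly.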
                    558: 
                    559: 
                    560: o int = mbstrpos(string1, string2)
                    561: o int = mbstrpos(string1, string2, start)
                    562: o int = mbstrpos(string1, string2, start, encoding)
                    563: 
                    564:     Same as strpos.  If no encoding is defined, the encoding of string
                    565:     is assumed to be the internal encoding.
                    566: 
                    567:     encoding
                    568:         EUC-JP : EUC
                    569:         SJIS: SJIS
                    570:         JIS : JIS
                    571:         UTF-8: UTF-8
                    572: 
                    573: 
                    574: o int = mbstrrpos(string1, string2)
                    575: o int = mbstrrpos(string1, string2, encoding)
                    576: 
                    577:     Same as strrpos.  If no encoding is defined, the encoding of string
                    578:     is assumed to be the internal encoding.
                    579: 
                    580:     encoding
                    581:         EUC-JP : EUC
                    582:         SJIS: SJIS
                    583:         JIS : JIS
                    584:         UTF-8: UTF-8
                    585: 
                    586: 
                    587: o string = mbsubstr(string, position)
                    588: o string = mbsubstr(string, position, length)
                    589: o string = mbsubstr(string, position, length, encoding)
                    590: 
                    591:     Same as substr.  If no encoding is defined, the encoding of string
                    592:     is assumed to be the internal encoding.
                    593: 
                    594:     encoding
                    595:         EUC-JP : EUC
                    596:         SJIS: SJIS
                    597:         JIS : JIS
                    598:         UTF-8: UTF-8
                    599: 
                    600: 
                    601: o string = mbstrcut(string, position)
                    602: o string = mbstrcut(string, position, length)
                    603: o string = mbstrcut(string, position, length, encoding)
                    604: 
                    605:     Like substr, but counts bytes.  If position is the 2nd byte of a mb
                    606:     character, the cut starts from the first byte of that character.  A
                    607:     mb character is never split.  In other words, with two-byte characters
                    608:     and length set to 5, you get only two mb characters.  If no encoding
                    609:     is given, the string is assumed to be in the internal encoding.
                    610: 
                    611:     encoding
                    612:         EUC-JP : EUC
                    613:         SJIS: SJIS
                    614:         JIS : JIS
                    615:         UTF-8: UTF-8
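
To make the contrast between cutting by characters and cutting by bytes
concrete, here is a minimal sketch.  It assumes the modern mbstring
functions mb_substr() and mb_strcut() behave like the mbsubstr() and
mbstrcut() described above, and that the source file is saved as UTF-8.
The sample text is made up.

```php
<?php
// Assumption: mb_substr()/mb_strcut() stand in for mbsubstr()/mbstrcut().
$utf8 = "日本語テキスト";                               // made-up sample text
$euc  = mb_convert_encoding($utf8, "EUC-JP", "UTF-8");  // 2 bytes per character

// Character-based: start at character 1, take 2 characters.
$by_chars = mb_substr($euc, 1, 2, "EUC-JP");

// Byte-based: at most 5 bytes, so only two whole 2-byte characters fit.
$by_bytes = mb_strcut($euc, 0, 5, "EUC-JP");

echo mb_convert_encoding($by_chars, "UTF-8", "EUC-JP"), "\n";  // 本語
echo mb_convert_encoding($by_bytes, "UTF-8", "EUC-JP"), "\n";  // 日本
echo strlen($by_bytes), "\n";                                   // 4
```

The fifth byte would be the first half of a character, so it is dropped
rather than returned as a broken byte.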
                    616: 
                    617: 
                    618: o string = i18n_mime_header_encode(string)
                    619:     MIME-encodes the string in the form =?ISO-2022-JP?B?[string]?=.
                    620: 
                    621: 
                    622: o string = i18n_mime_header_decode(string)
                    623:     Decodes a MIME-encoded string.
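
A round trip through the header encoding might look like the sketch
below.  It assumes the modern mbstring functions mb_encode_mimeheader()
and mb_decode_mimeheader() correspond to the two functions above, and
that the source file is saved as UTF-8.  The subject text is made up.

```php
<?php
// Assumption: mb_encode_mimeheader()/mb_decode_mimeheader() stand in for
// i18n_mime_header_encode()/i18n_mime_header_decode().
mb_internal_encoding("UTF-8");           // encoding of $subject below

$subject = "日本語";                      // made-up sample text
$encoded = mb_encode_mimeheader($subject, "ISO-2022-JP", "B");
$decoded = mb_decode_mimeheader($encoded);

echo $encoded, "\n";   // begins with =?ISO-2022-JP?B?
echo $decoded, "\n";   // the original text, in the internal encoding
```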
                    624: 
                    625: 
                    626: o string = i18n_ja_jp_hantozen(string)
                    627: o string = i18n_ja_jp_hantozen(string, option)
                    628: o string = i18n_ja_jp_hantozen(string, option, encoding)
                    629: 
                    630:     Converts between fullwidth and halfwidth characters.
                    631: 
                    632:     option
                    633:     The following options are allowed.  The default is "KV".
                    634:     Acronym: FW = fullwidth, HW = halfwidth
                    635: 
                    636:     "r" :  FW alphabet -> HW alphabet
                    637: 
                    638:     "R" :  HW alphabet -> FW alphabet
                    639: 
                    640:     "n" :  FW number -> HW number
                    641: 
                    642:     "N" :  HW number -> FW number
                    643: 
                    644:     "a" :  FW alpha numeric (21h-7Eh) -> HW alpha numeric
                    645: 
                    646:     "A" :  HW alpha numeric (21h-7Eh) -> FW alpha numeric
                    647: 
                    648:     "k" :  FW katakana -> HW katakana
                    649: 
                    650:     "K" :  HW katakana -> FW katakana
                    651: 
                    652:     "h" :  FW hiragana -> HW katakana
                    653: 
                    654:     "H" :  HW katakana -> FW hiragana
                    655: 
                    656:     "c" :  FW katakana -> FW hiragana
                    657: 
                    658:     "C" :  FW hiragana -> FW katakana
                    659: 
                    660:     "V" :  merge dakuon marks.  only works with the "K" and "H" options
                    661: 
                    662:     encoding
                    663:     If no encoding is defined, the encoding of string is assumed to be
                    664:     the internal encoding.
                    665:         EUC-JP : EUC
                    666:         SJIS: SJIS
                    667:         JIS : JIS
                    668:         UTF-8: UTF-8
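
The option string can be tried out with a short sketch.  It assumes the
modern mbstring function mb_convert_kana() accepts the same option
letters as i18n_ja_jp_hantozen(), and that the source file is saved as
UTF-8.  The sample strings are made up.

```php
<?php
// Assumption: mb_convert_kana() stands in for i18n_ja_jp_hantozen().
$hw_kana  = "ｶﾞｷﾞｸﾞ";     // halfwidth katakana with separate dakuon marks
$fw_alnum = "ＰＨＰ３";    // fullwidth alphanumerics

// "KV": halfwidth katakana -> fullwidth katakana, merging dakuon marks.
$kana = mb_convert_kana($hw_kana, "KV", "UTF-8");

// "a": fullwidth alphanumerics -> halfwidth alphanumerics.
$alnum = mb_convert_kana($fw_alnum, "a", "UTF-8");

echo $kana, "\n";    // ガギグ
echo $alnum, "\n";   // PHP3
```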
                    669: 
                    670: 
                    671: o int = mbereg(regex_pattern, string, string)
                    672: o int = mberegi(regex_pattern, string, string)
                    673:     mb versions of ereg() and eregi()
                    674: 
                    675: 
                    676: o string = mbereg_replace(regex_pattern, string, string)
                    677: o string = mberegi_replace(regex_pattern, string, string)
                    678:     mb versions of ereg_replace() and eregi_replace()
                    679: 
                    680: 
                    681: o string_array = mbsplit(regex, string, limit)
                    682:     mb version of split()
                    683: 
                    684: 
                    685: 
                    686: ==========================================
                    687:   FAQ
                    688: ==========================================
                    689: 
                    690: Here, we have gathered some questions commonly asked on the PHP-jp
                    691: mailing list.
                    692: 
                    693: o Using Japanese text with the GET method
                    694: 
                    695: To pass Japanese text as a GET argument, as in
                    696: xxxx.php?data=<Japanese text>, encode it with PHP's urlencode function.
                    697: Otherwise, the text may not reach the target script intact.
                    698: 
                    699: ex: <a href="hoge.php?data=<? echo urlencode($data) ?>">Link</a>
                    700: 
                    701: 
                    702: o When passing data via GET/POST/COOKIE, a \ character sneaks in
                    703: 
                    704: When the passed data contains ', " or \, PHP automatically inserts the
                    705: escape character \.  This also affects SJIS text, because the second byte
                    706: of some SJIS characters is \ (0x5c).  To avoid this, set magic_quotes_gpc
                    707: in php3.ini to Off, or remove the escapes afterwards with StripSlashes().
                    708: 
                    709: If $quote_str is in SJIS and you want to restore the original Japanese
                    710: text, use ereg_replace as follows:
                    711: 
                    712: ereg_replace(sprintf("([%c-%c%c-%c]\\\\)\\\\",0x81,0x9f,0xe0,0xfc),
                    713:        "\\1",$quote_str);
                    714: 
                    715: This removes the extra \ inserted after an SJIS character ending in \.
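
The effect of this pattern can be checked with a small sketch.  It is
written for a current PHP, so it makes several substitutions that are
assumptions rather than part of this package: preg_replace() replaces
ereg_replace(), mb_convert_encoding() produces the SJIS bytes, and
addslashes() imitates what magic_quotes_gpc does to incoming data.

```php
<?php
// Assumptions: preg_replace() and mb_convert_encoding() replace the PHP 3
// functions; addslashes() imitates the effect of magic_quotes_gpc.
$sjis = mb_convert_encoding("表", "SJIS", "UTF-8");   // bytes 0x95 0x5c

$quote_str = addslashes($sjis);                       // bytes 0x95 0x5c 0x5c

// Same pattern as above: SJIS lead byte + \ + inserted \ -> lead byte + \
$fixed = preg_replace('/([\x81-\x9f\xe0-\xfc]\\\\)\\\\/', '$1', $quote_str);

echo bin2hex($sjis), " ", bin2hex($quote_str), " ", bin2hex($fixed), "\n";
// 955c 955c5c 955c
```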
                    716: 
                    717: 
                    718: o Sometimes, encoding detection fails
                    719: 
                    720: If i18n_http_input() returns 'pass', PHP most likely failed to detect
                    721: whether the input is SJIS or EUC.  In that case, add <input type=hidden
                    722: value="some Japanese text"> to the form so that the encoding of the
                    723: incoming text can be detected properly.
                    724: 
                    725: 
                    726: 
                    727: ==========================================
                    728:   Japanese Manual
                    729: ==========================================
                    730: The manual has been translated by the "PHP Japanese Manual Project":
                    731: 
                    732: http://www.php.net/manual/ja/manual.php
                    733: 
                    734: Starting with 3.0.18-i18n-ja, doc-jp has been removed from the tarball.
                    735: 
                    736: 
                    737: ==========================================
                    738:   Change Logs
                    739: ==========================================
                    740: 
                    741: o 2000-10-28, Rui Hirokawa <hirokawa@php.net>
                    742: 
                    743: This patch is derived from php-3.0.15-i18n-ja and php-3.0.16 by Kuwamura,
                    744: applied to the original php-3.0.18.  It also includes the following fixes:
                    745: 
                    746: 1) allows you to set charset in mail().
                    747: 2) fixed mbregex definitions to avoid conflicts with system regex
                    748: 3) php3.ini-dist now uses PASS for http_output instead of SJIS
                    749: 
                    750: o 2000-11-24, Hironori Sato <satoh@yyplanet.com>
                    751: 
                    752: Applied the above patch and added detection for gdImageStringTTF in
                    753: configure.  The following setups are known to work:
                    754: 
                    755: gd-1.3-6, gd-devel-1.3-6, freetype-1.3.1-5, freetype-devel-1.3.1-5
                    756:     ImageTTFText($im,$size,$angle,$x1,$y1,$color,"/path/to/font.ttf",
                    757:         i18n_convert("日本語", "UTF-8"));
                    758:     ImageGif($im);
                    759: 
                    760: gd-1.7.3-1k1, gd-devel-1.7.3-1k1, freetype-1.3.1-5, freetype-devel-1.3.1-5
                    761:     ImageTTFText($im,$size,$angle,$x1,$y1,$color,"/path/to/font.ttf","日本語");
                    762:     ImagePng($im);
                    763:     * i18n_internal_encoding = EUC or SJIS
                    764: 
                    765: For any gd library older than 1.6.2, you need to use i18n_convert.  If
                    766: you have gd-1.5.2/3, upgrade to a version above 1.7 to use ImageTTFText
                    767: without i18n_convert.  As long as internal_encoding is set to EUC or
                    768: SJIS, ImageTTFText should work without mojibake.  Again, make sure you
                    769: call i18n_http_output("pass") before calling ImageGif, ImagePng, ImageJpeg!
                    770: 
                    771: o 2000-12-09, Rui Hirokawa <hirokawa@php.net>
                    772: 
                    773: Fixed mail(), which caused a segmentation fault when the header was null.
                    774: 
