--- embedaddon/pcre/doc/pcresyntax.3 2012/02/21 23:50:25 1.1.1.2
+++ embedaddon/pcre/doc/pcresyntax.3 2013/07/22 08:25:56 1.1.1.4
@@ -1,4 +1,4 @@
-.TH PCRESYNTAX 3
+.TH PCRESYNTAX 3 "26 April 2013" "PCRE 8.33"
 .SH NAME
 PCRE - Perl-compatible regular expressions
 .SH "PCRE REGULAR EXPRESSION SYNTAX SUMMARY"
@@ -25,7 +25,7 @@ documentation. This document contains a quick-referenc
   \ea        alarm, that is, the BEL character (hex 07)
   \ecx       "control-x", where x is any ASCII character
   \ee        escape (hex 1B)
-  \ef        formfeed (hex 0C)
+  \ef        form feed (hex 0C)
   \en        newline (hex 0A)
   \er        carriage return (hex 0D)
   \et        tab (hex 09)
@@ -42,19 +42,19 @@ documentation. This document contains a quick-referenc
   \eC        one data unit, even in UTF mode (best avoided)
   \ed        a decimal digit
   \eD        a character that is not a decimal digit
-  \eh        a horizontal whitespace character
-  \eH        a character that is not a horizontal whitespace character
+  \eh        a horizontal white space character
+  \eH        a character that is not a horizontal white space character
   \eN        a character that is not a newline
   \ep{\fIxx\fP}   a character with the \fIxx\fP property
   \eP{\fIxx\fP}   a character without the \fIxx\fP property
   \eR        a newline sequence
-  \es        a whitespace character
-  \eS        a character that is not a whitespace character
-  \ev        a vertical whitespace character
-  \eV        a character that is not a vertical whitespace character
+  \es        a white space character
+  \eS        a character that is not a white space character
+  \ev        a vertical white space character
+  \eV        a character that is not a vertical white space character
   \ew        a "word" character
   \eW        a "non-word" character
-  \eX        an extended Unicode sequence
+  \eX        a Unicode extended grapheme cluster
 .sp
 In PCRE, by default, \ed, \eD, \es, \eS, \ew, and \eW recognize only ASCII
 characters, even in a UTF mode. However, this can be changed by setting the
@@ -116,6 +116,8 @@ PCRE_UCP option.
   Xan        Alphanumeric: union of properties L and N
   Xps        POSIX space: property Z or tab, NL, VT, FF, CR
   Xsp        Perl space: property Z or tab, NL, FF, CR
+  Xuc        Univerally-named character: one that can be
+             represented by a Universal Character Name
   Xwd        Perl word: property Xan or underscore
 .
 .
@@ -127,13 +129,16 @@ Armenian,
 Avestan,
 Balinese,
 Bamum,
+Batak,
 Bengali,
 Bopomofo,
+Brahmi,
 Braille,
 Buginese,
 Buhid,
 Canadian_Aboriginal,
 Carian,
+Chakma,
 Cham,
 Cherokee,
 Common,
@@ -176,7 +181,11 @@ Lisu,
 Lycian,
 Lydian,
 Malayalam,
+Mandaic,
 Meetei_Mayek,
+Meroitic_Cursive,
+Meroitic_Hieroglyphs,
+Miao,
 Mongolian,
 Myanmar,
 New_Tai_Lue,
@@ -195,8 +204,10 @@ Rejang,
 Runic,
 Samaritan,
 Saurashtra,
+Sharada,
 Shavian,
 Sinhala,
+Sora_Sompeng,
 Sundanese,
 Syloti_Nagri,
 Syriac,
@@ -205,6 +216,7 @@ Tagbanwa,
 Tai_Le,
 Tai_Tham,
 Tai_Viet,
+Takri,
 Tamil,
 Telugu,
 Thaana,
@@ -235,7 +247,7 @@ Yi.
   lower       lower case letter
   print       printing, including space
   punct       printing, excluding alphanumeric
-  space       whitespace
+  space       white space
   upper       upper case letter
   word        same as \ew
   xdigit      hexadecimal digit
@@ -335,9 +347,13 @@ but some of them use Unicode properties if PCRE_UCP is
 The following are recognized only at the start of a pattern or after one of the
 newline-setting options with similar syntax:
 .sp
+  (*LIMIT_MATCH=d) set the match limit to d (decimal number)
+  (*LIMIT_RECURSION=d) set the recursion limit to d (decimal number)
   (*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE)
   (*UTF8)         set UTF-8 mode: 8-bit library (PCRE_UTF8)
   (*UTF16)        set UTF-16 mode: 16-bit library (PCRE_UTF16)
+  (*UTF32)        set UTF-32 mode: 32-bit library (PCRE_UTF32)
+  (*UTF)          set appropriate UTF mode for the library in use
   (*UCP)          set PCRE_UCP (use Unicode properties for \ed etc)
 .
 .
@@ -432,7 +448,7 @@ pattern is not anchored.
 .rs
 .sp
 These are recognized only at the very start of the pattern or after a
-(*BSR_...), (*UTF8), (*UTF16) or (*UCP) option.
+(*BSR_...), (*UTF8), (*UTF16), (*UTF32) or (*UCP) option.
 .sp
   (*CR)        carriage return only
   (*LF)        linefeed only
@@ -479,6 +495,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 10 January 2012
-Copyright (c) 1997-2012 University of Cambridge.
+Last updated: 26 April 2013
+Copyright (c) 1997-2013 University of Cambridge.
 .fi