version 1.1.1.3, 2012/10/09 09:19:17
|
version 1.1.1.4, 2013/07/22 08:25:56
|
Line 1
|
Line 1
|
.TH PCRESYNTAX 3 "10 January 2012" "PCRE 8.30" | .TH PCRESYNTAX 3 "26 April 2013" "PCRE 8.33" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
.SH "PCRE REGULAR EXPRESSION SYNTAX SUMMARY" |
.SH "PCRE REGULAR EXPRESSION SYNTAX SUMMARY" |
Line 54 documentation. This document contains a quick-referenc
|
Line 54 documentation. This document contains a quick-referenc
|
\eV a character that is not a vertical white space character |
\eV a character that is not a vertical white space character |
\ew a "word" character |
\ew a "word" character |
\eW a "non-word" character |
\eW a "non-word" character |
\eX an extended Unicode sequence | \eX a Unicode extended grapheme cluster |
.sp |
.sp |
In PCRE, by default, \ed, \eD, \es, \eS, \ew, and \eW recognize only ASCII |
In PCRE, by default, \ed, \eD, \es, \eS, \ew, and \eW recognize only ASCII |
characters, even in a UTF mode. However, this can be changed by setting the |
characters, even in a UTF mode. However, this can be changed by setting the |
Line 116 PCRE_UCP option.
|
Line 116 PCRE_UCP option.
|
Xan Alphanumeric: union of properties L and N |
Xan Alphanumeric: union of properties L and N |
Xps POSIX space: property Z or tab, NL, VT, FF, CR |
Xps POSIX space: property Z or tab, NL, VT, FF, CR |
Xsp Perl space: property Z or tab, NL, FF, CR |
Xsp Perl space: property Z or tab, NL, FF, CR |
|
Xuc Universally-named character: one that can be |
|
represented by a Universal Character Name |
Xwd Perl word: property Xan or underscore |
Xwd Perl word: property Xan or underscore |
. |
. |
. |
. |
Line 345 but some of them use Unicode properties if PCRE_UCP is
|
Line 347 but some of them use Unicode properties if PCRE_UCP is
|
The following are recognized only at the start of a pattern or after one of the |
The following are recognized only at the start of a pattern or after one of the |
newline-setting options with similar syntax: |
newline-setting options with similar syntax: |
.sp |
.sp |
|
(*LIMIT_MATCH=d) set the match limit to d (decimal number) |
|
(*LIMIT_RECURSION=d) set the recursion limit to d (decimal number) |
(*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE) |
(*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE) |
(*UTF8) set UTF-8 mode: 8-bit library (PCRE_UTF8) |
(*UTF8) set UTF-8 mode: 8-bit library (PCRE_UTF8) |
(*UTF16) set UTF-16 mode: 16-bit library (PCRE_UTF16) |
(*UTF16) set UTF-16 mode: 16-bit library (PCRE_UTF16) |
|
(*UTF32) set UTF-32 mode: 32-bit library (PCRE_UTF32) |
|
(*UTF) set appropriate UTF mode for the library in use |
(*UCP) set PCRE_UCP (use Unicode properties for \ed etc) |
(*UCP) set PCRE_UCP (use Unicode properties for \ed etc) |
. |
. |
. |
. |
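The hunk above lists option settings that are recognized only at the start of a pattern, with (*LIMIT_MATCH=d), (*LIMIT_RECURSION=d), (*UTF32) and (*UTF) new in revision 1.1.1.4. As a rough illustration of that "leading items only" rule, here is a minimal Python sketch. The function name split_leading_options and the table KNOWN_OPTIONS are hypothetical and are not part of PCRE. The sketch covers only the names shown in this diff, ignores the newline-setting options that may be mixed in, and does not reproduce PCRE's actual parser.

```python
import re

# Hypothetical helper, not part of PCRE. It only mimics the documented rule
# that these settings are recognized at the very start of a pattern.
_LEADING_OPTION = re.compile(r"\(\*([A-Z][A-Z0-9_]*)(?:=(\d+))?\)")

# Names taken from the 1.1.1.4 side of the diff; the flag records whether
# the setting takes a decimal argument ("=d").
KNOWN_OPTIONS = {
    "LIMIT_MATCH": True,
    "LIMIT_RECURSION": True,
    "NO_START_OPT": False,
    "UTF8": False,
    "UTF16": False,
    "UTF32": False,
    "UTF": False,
    "UCP": False,
}


def split_leading_options(pattern):
    """Return (options, rest): options is a list of (name, value) pairs."""
    options = []
    pos = 0
    while True:
        m = _LEADING_OPTION.match(pattern, pos)
        if m is None:
            break
        name, raw_value = m.group(1), m.group(2)
        takes_value = KNOWN_OPTIONS.get(name)
        # Stop at unknown names, or when "=d" is missing or unexpected.
        if takes_value is None or takes_value != (raw_value is not None):
            break
        options.append((name, int(raw_value) if raw_value is not None else None))
        pos = m.end()
    return options, pattern[pos:]
```

For example, split_leading_options("(*UTF8)(*LIMIT_MATCH=1000)\\w+") yields the two settings and leaves "\\w+" as the remaining pattern, while an item that appears after ordinary pattern text, as in "a(*UCP)", is left untouched.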
Line 442 pattern is not anchored.
|
Line 448 pattern is not anchored.
|
.rs |
.rs |
.sp |
.sp |
These are recognized only at the very start of the pattern or after a |
These are recognized only at the very start of the pattern or after a |
(*BSR_...), (*UTF8), (*UTF16) or (*UCP) option. | (*BSR_...), (*UTF8), (*UTF16), (*UTF32) or (*UCP) option. |
.sp |
.sp |
(*CR) carriage return only |
(*CR) carriage return only |
(*LF) linefeed only |
(*LF) linefeed only |
Line 489 Cambridge CB2 3QH, England.
|
Line 495 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 10 January 2012 | Last updated: 26 April 2013 |
Copyright (c) 1997-2012 University of Cambridge. | Copyright (c) 1997-2013 University of Cambridge. |
.fi |
.fi |