--- embedaddon/pcre/doc/html/pcresyntax.html 2012/02/21 23:05:52 1.1
+++ embedaddon/pcre/doc/html/pcresyntax.html 2013/07/22 08:25:57 1.1.1.4
@@ -46,8 +46,7 @@ man page, in case the conversion went wrong.
 The full syntax and semantics of the regular expressions that are supported by
 PCRE are described in the pcrepattern
-documentation. This document contains just a quick-reference summary of the
-syntax.
+documentation. This document contains a quick-reference summary of the syntax.
@@ -62,7 +61,7 @@ syntax.
   \a         alarm, that is, the BEL character (hex 07)
   \cx        "control-x", where x is any ASCII character
   \e         escape (hex 1B)
-  \f         formfeed (hex 0C)
+  \f         form feed (hex 0C)
   \n         newline (hex 0A)
   \r         carriage return (hex 0D)
   \t         tab (hex 09)
@@ -76,25 +75,25 @@ syntax.
   .          any character except newline;
                in dotall mode, any character whatsoever
-  \C         one byte, even in UTF-8 mode (best avoided)
+  \C         one data unit, even in UTF mode (best avoided)
   \d         a decimal digit
   \D         a character that is not a decimal digit
-  \h         a horizontal whitespace character
-  \H         a character that is not a horizontal whitespace character
+  \h         a horizontal white space character
+  \H         a character that is not a horizontal white space character
   \N         a character that is not a newline
   \p{xx}     a character with the xx property
   \P{xx}     a character without the xx property
   \R         a newline sequence
-  \s         a whitespace character
-  \S         a character that is not a whitespace character
-  \v         a vertical whitespace character
-  \V         a character that is not a vertical whitespace character
+  \s         a white space character
+  \S         a character that is not a white space character
+  \v         a vertical white space character
+  \V         a character that is not a vertical white space character
   \w         a "word" character
   \W         a "non-word" character
-  \X         an extended Unicode sequence
+  \X         a Unicode extended grapheme cluster
 In PCRE, by default, \d, \D, \s, \S, \w, and \W recognize only ASCII
-characters, even in UTF-8 mode. However, this can be changed by setting the
+characters, even in a UTF mode. However, this can be changed by setting the
 PCRE_UCP option.
+  (*LIMIT_MATCH=d)      set the match limit to d (decimal number)
+  (*LIMIT_RECURSION=d)  set the recursion limit to d (decimal number)
   (*NO_START_OPT)       no start-match optimization (PCRE_NO_START_OPTIMIZE)
-  (*UTF8)               set UTF-8 mode (PCRE_UTF8)
+  (*UTF8)               set UTF-8 mode: 8-bit library (PCRE_UTF8)
+  (*UTF16)              set UTF-16 mode: 16-bit library (PCRE_UTF16)
+  (*UTF32)              set UTF-32 mode: 32-bit library (PCRE_UTF32)
+  (*UTF)                set appropriate UTF mode for the library in use
   (*UCP)                set PCRE_UCP (use Unicode properties for \d etc)
@@ -439,6 +455,7 @@ The following act immediately they are reached:
   (*ACCEPT)       force successful match
   (*FAIL)         force backtrack; synonym (*F)
+  (*MARK:NAME)    set name to be passed back; synonym (*:NAME)
 The following act only when a subsequent match failure causes a backtrack to
 reach them. They all force a match failure, but they differ in what happens
@@ -447,14 +464,18 @@ pattern is not anchored.
   (*COMMIT)       overall failure, no advance of starting point
   (*PRUNE)        advance to next starting character
-  (*SKIP)         advance start to current matching position
+  (*PRUNE:NAME)   equivalent to (*MARK:NAME)(*PRUNE)
+  (*SKIP)         advance to current matching position
+  (*SKIP:NAME)    advance to position corresponding to an earlier
+                  (*MARK:NAME); if not found, the (*SKIP) is ignored
   (*THEN)         local failure, backtrack to next alternation
+  (*THEN:NAME)    equivalent to (*MARK:NAME)(*THEN)
 These are recognized only at the very start of the pattern or after a
-(*BSR_...) or (*UTF8) or (*UCP) option.
+(*BSR_...), (*UTF8), (*UTF16), (*UTF32) or (*UCP) option.
   (*CR)           carriage return only
   (*LF)           linefeed only
@@ -466,7 +487,7 @@ These are recognized only at the very start of the pat
WHAT \R MATCHES
 These are recognized only at the very start of the pattern or after a
-(*...) option that sets the newline convention or UTF-8 or UCP mode.
+(*...) option that sets the newline convention or a UTF or UCP mode.
   (*BSR_ANYCRLF)  CR, LF, or CRLF
   (*BSR_UNICODE)  any Unicode newline sequence
@@ -495,9 +516,9 @@ Cambridge CB2 3QH, England.
REVISION
-Last updated: 21 November 2010
+Last updated: 26 April 2013
-Copyright © 1997-2010 University of Cambridge.
+Copyright © 1997-2013 University of Cambridge.
Return to the PCRE index page.