--- embedaddon/pcre/doc/pcresyntax.3 2012/02/21 23:05:52 1.1.1.1
+++ embedaddon/pcre/doc/pcresyntax.3 2012/02/21 23:50:25 1.1.1.2
@@ -9,8 +9,7 @@ PCRE are described in the
 .\" HREF
 \fBpcrepattern\fP
 .\"
-documentation. This document contains just a quick-reference summary of the
-syntax.
+documentation. This document contains a quick-reference summary of the syntax.
 .
 .
 .SH "QUOTING"
@@ -40,7 +39,7 @@ syntax.
 .sp
   .          any character except newline;
                in dotall mode, any character whatsoever
-  \eC         one byte, even in UTF-8 mode (best avoided)
+  \eC         one data unit, even in UTF mode (best avoided)
   \ed         a decimal digit
   \eD         a character that is not a decimal digit
   \eh         a horizontal whitespace character
@@ -58,7 +57,7 @@ syntax.
   \eX         an extended Unicode sequence
 .sp
 In PCRE, by default, \ed, \eD, \es, \eS, \ew, and \eW recognize only ASCII
-characters, even in UTF-8 mode. However, this can be changed by setting the
+characters, even in a UTF mode. However, this can be changed by setting the
 PCRE_UCP option.
 .
 .
@@ -337,7 +336,8 @@ The following are recognized only at the start of a pa
 newline-setting options with similar syntax:
 .sp
   (*NO_START_OPT) no start-match optimization (PCRE_NO_START_OPTIMIZE)
-  (*UTF8)         set UTF-8 mode (PCRE_UTF8)
+  (*UTF8)         set UTF-8 mode: 8-bit library (PCRE_UTF8)
+  (*UTF16)        set UTF-16 mode: 16-bit library (PCRE_UTF16)
   (*UCP)          set PCRE_UCP (use Unicode properties for \ed etc)
 .
 .
@@ -411,6 +411,7 @@ The following act immediately they are reached:
 .sp
   (*ACCEPT)       force successful match
   (*FAIL)         force backtrack; synonym (*F)
+  (*MARK:NAME)    set name to be passed back; synonym (*:NAME)
 .sp
 The following act only when a subsequent match failure causes a backtrack to
 reach them. They all force a match failure, but they differ in what happens
@@ -419,15 +420,19 @@ pattern is not anchored.
 .sp
   (*COMMIT)       overall failure, no advance of starting point
   (*PRUNE)        advance to next starting character
-  (*SKIP)         advance start to current matching position
+  (*PRUNE:NAME)   equivalent to (*MARK:NAME)(*PRUNE)
+  (*SKIP)         advance to current matching position
+  (*SKIP:NAME)    advance to position corresponding to an earlier
+                  (*MARK:NAME); if not found, the (*SKIP) is ignored
   (*THEN)         local failure, backtrack to next alternation
+  (*THEN:NAME)    equivalent to (*MARK:NAME)(*THEN)
 .
 .
 .SH "NEWLINE CONVENTIONS"
 .rs
 .sp
 These are recognized only at the very start of the pattern or after a
-(*BSR_...) or (*UTF8) or (*UCP) option.
+(*BSR_...), (*UTF8), (*UTF16) or (*UCP) option.
 .sp
   (*CR)           carriage return only
   (*LF)           linefeed only
@@ -440,7 +445,7 @@ These are recognized only at the very start of the pat
 .rs
 .sp
 These are recognized only at the very start of the pattern or after a
-(*...) option that sets the newline convention or UTF-8 or UCP mode.
+(*...) option that sets the newline convention or a UTF or UCP mode.
 .sp
   (*BSR_ANYCRLF)  CR, LF, or CRLF
   (*BSR_UNICODE)  any Unicode newline sequence
@@ -474,6 +479,6 @@ Cambridge CB2 3QH, England.
 .rs
 .sp
 .nf
-Last updated: 21 November 2010
-Copyright (c) 1997-2010 University of Cambridge.
+Last updated: 10 January 2012
+Copyright (c) 1997-2012 University of Cambridge.
 .fi
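
Note on the \eC change: the entry now reads "one data unit" rather than "one byte" because the unit size depends on the library, 8 bits for the 8-bit library and 16 bits for the 16-bit library. The sketch below is only an illustration of that difference. It assumes Python's built-in codecs as a stand-in and makes no PCRE call, and the helper name data_units is hypothetical. It counts how many units one character occupies in each encoding, which suggests why \eC may match only part of a character in a UTF mode.

```python
# Illustration only: counts encoding units with Python's codecs, not with PCRE.
# The function name data_units is a hypothetical helper, not part of any library.

def data_units(ch: str) -> dict:
    """Return how many 8-bit and 16-bit units are needed to encode ch."""
    return {
        # Units as seen by an 8-bit library working on UTF-8 input.
        "utf8_units": len(ch.encode("utf-8")),
        # Units as seen by a 16-bit library working on UTF-16 input
        # (little-endian, no byte-order mark, two bytes per unit).
        "utf16_units": len(ch.encode("utf-16-le")) // 2,
    }


if __name__ == "__main__":
    # ASCII letter, Latin-1 letter, BMP symbol, and a character outside the BMP.
    for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}", data_units(ch))
```

Any character that needs more than one unit in a given encoding is one that a single-unit match could split in that library's UTF mode.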