version 1.1.1.2, 2012/02/21 23:50:25
|
version 1.1.1.4, 2014/06/15 19:46:05
|
Line 26 man page, in case the conversion went wrong.
|
Line 26 man page, in case the conversion went wrong.
|
<P> |
<P> |
This document describes the two different algorithms that are available in PCRE |
This document describes the two different algorithms that are available in PCRE |
for matching a compiled regular expression against a given subject string. The |
for matching a compiled regular expression against a given subject string. The |
"standard" algorithm is the one provided by the <b>pcre_exec()</b> and | "standard" algorithm is the one provided by the <b>pcre_exec()</b>, |
<b>pcre16_exec()</b> functions. These work in the same way as Perl's matching | <b>pcre16_exec()</b> and <b>pcre32_exec()</b> functions. These work in the same |
function, and provide a Perl-compatible matching operation. The just-in-time | way as Perl's matching function, and provide a Perl-compatible matching operation. |
(JIT) optimization that is described in the | The just-in-time (JIT) optimization that is described in the |
<a href="pcrejit.html"><b>pcrejit</b></a> |
<a href="pcrejit.html"><b>pcrejit</b></a> |
documentation is compatible with these functions. |
documentation is compatible with these functions. |
</P> |
</P> |
<P> |
<P> |
An alternative algorithm is provided by the <b>pcre_dfa_exec()</b> and | An alternative algorithm is provided by the <b>pcre_dfa_exec()</b>, |
<b>pcre16_dfa_exec()</b> functions; they operate in a different way, and are not | <b>pcre16_dfa_exec()</b> and <b>pcre32_dfa_exec()</b> functions; they operate in |
Perl-compatible. This alternative has advantages and disadvantages compared | a different way, and are not Perl-compatible. This alternative has advantages |
with the standard algorithm, and these are described below. | and disadvantages compared with the standard algorithm, and these are described |
| below. |
</P> |
</P> |
<P> |
<P> |
When there is only one possible way in which a given subject string can match a |
When there is only one possible way in which a given subject string can match a |
Line 125 character of the subject. The algorithm does not autom
|
Line 126 character of the subject. The algorithm does not autom
|
matches that start at later positions. |
matches that start at later positions. |
</P> |
</P> |
<P> |
<P> |
|
PCRE's "auto-possessification" optimization usually applies to character |
|
repeats at the end of a pattern (as well as internally). For example, the |
|
pattern "a\d+" is compiled as if it were "a\d++" because there is no point |
|
even considering the possibility of backtracking into the repeated digits. For |
|
DFA matching, this means that only one possible match is found. If you really |
|
do want multiple matches in such cases, either use an ungreedy repeat |
|
("a\d+?") or set the PCRE_NO_AUTO_POSSESS option when compiling. |
|
</P> |
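<P>
The effect of auto-possessification on the set of reported matches can be
illustrated with a small toy model. The sketch below does not call PCRE; the
function name <b>toy_match_lengths()</b> and its parameters are hypothetical,
and it handles only the single pattern "a\d+" anchored at the start of the
subject. It lists the match lengths that a matcher reporting all matches at one
starting point would find, longest first, with and without possessive
treatment of the digit repeat.
</P>

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/*
 * Toy model (not PCRE code): report every length at which the pattern
 * a\d+ matches, anchored at the start of `subject`.
 *
 * possessive == 0 models a\d+  : "a" plus any non-empty prefix of the
 *                                digit run is a match.
 * possessive != 0 models a\d++ : only "a" plus the whole digit run is
 *                                a match.
 *
 * Match lengths are written to `lengths`, longest first. The return
 * value is the number of lengths stored (never more than `max`).
 */
size_t toy_match_lengths(const char *subject, int possessive,
                         size_t *lengths, size_t max)
{
    size_t end = 1;     /* index just past the text consumed so far */
    size_t count = 0;
    size_t i;

    assert(subject != NULL && lengths != NULL);

    if (subject[0] != 'a')
        return 0;

    /* Consume the digit run; `end` stops just past the last digit. */
    while (isdigit((unsigned char)subject[end]))
        end++;

    if (end == 1)
        return 0;       /* no digit follows the 'a', so no match */

    if (possessive) {
        /* The repeat never gives digits back: one match only. */
        if (max > 0)
            lengths[count++] = end;
        return count;
    }

    /* Every shorter digit run is also a match; report longest first. */
    for (i = end; i >= 2 && count < max; i--)
        lengths[count++] = i;

    return count;
}
```

<P>
For the subject "a123", the non-possessive model stores the lengths 4, 3 and 2
(corresponding to "a123", "a12" and "a1"), whereas the possessive model stores
only 4. This mirrors the difference between compiling "a\d+" with and without
PCRE_NO_AUTO_POSSESS as described above, under the stated simplifications.
</P>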
|
<P> |
There are a number of features of PCRE regular expressions that are not |
There are a number of features of PCRE regular expressions that are not |
supported by the alternative matching algorithm. They are as follows: |
supported by the alternative matching algorithm. They are as follows: |
</P> |
</P> |
Line 167 always 1, and the value of the <i>capture_last</i> fie
|
Line 177 always 1, and the value of the <i>capture_last</i> fie
|
</P> |
</P> |
<P> |
<P> |
7. The \C escape sequence, which (in the standard algorithm) always matches a |
7. The \C escape sequence, which (in the standard algorithm) always matches a |
single data unit, even in UTF-8 or UTF-16 modes, is not supported in these | single data unit, even in UTF-8, UTF-16 or UTF-32 modes, is not supported in |
modes, because the alternative algorithm moves through the subject string one | these modes, because the alternative algorithm moves through the subject string |
character (not data unit) at a time, for all active paths through the tree. | one character (not data unit) at a time, for all active paths through the tree. |
</P> |
</P> |
<P> |
<P> |
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not |
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not |
Line 223 Cambridge CB2 3QH, England.
|
Line 233 Cambridge CB2 3QH, England.
|
</P> |
</P> |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
<P> |
<P> |
Last updated: 08 January 2012 | Last updated: 12 November 2013 |
<br> |
<br> |
Copyright © 1997-2012 University of Cambridge. |
Copyright © 1997-2012 University of Cambridge. |
<br> |
<br> |