version 1.1.1.1, 2012/02/21 23:05:52
|
version 1.1.1.3, 2013/07/22 08:25:57
|
Line 26 man page, in case the conversion went wrong.
|
Line 26 man page, in case the conversion went wrong.
|
<P> |
<P> |
This document describes the two different algorithms that are available in PCRE |
This document describes the two different algorithms that are available in PCRE |
for matching a compiled regular expression against a given subject string. The |
for matching a compiled regular expression against a given subject string. The |
"standard" algorithm is the one provided by the <b>pcre_exec()</b> function. | "standard" algorithm is the one provided by the <b>pcre_exec()</b>, |
This works in the same way as Perl's matching function, and provides a | <b>pcre16_exec()</b> and <b>pcre32_exec()</b> functions. These work in the same |
Perl-compatible matching operation. | way as Perl's matching function, and provide a Perl-compatible matching operation. |
| The just-in-time (JIT) optimization that is described in the |
| <a href="pcrejit.html"><b>pcrejit</b></a> |
| documentation is compatible with these functions. |
</P> |
</P> |
<P> |
<P> |
An alternative algorithm is provided by the <b>pcre_dfa_exec()</b> function; | An alternative algorithm is provided by the <b>pcre_dfa_exec()</b>, |
this operates in a different way, and is not Perl-compatible. It has advantages | <b>pcre16_dfa_exec()</b> and <b>pcre32_dfa_exec()</b> functions; they operate in |
| a different way, and are not Perl-compatible. This alternative has advantages |
and disadvantages compared with the standard algorithm, and these are described |
and disadvantages compared with the standard algorithm, and these are described |
below. |
below. |
</P> |
</P> |
Line 163 and not on others), is not supported. It causes an err
|
Line 167 and not on others), is not supported. It causes an err
|
always 1, and the value of the <i>capture_last</i> field is always -1. |
always 1, and the value of the <i>capture_last</i> field is always -1. |
</P> |
</P> |
<P> |
<P> |
7. The \C escape sequence, which (in the standard algorithm) matches a single | 7. The \C escape sequence, which (in the standard algorithm) always matches a |
byte, even in UTF-8 mode, is not supported in UTF-8 mode, because the | single data unit, even in UTF-8, UTF-16 or UTF-32 modes, is not supported in |
alternative algorithm moves through the subject string one character at a time, | these modes, because the alternative algorithm moves through the subject string |
for all active paths through the tree. | one character (not data unit) at a time, for all active paths through the tree. |
</P> |
</P> |
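The distinction between a character and a data unit in item 7 can be illustrated outside PCRE. The following is a minimal sketch in Python, not PCRE code; the helper name `data_units` is hypothetical. It counts how many data units one character occupies in each encoding, which is why an escape that consumes exactly one data unit cannot be reconciled with an algorithm that advances one whole character at a time.

```python
def data_units(text):
    """Return the number of data units `text` occupies in each UTF encoding.

    Hypothetical helper for illustration only; it is not part of PCRE.
    A data unit is 1 byte in UTF-8, 2 bytes in UTF-16 and 4 bytes in UTF-32.
    """
    return {
        "utf-8": len(text.encode("utf-8")),
        # The "-le" codecs are used so that no byte order mark is prepended.
        "utf-16": len(text.encode("utf-16-le")) // 2,
        "utf-32": len(text.encode("utf-32-le")) // 4,
    }


# One character outside the Basic Multilingual Plane (U+1F600).
single_character = "\U0001F600"
units = data_units(single_character)
```

For this single character the UTF-8 and UTF-16 counts are greater than one, so stepping by data unit would stop in the middle of the character in those modes.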
<P> |
<P> |
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not |
8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not |
Line 184 callouts.
|
Line 188 callouts.
|
</P> |
</P> |
<P> |
<P> |
2. Because the alternative algorithm scans the subject string just once, and |
2. Because the alternative algorithm scans the subject string just once, and |
never needs to backtrack, it is possible to pass very long subject strings to | never needs to backtrack (except for lookbehinds), it is possible to pass very |
the matching function in several pieces, checking for partial matching each | long subject strings to the matching function in several pieces, checking for |
time. Although it is possible to do multi-segment matching using the standard | partial matching each time. Although it is possible to do multi-segment |
algorithm (<b>pcre_exec()</b>), by retaining partially matched substrings, it is | matching using the standard algorithm by retaining partially matched |
more complicated. The | substrings, it is more complicated. The |
<a href="pcrepartial.html"><b>pcrepartial</b></a> |
<a href="pcrepartial.html"><b>pcrepartial</b></a> |
documentation gives details of partial matching and discusses multi-segment |
documentation gives details of partial matching and discusses multi-segment |
matching. |
matching. |
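The idea of passing a long subject in several pieces can be sketched with a toy automaton. This is a conceptual illustration in Python under stated assumptions, not the PCRE API: the transition table is a hypothetical hand-built automaton for the anchored pattern `ab+c`, and the names `feed` and `match_segments` are invented for the example. Because the automaton never backtracks, the only thing that has to survive between segments is its current state.

```python
# Hypothetical transition table for the anchored pattern ab+c.
# Keys are (state, character); missing keys mean the match has failed.
TRANSITIONS = {
    (0, "a"): 1,
    (1, "b"): 2,
    (2, "b"): 2,
    (2, "c"): 3,
}
START = 0
ACCEPT = 3
DEAD = -1


def feed(state, segment):
    """Advance the automaton over one segment and return the new state."""
    for ch in segment:
        if state in (ACCEPT, DEAD):
            break  # outcome already decided; ignore the rest of the segment
        state = TRANSITIONS.get((state, ch), DEAD)
    return state


def match_segments(segments):
    """Match a subject supplied in pieces, carrying only the state across."""
    state = START
    for segment in segments:
        state = feed(state, segment)
        if state == ACCEPT:
            return "match"
        if state == DEAD:
            return "no match"
    # Input ran out while the pattern could still be completed.
    return "partial"
```

Calling `match_segments(["ab", "bb", "c"])` reports a match even though no single segment contains the whole subject, while `match_segments(["ab", "b"])` reports a partial match that a further segment could complete.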
Line 220 Cambridge CB2 3QH, England.
|
Line 224 Cambridge CB2 3QH, England.
|
</P> |
</P> |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
<br><a name="SEC8" href="#TOC1">REVISION</a><br> |
<P> |
<P> |
Last updated: 19 November 2011 | Last updated: 08 January 2012 |
<br> |
<br> |
Copyright © 1997-2010 University of Cambridge. | Copyright © 1997-2012 University of Cambridge. |
<br> |
<br> |
<p> |
<p> |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |