--- embedaddon/pcre/doc/html/pcrematching.html 2012/02/21 23:05:52 1.1
+++ embedaddon/pcre/doc/html/pcrematching.html 2013/07/22 08:25:57 1.1.1.3
@@ -26,13 +26,17 @@ man page, in case the conversion went wrong.

This document describes the two different algorithms that are available in PCRE for matching a compiled regular expression against a given subject string. The
-"standard" algorithm is the one provided by the pcre_exec() function.
-This works in the same was as Perl's matching function, and provides a
-Perl-compatible matching operation.
+"standard" algorithm is the one provided by the pcre_exec(),
+pcre16_exec() and pcre32_exec() functions. These work in the same
+as as Perl's matching function, and provide a Perl-compatible matching operation.
+The just-in-time (JIT) optimization that is described in the
+pcrejit
+documentation is compatible with these functions.

-An alternative algorithm is provided by the pcre_dfa_exec() function;
-this operates in a different way, and is not Perl-compatible. It has advantages
+An alternative algorithm is provided by the pcre_dfa_exec(),
+pcre16_dfa_exec() and pcre32_dfa_exec() functions; they operate in
+a different way, and are not Perl-compatible. This alternative has advantages
and disadvantages compared with the standard algorithm, and these are described below.

@@ -163,10 +167,10 @@ and not on others), is not supported. It causes an err
always 1, and the value of the capture_last field is always -1.

-7. The \C escape sequence, which (in the standard algorithm) matches a single
-byte, even in UTF-8 mode, is not supported in UTF-8 mode, because the
-alternative algorithm moves through the subject string one character at a time,
-for all active paths through the tree.
+7. The \C escape sequence, which (in the standard algorithm) always matches a
+single data unit, even in UTF-8, UTF-16 or UTF-32 modes, is not supported in
+these modes, because the alternative algorithm moves through the subject string
+one character (not data unit) at a time, for all active paths through the tree.

8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not
@@ -184,11 +188,11 @@ callouts.

2. Because the alternative algorithm scans the subject string just once, and
-never needs to backtrack, it is possible to pass very long subject strings to
-the matching function in several pieces, checking for partial matching each
-time. Although it is possible to do multi-segment matching using the standard
-algorithm (pcre_exec()), by retaining partially matched substrings, it is
-more complicated. The
+never needs to backtrack (except for lookbehinds), it is possible to pass very
+long subject strings to the matching function in several pieces, checking for
+partial matching each time. Although it is possible to do multi-segment
+matching using the standard algorithm by retaining partially matched
+substrings, it is more complicated. The
pcrepartial documentation gives details of partial matching and discusses multi-segment matching.
@@ -220,9 +224,9 @@ Cambridge CB2 3QH, England.


REVISION

-Last updated: 19 November 2011
+Last updated: 08 January 2012
-Copyright © 1997-2010 University of Cambridge.
+Copyright © 1997-2012 University of Cambridge.

Return to the PCRE index page.
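
The reworded item 7 in the hunk at line 163 turns on the difference between a character and a data unit. The sketch below is only an illustration of that distinction, not PCRE code: it uses Python's built-in codecs to count how many 8-bit, 16-bit and 32-bit units one character occupies, and data_units is a hypothetical helper name introduced here.

```python
def data_units(ch: str) -> dict:
    """Count the code units one character needs in each UTF encoding form.

    Hypothetical helper for illustration only; it is not part of PCRE.
    """
    return {
        "utf-8": len(ch.encode("utf-8")),            # 8-bit data units
        "utf-16": len(ch.encode("utf-16-le")) // 2,  # 16-bit data units
        "utf-32": len(ch.encode("utf-32-le")) // 4,  # 32-bit data units
    }


if __name__ == "__main__":
    # An ASCII letter, a Latin-1 letter, the euro sign, and a character
    # outside the Basic Multilingual Plane.
    for ch in ("a", "\u00e9", "\u20ac", "\U0001F600"):
        print(f"U+{ord(ch):04X}", data_units(ch))
```

A character that spans several data units in UTF-8 or UTF-16 is still one step for an algorithm that advances character by character, which is why a single-data-unit escape such as \C has no consistent meaning there.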