Diff for /embedaddon/pcre/doc/pcrematching.3 between versions 1.1.1.1 and 1.1.1.4

version 1.1.1.1, 2012/02/21 23:05:52 version 1.1.1.4, 2013/07/22 08:25:56
Line 1 Line 1
.TH PCREMATCHING 3  .TH PCREMATCHING 3 "08 January 2012" "PCRE 8.30"
 .SH NAME  .SH NAME
 PCRE - Perl-compatible regular expressions  PCRE - Perl-compatible regular expressions
 .SH "PCRE MATCHING ALGORITHMS"  .SH "PCRE MATCHING ALGORITHMS"
Line 6  PCRE - Perl-compatible regular expressions Line 6  PCRE - Perl-compatible regular expressions
 .sp  .sp
 This document describes the two different algorithms that are available in PCRE  This document describes the two different algorithms that are available in PCRE
 for matching a compiled regular expression against a given subject string. The  for matching a compiled regular expression against a given subject string. The
"standard" algorithm is the one provided by the \fBpcre_exec()\fP function.  "standard" algorithm is the one provided by the \fBpcre_exec()\fP,
This works in the same was as Perl's matching function, and provides a  \fBpcre16_exec()\fP and \fBpcre32_exec()\fP functions. These work in the same
Perl-compatible matching operation.  as as Perl's matching function, and provide a Perl-compatible matching operation.
 The just-in-time (JIT) optimization that is described in the
 .\" HREF
 \fBpcrejit\fP
 .\"
 documentation is compatible with these functions.
 .P  .P
An alternative algorithm is provided by the \fBpcre_dfa_exec()\fP function;  An alternative algorithm is provided by the \fBpcre_dfa_exec()\fP,
this operates in a different way, and is not Perl-compatible. It has advantages  \fBpcre16_dfa_exec()\fP and \fBpcre32_dfa_exec()\fP functions; they operate in
 a different way, and are not Perl-compatible. This alternative has advantages
 and disadvantages compared with the standard algorithm, and these are described  and disadvantages compared with the standard algorithm, and these are described
 below.  below.
 .P  .P
Line 28  is matched against the string Line 34  is matched against the string
 there are three possible answers. The standard algorithm finds only one of  there are three possible answers. The standard algorithm finds only one of
 them, whereas the alternative algorithm finds all three.  them, whereas the alternative algorithm finds all three.
 .  .
   .
 .SH "REGULAR EXPRESSIONS AS TREES"  .SH "REGULAR EXPRESSIONS AS TREES"
 .rs  .rs
 .sp  .sp
Line 38  string (from a given starting point) can be thought of Line 45  string (from a given starting point) can be thought of
 There are two ways to search a tree: depth-first and breadth-first, and these  There are two ways to search a tree: depth-first and breadth-first, and these
 correspond to the two matching algorithms provided by PCRE.  correspond to the two matching algorithms provided by PCRE.
 .  .
   .
 .SH "THE STANDARD MATCHING ALGORITHM"  .SH "THE STANDARD MATCHING ALGORITHM"
 .rs  .rs
 .sp  .sp
Line 63  straightforward for this algorithm to keep track of th Line 71  straightforward for this algorithm to keep track of th
 matched by portions of the pattern in parentheses. This provides support for  matched by portions of the pattern in parentheses. This provides support for
 capturing parentheses and back references.  capturing parentheses and back references.
 .  .
   .
 .SH "THE ALTERNATIVE MATCHING ALGORITHM"  .SH "THE ALTERNATIVE MATCHING ALGORITHM"
 .rs  .rs
 .sp  .sp
Line 131  and not on others), is not supported. It causes an err Line 140  and not on others), is not supported. It causes an err
 6. Callouts are supported, but the value of the \fIcapture_top\fP field is  6. Callouts are supported, but the value of the \fIcapture_top\fP field is
 always 1, and the value of the \fIcapture_last\fP field is always -1.  always 1, and the value of the \fIcapture_last\fP field is always -1.
 .P  .P
7. The \eC escape sequence, which (in the standard algorithm) matches a single  7. The \eC escape sequence, which (in the standard algorithm) always matches a
byte, even in UTF-8 mode, is not supported in UTF-8 mode, because the  single data unit, even in UTF-8, UTF-16 or UTF-32 modes, is not supported in
alternative algorithm moves through the subject string one character at a time,  these modes, because the alternative algorithm moves through the subject string
for all active paths through the tree.  one character (not data unit) at a time, for all active paths through the tree.
 .P  .P
 8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not  8. Except for (*FAIL), the backtracking control verbs such as (*PRUNE) are not
 supported. (*FAIL) is supported, and behaves like a failing negative assertion.  supported. (*FAIL) is supported, and behaves like a failing negative assertion.
 .  .
   .
 .SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM"  .SH "ADVANTAGES OF THE ALTERNATIVE ALGORITHM"
 .rs  .rs
 .sp  .sp
Line 150  match using the standard algorithm, you have to do klu Line 160  match using the standard algorithm, you have to do klu
 callouts.  callouts.
 .P  .P
 2. Because the alternative algorithm scans the subject string just once, and  2. Because the alternative algorithm scans the subject string just once, and
never needs to backtrack, it is possible to pass very long subject strings to  never needs to backtrack (except for lookbehinds), it is possible to pass very
the matching function in several pieces, checking for partial matching each  long subject strings to the matching function in several pieces, checking for
time. Although it is possible to do multi-segment matching using the standard  partial matching each time. Although it is possible to do multi-segment
algorithm (\fBpcre_exec()\fP), by retaining partially matched substrings, it is  matching using the standard algorithm by retaining partially matched
more complicated. The  substrings, it is more complicated. The
 .\" HREF  .\" HREF
 \fBpcrepartial\fP  \fBpcrepartial\fP
 .\"  .\"
Line 191  Cambridge CB2 3QH, England. Line 201  Cambridge CB2 3QH, England.
 .rs  .rs
 .sp  .sp
 .nf  .nf
Last updated: 19 November 2011  Last updated: 08 January 2012
Copyright (c) 1997-2010 University of Cambridge.  Copyright (c) 1997-2012 University of Cambridge.
 .fi  .fi
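
Note on the "three possible answers" passage in the hunk at line 28/34: the difference between the two algorithms can be illustrated with a small sketch. This is not PCRE code. It uses Python's backtracking re module as a stand-in for the standard algorithm and a brute-force scan over end offsets as a stand-in for what the alternative algorithm reports. The pattern and subject are recalled from the complete man page (this diff elides them), so treat them, and the helper names first_match and all_matches_at_start, as assumptions made for illustration.

```python
import re

# Assumed example from the full pcrematching page; not visible in this diff.
PATTERN = r"^<.*>"
SUBJECT = "<something> <something else> <something further>"


def first_match(pattern, subject):
    """Stand-in for the standard (depth-first) algorithm.

    A backtracking matcher stops at the first complete match it finds,
    so only one answer is reported.
    """
    m = re.compile(pattern).match(subject)
    return m.group(0) if m else None


def all_matches_at_start(pattern, subject):
    """Stand-in for the answers of the alternative (breadth-first) algorithm.

    Tries every end offset, longest first, and keeps each prefix that the
    pattern matches in full. This brute-force loop only reproduces the set
    of answers; it does not implement a single-pass DFA-style scan.
    """
    compiled = re.compile(pattern)
    found = []
    for end in range(len(subject), 0, -1):
        if compiled.fullmatch(subject, 0, end):
            found.append(subject[:end])
    return found


if __name__ == "__main__":
    print(first_match(PATTERN, SUBJECT))
    for answer in all_matches_at_start(PATTERN, SUBJECT):
        print(answer)
```

With a greedy quantifier the single answer from first_match is the longest one; an ungreedy quantifier would pick the shortest instead, but the matcher would still stop after one answer.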
