version 1.1.1.4, 2013/07/22 08:25:56
|
version 1.1.1.5, 2014/06/15 19:46:05
|
Line 1
|
Line 1
|
.TH PCRECALLOUT 3 "03 March 2013" "PCRE 8.33" | .TH PCRECALLOUT 3 "12 November 2013" "PCRE 8.34" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
.SH SYNOPSIS |
.SH SYNOPSIS |
Line 55 The
|
Line 55 The
|
.\" HREF |
.\" HREF |
\fBpcretest\fP |
\fBpcretest\fP |
.\" |
.\" |
command has an option that sets automatic callouts; when it is used, the output | program has a pattern qualifier (/C) that sets automatic callouts; when it is |
indicates how the pattern is matched. This is useful information when you are | used, the output indicates how the pattern is being matched. This is useful |
trying to optimize the performance of a particular pattern. | information when you are trying to optimize the performance of a particular |
| pattern. |
. |
. |
. |
. |
.SH "MISSING CALLOUTS" |
.SH "MISSING CALLOUTS" |
.rs |
.rs |
.sp |
.sp |
You should be aware that, because of optimizations in the way PCRE matches | You should be aware that, because of optimizations in the way PCRE compiles and |
patterns by default, callouts sometimes do not happen. For example, if the | matches patterns, callouts sometimes do not happen exactly as you might expect. |
pattern is | .P |
| At compile time, PCRE "auto-possessifies" repeated items when it knows that |
| what follows cannot be part of the repeat. For example, a+[bc] is compiled as |
| if it were a++[bc]. The \fBpcretest\fP output when this pattern is anchored and |
| then applied with automatic callouts to the string "aaaa" is: |
.sp |
.sp |
|
--->aaaa |
|
+0 ^ ^ |
|
+1 ^ a+ |
|
+3 ^ ^ [bc] |
|
No match |
|
.sp |
|
This indicates that when matching [bc] fails, there is no backtracking into a+ |
|
and therefore the callouts that would be taken for the backtracks do not occur. |
|
You can disable the auto-possessify feature by passing PCRE_NO_AUTO_POSSESS |
|
to \fBpcre_compile()\fP, or starting the pattern with (*NO_AUTO_POSSESS). If |
|
this is done in \fBpcretest\fP (using the /O qualifier), the output changes to |
|
this: |
|
.sp |
|
--->aaaa |
|
+0 ^ ^ |
|
+1 ^ a+ |
|
+3 ^ ^ [bc] |
|
+3 ^ ^ [bc] |
|
+3 ^ ^ [bc] |
|
+3 ^^ [bc] |
|
No match |
|
.sp |
|
This time, when matching [bc] fails, the matcher backtracks into a+ and tries |
|
again, repeatedly, until a+ itself fails. |
|
.P |
|
Other optimizations that provide fast "no match" results also affect callouts. |
|
For example, if the pattern is |
|
.sp |
ab(?C4)cd |
ab(?C4)cd |
.sp |
.sp |
PCRE knows that any matching string must contain the letter "d". If the subject |
PCRE knows that any matching string must contain the letter "d". If the subject |
Line 89 callouts such as the example above are obeyed.
|
Line 122 callouts such as the example above are obeyed.
|
.rs |
.rs |
.sp |
.sp |
During matching, when PCRE reaches a callout point, the external function |
During matching, when PCRE reaches a callout point, the external function |
defined by \fIpcre_callout\fP or \fIpcre[16|32]_callout\fP is called | defined by \fIpcre_callout\fP or \fIpcre[16|32]_callout\fP is called (if it is |
(if it is set). This applies to both normal and DFA matching. The only | set). This applies to both normal and DFA matching. The only argument to the |
argument to the callout function is a pointer to a \fBpcre_callout\fP | callout function is a pointer to a \fBpcre_callout\fP or |
or \fBpcre[16|32]_callout\fP block. | \fBpcre[16|32]_callout\fP block. These structures contains the following |
These structures contains the following fields: | fields: |
.sp |
.sp |
int \fIversion\fP; |
int \fIversion\fP; |
int \fIcallout_number\fP; |
int \fIcallout_number\fP; |
Line 217 Cambridge CB2 3QH, England.
|
Line 250 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 03 March 2013 | Last updated: 12 November 2013 |
Copyright (c) 1997-2013 University of Cambridge. |
Copyright (c) 1997-2013 University of Cambridge. |
.fi |
.fi |
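
The two \fBpcretest\fP listings added to MISSING CALLOUTS in 1.1.1.5 differ only in how many times the [bc] item (the "+3" lines) is tried: once when a+ is auto-possessified, four times when backtracking into a+ is allowed. The sketch below is a toy model of that behaviour for the anchored pattern a+[bc]. It is not PCRE code and does not call the PCRE library; the names count_class_attempts and report are hypothetical and exist only for this illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Toy model, not PCRE code: match the anchored pattern a+[bc] against
 * subject and count how often the [bc] item is tried. Each try stands
 * for one "+3" callout line in the pcretest listings.
 *
 * possessive != 0 models a++[bc] (auto-possessified): a+ never gives
 * back characters. possessive == 0 models plain a+[bc] with backtracking.
 */
int count_class_attempts(const char *subject, int possessive, int *matched)
{
    size_t run = 0;   /* length of the leading run of 'a' */
    size_t len;       /* number of 'a' characters a+ currently holds */
    int attempts = 0;

    *matched = 0;
    while (subject[run] == 'a')
        run++;
    if (run == 0)
        return 0;     /* a+ itself fails, so [bc] is never reached */

    for (len = run; len >= 1; len--) {
        char next = subject[len];   /* character that [bc] must match */
        attempts++;
        if (next == 'b' || next == 'c') {
            *matched = 1;
            break;
        }
        if (possessive)
            break;    /* no backtracking into a+ */
    }
    return attempts;
}

/* Print the attempt counts for both modes on one subject string. */
void report(const char *subject)
{
    int matched;
    int n;

    n = count_class_attempts(subject, 1, &matched);
    printf("possessive:   %d attempt(s), %s\n", n, matched ? "match" : "no match");
    n = count_class_attempts(subject, 0, &matched);
    printf("backtracking: %d attempt(s), %s\n", n, matched ? "match" : "no match");
}
```

Calling report("aaaa") from a main() gives 1 attempt in the possessive mode and 4 in the backtracking mode, both ending in "no match", which lines up with the one "+3" line versus four "+3" lines in the listings. The match result is the same in both modes because [bc] can never match an "a" that a+ gives back; that is why the optimization is safe and why only the callout count changes.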