|
version 1.1.1.3, 2013/07/22 08:25:57
|
version 1.1.1.4, 2014/06/15 19:46:05
|
|
Line 77 independent groups).
|
Line 77 independent groups).
|
| Automatic callouts can be used for tracking the progress of pattern matching. |
Automatic callouts can be used for tracking the progress of pattern matching. |
| The |
The |
| <a href="pcretest.html"><b>pcretest</b></a> |
<a href="pcretest.html"><b>pcretest</b></a> |
| command has an option that sets automatic callouts; when it is used, the output | program has a pattern qualifier (/C) that sets automatic callouts; when it is |
| indicates how the pattern is matched. This is useful information when you are | used, the output indicates how the pattern is being matched. This is useful |
| trying to optimize the performance of a particular pattern. | information when you are trying to optimize the performance of a particular |
| | pattern. |
| </P> |
</P> |
| <br><a name="SEC3" href="#TOC1">MISSING CALLOUTS</a><br> |
<br><a name="SEC3" href="#TOC1">MISSING CALLOUTS</a><br> |
| <P> |
<P> |
| You should be aware that, because of optimizations in the way PCRE matches | You should be aware that, because of optimizations in the way PCRE compiles and |
| patterns by default, callouts sometimes do not happen. For example, if the | matches patterns, callouts sometimes do not happen exactly as you might expect. |
| pattern is | </P> |
| | <P> |
| | At compile time, PCRE "auto-possessifies" repeated items when it knows that |
| | what follows cannot be part of the repeat. For example, a+[bc] is compiled as |
| | if it were a++[bc]. The <b>pcretest</b> output when this pattern is anchored and |
| | then applied with automatic callouts to the string "aaaa" is: |
| <pre> |
<pre> |
| |
--->aaaa |
| |
+0 ^ ^ |
| |
+1 ^ a+ |
| |
+3 ^ ^ [bc] |
| |
No match |
| |
</pre> |
| |
This indicates that when matching [bc] fails, there is no backtracking into a+ |
| |
and therefore the callouts that would be taken for the backtracks do not occur. |
| |
You can disable the auto-possessify feature by passing PCRE_NO_AUTO_POSSESS |
| |
to <b>pcre_compile()</b>, or starting the pattern with (*NO_AUTO_POSSESS). If |
| |
this is done in <b>pcretest</b> (using the /O qualifier), the output changes to |
| |
this: |
| |
<pre> |
| |
--->aaaa |
| |
+0 ^ ^ |
| |
+1 ^ a+ |
| |
+3 ^ ^ [bc] |
| |
+3 ^ ^ [bc] |
| |
+3 ^ ^ [bc] |
| |
+3 ^^ [bc] |
| |
No match |
| |
</pre> |
| |
This time, when matching [bc] fails, the matcher backtracks into a+ and tries |
| |
again, repeatedly, until a+ itself fails. |
| |
</P> |
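The difference between the two traces above can be reproduced with a small stand-alone sketch in C. It does not call PCRE. It is a hand-written model of the anchored pattern a+[bc], and the function name <b>count_class_attempts</b> and its <i>possessive</i> flag are hypothetical, introduced only for illustration. The return value counts how often the [bc] item is tried, which corresponds to the number of "+3" callout lines in the <b>pcretest</b> output.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model, not part of PCRE: true if ch is in the class [bc]. */
static int is_b_or_c(char ch)
{
    return ch == 'b' || ch == 'c';
}

/* Hypothetical model of the anchored pattern a+[bc] applied to subject.
   Returns the number of times the [bc] item is tried. If possessive is
   non-zero, a+ behaves like a++ and is never backtracked into. */
int count_class_attempts(const char *subject, int possessive)
{
    assert(subject != NULL);

    size_t len = strlen(subject);
    size_t n = 0;        /* number of 'a' characters currently held by a+ */
    int attempts = 0;    /* number of times [bc] has been tried */

    /* Greedy a+: take as many 'a' characters as possible. */
    while (n < len && subject[n] == 'a')
        n++;
    if (n == 0)
        return 0;        /* a+ failed, so [bc] is never reached */

    for (;;) {
        attempts++;      /* one "+3" callout in the traces above */
        if (n < len && is_b_or_c(subject[n]))
            return attempts;     /* overall match */
        if (possessive || n == 1)
            return attempts;     /* no backtracking possible: no match */
        n--;             /* backtrack: a+ gives up one character */
    }
}
```

Under these assumptions, count_class_attempts("aaaa", 1) returns 1 and count_class_attempts("aaaa", 0) returns 4, matching the one "+3" line in the first trace and the four "+3" lines in the second.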
| |
<P> |
| |
Other optimizations that provide fast "no match" results also affect callouts. |
| |
For example, if the pattern is |
| |
<pre> |
| ab(?C4)cd |
ab(?C4)cd |
| </pre> |
</pre> |
| PCRE knows that any matching string must contain the letter "d". If the subject |
PCRE knows that any matching string must contain the letter "d". If the subject |
|
Line 109 callouts such as the example above are obeyed.
|
Line 144 callouts such as the example above are obeyed.
|
| <br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br> |
<br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br> |
| <P> |
<P> |
| During matching, when PCRE reaches a callout point, the external function |
During matching, when PCRE reaches a callout point, the external function |
| defined by <i>pcre_callout</i> or <i>pcre[16|32]_callout</i> is called | defined by <i>pcre_callout</i> or <i>pcre[16|32]_callout</i> is called (if it is |
| (if it is set). This applies to both normal and DFA matching. The only | set). This applies to both normal and DFA matching. The only argument to the |
| argument to the callout function is a pointer to a <b>pcre_callout</b> | callout function is a pointer to a <b>pcre_callout</b> or |
| or <b>pcre[16|32]_callout</b> block. | <b>pcre[16|32]_callout</b> block. These structures contain the following |
| These structures contain the following fields: | fields: |
| <pre> |
<pre> |
| int <i>version</i>; |
int <i>version</i>; |
| int <i>callout_number</i>; |
int <i>callout_number</i>; |
|
Line 242 Cambridge CB2 3QH, England.
|
Line 277 Cambridge CB2 3QH, England.
|
| </P> |
</P> |
| <br><a name="SEC7" href="#TOC1">REVISION</a><br> |
<br><a name="SEC7" href="#TOC1">REVISION</a><br> |
| <P> |
<P> |
| Last updated: 03 March 2013 | Last updated: 12 November 2013 |
| <br> |
<br> |
| Copyright © 1997-2013 University of Cambridge. |
Copyright © 1997-2013 University of Cambridge. |
| <br> |
<br> |