version 1.1.1.3, 2012/10/09 09:19:18
|
version 1.1.1.4, 2013/07/22 08:25:57
|
Line 14 man page, in case the conversion went wrong.
|
Line 14 man page, in case the conversion went wrong.
|
<br> |
<br> |
<ul> |
<ul> |
<li><a name="TOC1" href="#SEC1">SYNOPSIS</a> |
<li><a name="TOC1" href="#SEC1">SYNOPSIS</a> |
<li><a name="TOC2" href="#SEC2">PCRE's 8-BIT and 16-BIT LIBRARIES</a> | <li><a name="TOC2" href="#SEC2">INPUT DATA FORMAT</a> |
<li><a name="TOC3" href="#SEC3">COMMAND LINE OPTIONS</a> | <li><a name="TOC3" href="#SEC3">PCRE's 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a> |
<li><a name="TOC4" href="#SEC4">DESCRIPTION</a> | <li><a name="TOC4" href="#SEC4">COMMAND LINE OPTIONS</a> |
<li><a name="TOC5" href="#SEC5">PATTERN MODIFIERS</a> | <li><a name="TOC5" href="#SEC5">DESCRIPTION</a> |
<li><a name="TOC6" href="#SEC6">DATA LINES</a> | <li><a name="TOC6" href="#SEC6">PATTERN MODIFIERS</a> |
<li><a name="TOC7" href="#SEC7">THE ALTERNATIVE MATCHING FUNCTION</a> | <li><a name="TOC7" href="#SEC7">DATA LINES</a> |
<li><a name="TOC8" href="#SEC8">DEFAULT OUTPUT FROM PCRETEST</a> | <li><a name="TOC8" href="#SEC8">THE ALTERNATIVE MATCHING FUNCTION</a> |
<li><a name="TOC9" href="#SEC9">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a> | <li><a name="TOC9" href="#SEC9">DEFAULT OUTPUT FROM PCRETEST</a> |
<li><a name="TOC10" href="#SEC10">RESTARTING AFTER A PARTIAL MATCH</a> | <li><a name="TOC10" href="#SEC10">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a> |
<li><a name="TOC11" href="#SEC11">CALLOUTS</a> | <li><a name="TOC11" href="#SEC11">RESTARTING AFTER A PARTIAL MATCH</a> |
<li><a name="TOC12" href="#SEC12">NON-PRINTING CHARACTERS</a> | <li><a name="TOC12" href="#SEC12">CALLOUTS</a> |
<li><a name="TOC13" href="#SEC13">SAVING AND RELOADING COMPILED PATTERNS</a> | <li><a name="TOC13" href="#SEC13">NON-PRINTING CHARACTERS</a> |
<li><a name="TOC14" href="#SEC14">SEE ALSO</a> | <li><a name="TOC14" href="#SEC14">SAVING AND RELOADING COMPILED PATTERNS</a> |
<li><a name="TOC15" href="#SEC15">AUTHOR</a> | <li><a name="TOC15" href="#SEC15">SEE ALSO</a> |
<li><a name="TOC16" href="#SEC16">REVISION</a> | <li><a name="TOC16" href="#SEC16">AUTHOR</a> |
| <li><a name="TOC17" href="#SEC17">REVISION</a> |
</ul> |
</ul> |
<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br> |
<br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br> |
<P> |
<P> |
Line 43 details of the regular expressions themselves, see the
|
Line 44 details of the regular expressions themselves, see the
|
documentation. For details of the PCRE library function calls and their |
documentation. For details of the PCRE library function calls and their |
options, see the |
options, see the |
<a href="pcreapi.html"><b>pcreapi</b></a> |
<a href="pcreapi.html"><b>pcreapi</b></a> |
and | , |
<a href="pcre16.html"><b>pcre16</b></a> |
<a href="pcre16.html"><b>pcre16</b></a> |
documentation. The input for <b>pcretest</b> is a sequence of regular expression | and |
patterns and strings to be matched, as described below. The output shows the | <a href="pcre32.html"><b>pcre32</b></a> |
result of each match. Options on the command line and the patterns control PCRE | documentation. |
options and exactly what is output. | |
</P> |
</P> |
<br><a name="SEC2" href="#TOC1">PCRE's 8-BIT and 16-BIT LIBRARIES</a><br> |
|
<P> |
<P> |
|
The input for <b>pcretest</b> is a sequence of regular expression patterns and |
|
strings to be matched, as described below. The output shows the result of each |
|
match. Options on the command line and the patterns control PCRE options and |
|
exactly what is output. |
|
</P> |
|
<P> |
|
As PCRE has evolved, it has acquired many different features, and as a result, |
|
<b>pcretest</b> now has rather a lot of obscure options for testing every |
|
possible feature. Some of these options are specifically designed for use in |
|
conjunction with the test script and data files that are distributed as part of |
|
PCRE, and are unlikely to be of use otherwise. They are all documented here, |
|
but without much justification. |
|
</P> |
|
<br><a name="SEC2" href="#TOC1">INPUT DATA FORMAT</a><br> |
|
<P> |
|
Input to <b>pcretest</b> is processed line by line, either by calling the C |
|
library's <b>fgets()</b> function, or via the <b>libreadline</b> library (see |
|
below). In Unix-like environments, <b>fgets()</b> treats any bytes other than |
|
newline as data characters. However, in some Windows environments character 26 |
|
(hex 1A) causes an immediate end of file, and no further data is read. For |
|
maximum portability, therefore, it is safest to use only ASCII characters in |
|
<b>pcretest</b> input files. |
|
</P> |
|
<br><a name="SEC3" href="#TOC1">PCRE's 8-BIT, 16-BIT AND 32-BIT LIBRARIES</a><br> |
|
<P> |
From release 8.30, two separate PCRE libraries can be built. The original one |
From release 8.30, two separate PCRE libraries can be built. The original one |
supports 8-bit character strings, whereas the newer 16-bit library supports |
supports 8-bit character strings, whereas the newer 16-bit library supports |
character strings encoded in 16-bit units. The <b>pcretest</b> program can be | character strings encoded in 16-bit units. From release 8.32, a third library |
used to test both libraries. However, it is itself still an 8-bit program, | can be built, supporting character strings encoded in 32-bit units. The |
reading 8-bit input and writing 8-bit output. When testing the 16-bit library, | <b>pcretest</b> program can be used to test all three libraries. However, it is |
the patterns and data strings are converted to 16-bit format before being | itself still an 8-bit program, reading 8-bit input and writing 8-bit output. |
passed to the PCRE library functions. Results are converted to 8-bit for | When testing the 16-bit or 32-bit library, the patterns and data strings are |
output. | converted to 16- or 32-bit format before being passed to the PCRE library |
| functions. Results are converted to 8-bit for output. |
</P> |
</P> |
<P> |
<P> |
References to functions and structures of the form <b>pcre[16]_xx</b> below | References to functions and structures of the form <b>pcre[16|32]_xx</b> below |
mean "<b>pcre_xx</b> when using the 8-bit library or <b>pcre16_xx</b> when using | mean "<b>pcre_xx</b> when using the 8-bit library, <b>pcre16_xx</b> when using |
the 16-bit library". | the 16-bit library, or <b>pcre32_xx</b> when using the 32-bit library". |
</P> |
</P> |
<br><a name="SEC3" href="#TOC1">COMMAND LINE OPTIONS</a><br> | <br><a name="SEC4" href="#TOC1">COMMAND LINE OPTIONS</a><br> |
<P> |
<P> |
<b>-16</b> | <b>-8</b> |
If both the 8-bit and the 16-bit libraries have been built, this option causes | If the 8-bit library has been built, this option causes the 8-bit library |
the 16-bit library to be used. If only the 16-bit library has been built, this | to be used (which is the default); if the 8-bit library has not been built, |
is the default (so has no effect). If only the 8-bit library has been built, | |
this option causes an error. |
this option causes an error. |
</P> |
</P> |
<P> |
<P> |
|
<b>-16</b> |
|
If the 16-bit library has been built alongside the 8-bit or the 32-bit library, |

this option causes the 16-bit library to be used. If only the 16-bit library has |

been built, this is the default (so has no effect). If the 16-bit library has not |

been built, this option causes an error. |
|
</P> |
|
<P> |
|
<b>-32</b> |
|
If the 32-bit library has been built alongside the 8-bit or the 16-bit library, |

this option causes the 32-bit library to be used. If only the 32-bit library has |

been built, this is the default (so has no effect). If the 32-bit library has not |

been built, this option causes an error. |
|
</P> |
|
<P> |
<b>-b</b> |
<b>-b</b> |
Behave as if each pattern has the <b>/B</b> (show byte code) modifier; the |
Behave as if each pattern has the <b>/B</b> (show byte code) modifier; the |
internal form is output after compilation. |
internal form is output after compilation. |
Line 82 internal form is output after compilation.
|
Line 120 internal form is output after compilation.
|
<P> |
<P> |
<b>-C</b> |
<b>-C</b> |
Output the version number of the PCRE library, and all available information |
Output the version number of the PCRE library, and all available information |
about the optional features that are included, and then exit. All other options | about the optional features that are included, and then exit with zero exit |
are ignored. | code. All other options are ignored. |
</P> |
</P> |
<P> |
<P> |
<b>-C</b> <i>option</i> |
<b>-C</b> <i>option</i> |
Output information about a specific build-time option, then exit. This |
Output information about a specific build-time option, then exit. This |
functionality is intended for use in scripts such as <b>RunTest</b>. The |
functionality is intended for use in scripts such as <b>RunTest</b>. The |
following options output the value indicated: | following options output the value and set the exit code as indicated: |
<pre> |
<pre> |
linksize the internal link size (2, 3, or 4) | ebcdic-nl the code for LF (= NL) in an EBCDIC environment: |
| 0x15 or 0x25 |
| 0 if used in an ASCII environment |
| exit code is always 0 |
| linksize the configured internal link size (2, 3, or 4) |
| exit code is set to the link size |
newline the default newline setting: |
newline the default newline setting: |
CR, LF, CRLF, ANYCRLF, or ANY |
CR, LF, CRLF, ANYCRLF, or ANY |
|
exit code is always 0 |
</pre> |
</pre> |
The following options output 1 for true or zero for false: | The following options output 1 for true or 0 for false, and set the exit code |
| to the same value: |
<pre> |
<pre> |
|
ebcdic compiled for an EBCDIC environment |
jit just-in-time support is available |
jit just-in-time support is available |
pcre16 the 16-bit library was built |
pcre16 the 16-bit library was built |
|
pcre32 the 32-bit library was built |
pcre8 the 8-bit library was built |
pcre8 the 8-bit library was built |
ucp Unicode property support is available |
ucp Unicode property support is available |
utf UTF-8 and/or UTF-16 support is available | utf UTF-8 and/or UTF-16 and/or UTF-32 support |
</PRE> | is available |
| </pre> |
| If an unknown option is given, an error message is output; the exit code is 0. |
</P> |
</P> |
<P> |
<P> |
<b>-d</b> |
<b>-d</b> |
Line 113 form and information about the compiled pattern is out
|
Line 162 form and information about the compiled pattern is out
|
<P> |
<P> |
<b>-dfa</b> |
<b>-dfa</b> |
Behave as if each data line contains the \D escape sequence; this causes the |
Behave as if each data line contains the \D escape sequence; this causes the |
alternative matching function, <b>pcre[16]_dfa_exec()</b>, to be used instead of | alternative matching function, <b>pcre[16|32]_dfa_exec()</b>, to be used instead |
the standard <b>pcre[16]_exec()</b> function (more detail is given below). | of the standard <b>pcre[16|32]_exec()</b> function (more detail is given below). |
</P> |
</P> |
<P> |
<P> |
<b>-help</b> |
<b>-help</b> |
Line 129 compiled pattern is given after compilation.
|
Line 178 compiled pattern is given after compilation.
|
<b>-M</b> |
<b>-M</b> |
Behave as if each data line contains the \M escape sequence; this causes |
Behave as if each data line contains the \M escape sequence; this causes |
PCRE to discover the minimum MATCH_LIMIT and MATCH_LIMIT_RECURSION settings by |
PCRE to discover the minimum MATCH_LIMIT and MATCH_LIMIT_RECURSION settings by |
calling <b>pcre[16]_exec()</b> repeatedly with different limits. | calling <b>pcre[16|32]_exec()</b> repeatedly with different limits. |
</P> |
</P> |
<P> |
<P> |
<b>-m</b> |
<b>-m</b> |
Line 140 bytes for both libraries.
|
Line 189 bytes for both libraries.
|
<P> |
<P> |
<b>-o</b> <i>osize</i> |
<b>-o</b> <i>osize</i> |
Set the number of elements in the output vector that is used when calling |
Set the number of elements in the output vector that is used when calling |
<b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> to be <i>osize</i>. The | <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> to be <i>osize</i>. The |
default value is 45, which is enough for 14 capturing subexpressions for |
default value is 45, which is enough for 14 capturing subexpressions for |
<b>pcre[16]_exec()</b> or 22 different matches for <b>pcre[16]_dfa_exec()</b>. | <b>pcre[16|32]_exec()</b> or 22 different matches for |
| <b>pcre[16|32]_dfa_exec()</b>. |
The vector size can be changed for individual matching calls by including \O |
The vector size can be changed for individual matching calls by including \O |
in the data line (see below). |
in the data line (see below). |
</P> |
</P> |
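The figures quoted for the default <i>osize</i> of 45 can be checked with a little arithmetic. The sketch below is illustrative only and is not part of pcretest. It assumes that the standard matching function uses only the first two-thirds of the vector for offset pairs (the rest being workspace), with the first pair describing the whole match, and that the alternative matching function uses the whole vector for pairs. The function names are hypothetical.

```python
def exec_capture_slots(osize: int) -> int:
    """Capturing subexpressions that fit in a vector of osize elements,
    assuming only the first two-thirds of the vector hold offset pairs."""
    usable = (osize * 2) // 3      # the remaining third is assumed to be workspace
    pairs = usable // 2            # each result needs a start and an end offset
    return max(pairs - 1, 0)       # pair 0 describes the whole match


def dfa_match_slots(osize: int) -> int:
    """Matches that fit when every element is available for offset pairs."""
    return osize // 2


print(exec_capture_slots(45))  # 14
print(dfa_match_slots(45))     # 22
```

Under these assumptions, raising <i>osize</i> by 3 makes room for one more capturing subexpression in the standard function.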
Line 165 megabytes.
|
Line 215 megabytes.
|
<b>-s</b> or <b>-s+</b> |
<b>-s</b> or <b>-s+</b> |
Behave as if each pattern has the <b>/S</b> modifier; in other words, force each |
Behave as if each pattern has the <b>/S</b> modifier; in other words, force each |
pattern to be studied. If <b>-s+</b> is used, all the JIT compile options are |
pattern to be studied. If <b>-s+</b> is used, all the JIT compile options are |
passed to <b>pcre[16]_study()</b>, causing just-in-time optimization to be set | passed to <b>pcre[16|32]_study()</b>, causing just-in-time optimization to be set |
up if it is available, for both full and partial matching. Specific JIT compile |
up if it is available, for both full and partial matching. Specific JIT compile |
options can be selected by following <b>-s+</b> with a digit in the range 1 to |
options can be selected by following <b>-s+</b> with a digit in the range 1 to |
7, which selects the JIT compile modes as follows: |
7, which selects the JIT compile modes as follows: |
Line 180 options can be selected by following <b>-s+</b> with a
|
Line 230 options can be selected by following <b>-s+</b> with a
|
If <b>-s++</b> is used instead of <b>-s+</b> (with or without a following digit), |
If <b>-s++</b> is used instead of <b>-s+</b> (with or without a following digit), |
the text "(JIT)" is added to the first output line after a match or no match |
the text "(JIT)" is added to the first output line after a match or no match |
when JIT-compiled code was actually used. |
when JIT-compiled code was actually used. |
</P> | <br> |
<P> | <br> |
| Note that there are pattern options that can override <b>-s</b>, either |
| specifying no studying at all, or suppressing JIT compilation. |
| <br> |
| <br> |
If the <b>/I</b> or <b>/D</b> option is present on a pattern (requesting output |
If the <b>/I</b> or <b>/D</b> option is present on a pattern (requesting output |
about the compiled pattern), information about the result of studying is not |
about the compiled pattern), information about the result of studying is not |
included when studying is caused only by <b>-s</b> and neither <b>-i</b> nor |
included when studying is caused only by <b>-s</b> and neither <b>-i</b> nor |
Line 215 to iterate 500000 times.
|
Line 269 to iterate 500000 times.
|
This is like <b>-t</b> except that it times only the matching phase, not the |
This is like <b>-t</b> except that it times only the matching phase, not the |
compile or study phases. |
compile or study phases. |
</P> |
</P> |
<br><a name="SEC4" href="#TOC1">DESCRIPTION</a><br> | <br><a name="SEC5" href="#TOC1">DESCRIPTION</a><br> |
<P> |
<P> |
If <b>pcretest</b> is given two filename arguments, it reads from the first and |
If <b>pcretest</b> is given two filename arguments, it reads from the first and |
writes to the second. If it is given only one filename argument, it reads from |
writes to the second. If it is given only one filename argument, it reads from |
Line 272 backslash, because
|
Line 326 backslash, because
|
is interpreted as the first line of a pattern that starts with "abc/", causing |
is interpreted as the first line of a pattern that starts with "abc/", causing |
pcretest to read the next line as a continuation of the regular expression. |
pcretest to read the next line as a continuation of the regular expression. |
</P> |
</P> |
<br><a name="SEC5" href="#TOC1">PATTERN MODIFIERS</a><br> | <br><a name="SEC6" href="#TOC1">PATTERN MODIFIERS</a><br> |
<P> |
<P> |
A pattern may be followed by any number of modifiers, which are mostly single |
A pattern may be followed by any number of modifiers, which are mostly single |
characters. Following Perl usage, these are referred to below as, for example, | characters, though some of these can be qualified by further characters. |
"the <b>/i</b> modifier", even though the delimiter of the pattern need not | Following Perl usage, these are referred to below as, for example, "the |
always be a slash, and no slash is used when writing modifiers. White space may | <b>/i</b> modifier", even though the delimiter of the pattern need not always be |
appear between the final pattern delimiter and the first modifier, and between | a slash, and no slash is used when writing modifiers. White space may appear |
the modifiers themselves. | between the final pattern delimiter and the first modifier, and between the |
| modifiers themselves. For reference, here is a complete list of modifiers. They |
| fall into several groups that are described in detail in the following |
| sections. |
| <pre> |
| <b>/8</b> set UTF mode |
| <b>/9</b> set PCRE_NEVER_UTF (locks out UTF mode) |
| <b>/?</b> disable UTF validity check |
| <b>/+</b> show remainder of subject after match |
| <b>/=</b> show all captures (not just those that are set) |
| |
| <b>/A</b> set PCRE_ANCHORED |
| <b>/B</b> show compiled code |
| <b>/C</b> set PCRE_AUTO_CALLOUT |
| <b>/D</b> same as <b>/B</b> plus <b>/I</b> |
| <b>/E</b> set PCRE_DOLLAR_ENDONLY |
| <b>/F</b> flip byte order in compiled pattern |
| <b>/f</b> set PCRE_FIRSTLINE |
| <b>/G</b> find all matches (shorten string) |
| <b>/g</b> find all matches (use startoffset) |
| <b>/I</b> show information about pattern |
| <b>/i</b> set PCRE_CASELESS |
| <b>/J</b> set PCRE_DUPNAMES |
| <b>/K</b> show backtracking control names |
| <b>/L</b> set locale |
| <b>/M</b> show compiled memory size |
| <b>/m</b> set PCRE_MULTILINE |
| <b>/N</b> set PCRE_NO_AUTO_CAPTURE |
| <b>/P</b> use the POSIX wrapper |
| <b>/S</b> study the pattern after compilation |
| <b>/s</b> set PCRE_DOTALL |
| <b>/T</b> select character tables |
| <b>/U</b> set PCRE_UNGREEDY |
| <b>/W</b> set PCRE_UCP |
| <b>/X</b> set PCRE_EXTRA |
| <b>/x</b> set PCRE_EXTENDED |
| <b>/Y</b> set PCRE_NO_START_OPTIMIZE |
| <b>/Z</b> don't show lengths in <b>/B</b> output |
| |
| <b>/<any></b> set PCRE_NEWLINE_ANY |
| <b>/<anycrlf></b> set PCRE_NEWLINE_ANYCRLF |
| <b>/<cr></b> set PCRE_NEWLINE_CR |
| <b>/<crlf></b> set PCRE_NEWLINE_CRLF |
| <b>/<lf></b> set PCRE_NEWLINE_LF |
| <b>/<bsr_anycrlf></b> set PCRE_BSR_ANYCRLF |
| <b>/<bsr_unicode></b> set PCRE_BSR_UNICODE |
| <b>/<JS></b> set PCRE_JAVASCRIPT_COMPAT |
| |
| </PRE> |
</P> |
</P> |
|
<br><b> |
|
Perl-compatible modifiers |
|
</b><br> |
<P> |
<P> |
The <b>/i</b>, <b>/m</b>, <b>/s</b>, and <b>/x</b> modifiers set the PCRE_CASELESS, |
The <b>/i</b>, <b>/m</b>, <b>/s</b>, and <b>/x</b> modifiers set the PCRE_CASELESS, |
PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED options, respectively, when |
PCRE_MULTILINE, PCRE_DOTALL, or PCRE_EXTENDED options, respectively, when |
<b>pcre[16]_compile()</b> is called. These four modifier letters have the same | <b>pcre[16|32]_compile()</b> is called. These four modifier letters have the same |
effect as they do in Perl. For example: |
effect as they do in Perl. For example: |
<pre> |
<pre> |
/caseless/i |
/caseless/i |
</pre> | |
| </PRE> |
| </P> |
| <br><b> |
| Modifiers for other PCRE options |
| </b><br> |
| <P> |
The following table shows additional modifiers for setting PCRE compile-time |
The following table shows additional modifiers for setting PCRE compile-time |
options that do not correspond to anything in Perl: |
options that do not correspond to anything in Perl: |
<pre> |
<pre> |
Line 298 options that do not correspond to anything in Perl:
|
Line 409 options that do not correspond to anything in Perl:
|
<b>/8</b> PCRE_UTF16 ) when using the 16-bit |
<b>/8</b> PCRE_UTF16 ) when using the 16-bit |
<b>/?</b> PCRE_NO_UTF16_CHECK ) library |
<b>/?</b> PCRE_NO_UTF16_CHECK ) library |
|
|
|
<b>/8</b> PCRE_UTF32 ) when using the 32-bit |
|
<b>/?</b> PCRE_NO_UTF32_CHECK ) library |
|
|
|
<b>/9</b> PCRE_NEVER_UTF |
<b>/A</b> PCRE_ANCHORED |
<b>/A</b> PCRE_ANCHORED |
<b>/C</b> PCRE_AUTO_CALLOUT |
<b>/C</b> PCRE_AUTO_CALLOUT |
<b>/E</b> PCRE_DOLLAR_ENDONLY |
<b>/E</b> PCRE_DOLLAR_ENDONLY |
Line 308 options that do not correspond to anything in Perl:
|
Line 423 options that do not correspond to anything in Perl:
|
<b>/W</b> PCRE_UCP |
<b>/W</b> PCRE_UCP |
<b>/X</b> PCRE_EXTRA |
<b>/X</b> PCRE_EXTRA |
<b>/Y</b> PCRE_NO_START_OPTIMIZE |
<b>/Y</b> PCRE_NO_START_OPTIMIZE |
<b>/<JS></b> PCRE_JAVASCRIPT_COMPAT | <b>/<any></b> PCRE_NEWLINE_ANY |
| <b>/<anycrlf></b> PCRE_NEWLINE_ANYCRLF |
<b>/<cr></b> PCRE_NEWLINE_CR |
<b>/<cr></b> PCRE_NEWLINE_CR |
<b>/<lf></b> PCRE_NEWLINE_LF |
|
<b>/<crlf></b> PCRE_NEWLINE_CRLF |
<b>/<crlf></b> PCRE_NEWLINE_CRLF |
<b>/<anycrlf></b> PCRE_NEWLINE_ANYCRLF | <b>/<lf></b> PCRE_NEWLINE_LF |
<b>/<any></b> PCRE_NEWLINE_ANY | |
<b>/<bsr_anycrlf></b> PCRE_BSR_ANYCRLF |
<b>/<bsr_anycrlf></b> PCRE_BSR_ANYCRLF |
<b>/<bsr_unicode></b> PCRE_BSR_UNICODE |
<b>/<bsr_unicode></b> PCRE_BSR_UNICODE |
|
<b>/<JS></b> PCRE_JAVASCRIPT_COMPAT |
</pre> |
</pre> |
The modifiers that are enclosed in angle brackets are literal strings as shown, |
The modifiers that are enclosed in angle brackets are literal strings as shown, |
including the angle brackets, but the letters within can be in either case. |
including the angle brackets, but the letters within can be in either case. |
Line 323 This example sets multiline matching with CRLF as the
|
Line 438 This example sets multiline matching with CRLF as the
|
<pre> |
<pre> |
/^abc/m<CRLF> |
/^abc/m<CRLF> |
</pre> |
</pre> |
As well as turning on the PCRE_UTF8/16 option, the <b>/8</b> modifier causes | As well as turning on the PCRE_UTF8/16/32 option, the <b>/8</b> modifier causes |
all non-printing characters in output strings to be printed using the |
all non-printing characters in output strings to be printed using the |
\x{hh...} notation. Otherwise, those less than 0x100 are output in hex without |
\x{hh...} notation. Otherwise, those less than 0x100 are output in hex without |
the curly brackets. |
the curly brackets. |
Line 341 Searching for all possible matches within each subject
|
Line 456 Searching for all possible matches within each subject
|
by the <b>/g</b> or <b>/G</b> modifier. After finding a match, PCRE is called |
by the <b>/g</b> or <b>/G</b> modifier. After finding a match, PCRE is called |
again to search the remainder of the subject string. The difference between |
again to search the remainder of the subject string. The difference between |
<b>/g</b> and <b>/G</b> is that the former uses the <i>startoffset</i> argument to |
<b>/g</b> and <b>/G</b> is that the former uses the <i>startoffset</i> argument to |
<b>pcre[16]_exec()</b> to start searching at a new point within the entire | <b>pcre[16|32]_exec()</b> to start searching at a new point within the entire |
string (which is in effect what Perl does), whereas the latter passes over a |
string (which is in effect what Perl does), whereas the latter passes over a |
shortened substring. This makes a difference to the matching process if the |
shortened substring. This makes a difference to the matching process if the |
pattern begins with a lookbehind assertion (including \b or \B). |
pattern begins with a lookbehind assertion (including \b or \B). |
</P> |
</P> |
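The difference between the two approaches can be seen outside PCRE as well. The following sketch uses Python's <b>re</b> module as a stand-in, not PCRE itself: passing a start position to <b>search()</b> plays the role of <i>startoffset</i>, and slicing the subject plays the role of the shortened substring. The function names are hypothetical, and empty matches are handled by simply stepping forward one character, which is simpler than what <b>pcretest</b> does.

```python
import re


def all_matches_with_offset(pattern, subject):
    """Analogue of /g: keep the whole subject and advance the start offset."""
    rx = re.compile(pattern)
    found, pos = [], 0
    while pos <= len(subject):
        m = rx.search(subject, pos)
        if m is None:
            break
        found.append(m.group())
        # Step past an empty match so the loop always makes progress.
        pos = m.end() if m.end() > m.start() else m.end() + 1
    return found


def all_matches_with_shortening(pattern, subject):
    """Analogue of /G: pass only the unmatched remainder of the subject."""
    rx = re.compile(pattern)
    found, rest = [], subject
    while True:
        m = rx.search(rest)
        if m is None:
            break
        found.append(m.group())
        cut = m.end() if m.end() > m.start() else m.end() + 1
        if cut > len(rest):
            break
        rest = rest[cut:]          # earlier characters are no longer visible
    return found


# "b not preceded by b": the lookbehind can see the first b only when the
# whole subject is retained.
print(all_matches_with_offset(r"(?<!b)b", "bb"))      # ['b']
print(all_matches_with_shortening(r"(?<!b)b", "bb"))  # ['b', 'b']
```

With the offset approach the second "b" is rejected because the lookbehind still sees the first one. With the shortened substring that context is lost and a second match is reported.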
<P> |
<P> |
If any call to <b>pcre[16]_exec()</b> in a <b>/g</b> or <b>/G</b> sequence matches | If any call to <b>pcre[16|32]_exec()</b> in a <b>/g</b> or <b>/G</b> sequence matches |
an empty string, the next call is done with the PCRE_NOTEMPTY_ATSTART and |
an empty string, the next call is done with the PCRE_NOTEMPTY_ATSTART and |
PCRE_ANCHORED flags set in order to search for another, non-empty, match at the |
PCRE_ANCHORED flags set in order to search for another, non-empty, match at the |
same point. If this second match fails, the start offset is advanced, and the |
same point. If this second match fails, the start offset is advanced, and the |
Line 378 modifier because /S+ and /S++ have other meanings.
|
Line 493 modifier because /S+ and /S++ have other meanings.
|
The <b>/=</b> modifier requests that the values of all potential captured |
The <b>/=</b> modifier requests that the values of all potential captured |
parentheses be output after a match. By default, only those up to the highest |
parentheses be output after a match. By default, only those up to the highest |
one actually used in the match are output (corresponding to the return code |
one actually used in the match are output (corresponding to the return code |
from <b>pcre[16]_exec()</b>). Values in the offsets vector corresponding to | from <b>pcre[16|32]_exec()</b>). Values in the offsets vector corresponding to |
higher numbers should be set to -1, and these are output as "<unset>". This |
higher numbers should be set to -1, and these are output as "<unset>". This |
modifier gives a way of checking that this is happening. |
modifier gives a way of checking that this is happening. |
</P> |
</P> |
Line 406 below.
|
Line 521 below.
|
<P> |
<P> |
The <b>/I</b> modifier requests that <b>pcretest</b> output information about the |
The <b>/I</b> modifier requests that <b>pcretest</b> output information about the |
compiled pattern (whether it is anchored, has a fixed first character, and |
compiled pattern (whether it is anchored, has a fixed first character, and |
so on). It does this by calling <b>pcre[16]_fullinfo()</b> after compiling a | so on). It does this by calling <b>pcre[16|32]_fullinfo()</b> after compiling a |
pattern. If the pattern is studied, the results of that are also output. |
pattern. If the pattern is studied, the results of that are also output. |
</P> |
</P> |
<P> |
<P> |
The <b>/K</b> modifier requests <b>pcretest</b> to show names from backtracking |
The <b>/K</b> modifier requests <b>pcretest</b> to show names from backtracking |
control verbs that are returned from calls to <b>pcre[16]_exec()</b>. It causes | control verbs that are returned from calls to <b>pcre[16|32]_exec()</b>. It causes |
<b>pcretest</b> to create a <b>pcre[16]_extra</b> block if one has not already | <b>pcretest</b> to create a <b>pcre[16|32]_extra</b> block if one has not already |
been created by a call to <b>pcre[16]_study()</b>, and to set the | been created by a call to <b>pcre[16|32]_study()</b>, and to set the |
PCRE_EXTRA_MARK flag and the <b>mark</b> field within it, every time that |
PCRE_EXTRA_MARK flag and the <b>mark</b> field within it, every time that |
<b>pcre[16]_exec()</b> is called. If the variable that the <b>mark</b> field | <b>pcre[16|32]_exec()</b> is called. If the variable that the <b>mark</b> field |
points to is non-NULL for a match, non-match, or partial match, <b>pcretest</b> |
points to is non-NULL for a match, non-match, or partial match, <b>pcretest</b> |
prints the string to which it points. For a match, this is shown on a line by |
prints the string to which it points. For a match, this is shown on a line by |
itself, tagged with "MK:". For a non-match it is added to the message. |
itself, tagged with "MK:". For a non-match it is added to the message. |
Line 427 example,
|
Line 542 example,
|
/pattern/Lfr_FR |
/pattern/Lfr_FR |
</pre> |
</pre> |
For this reason, it must be the last modifier. The given locale is set, |
For this reason, it must be the last modifier. The given locale is set, |
<b>pcre[16]_maketables()</b> is called to build a set of character tables for | <b>pcre[16|32]_maketables()</b> is called to build a set of character tables for |
the locale, and this is then passed to <b>pcre[16]_compile()</b> when compiling | the locale, and this is then passed to <b>pcre[16|32]_compile()</b> when compiling |
the regular expression. Without an <b>/L</b> (or <b>/T</b>) modifier, NULL is |
the regular expression. Without an <b>/L</b> (or <b>/T</b>) modifier, NULL is |
passed as the tables pointer; that is, <b>/L</b> applies only to the expression |
passed as the tables pointer; that is, <b>/L</b> applies only to the expression |
on which it appears. |
on which it appears. |
Line 436 on which it appears.
|
Line 551 on which it appears.
|
<P> |
<P> |
The <b>/M</b> modifier causes the size in bytes of the memory block used to hold |
The <b>/M</b> modifier causes the size in bytes of the memory block used to hold |
the compiled pattern to be output. This does not include the size of the |
the compiled pattern to be output. This does not include the size of the |
<b>pcre[16]</b> block; it is just the actual compiled data. If the pattern is | <b>pcre[16|32]</b> block; it is just the actual compiled data. If the pattern is |
successfully studied with the PCRE_STUDY_JIT_COMPILE option, the size of the |
successfully studied with the PCRE_STUDY_JIT_COMPILE option, the size of the |
JIT compiled code is also output. |
JIT compiled code is also output. |
</P> |
</P> |
<P> |
<P> |
If the <b>/S</b> modifier appears once, it causes <b>pcre[16]_study()</b> to be | The <b>/S</b> modifier causes <b>pcre[16|32]_study()</b> to be called after the |
called after the expression has been compiled, and the results used when the | expression has been compiled, and the results used when the expression is |
expression is matched. If <b>/S</b> appears twice, it suppresses studying, even | matched. There are a number of qualifying characters that may follow <b>/S</b>. |
| They may appear in any order. |
| </P> |
| <P> |
| If <b>/S</b> is followed by an exclamation mark, <b>pcre[16|32]_study()</b> is called |
| with the PCRE_STUDY_EXTRA_NEEDED option, causing it always to return a |
| <b>pcre_extra</b> block, even when studying discovers no useful information. |
| </P> |
| <P> |
| If <b>/S</b> is followed by a second S character, it suppresses studying, even |
if it was requested externally by the <b>-s</b> command line option. This makes |
if it was requested externally by the <b>-s</b> command line option. This makes |
it possible to specify that certain patterns are always studied, and others are |
it possible to specify that certain patterns are always studied, and others are |
never studied, independently of <b>-s</b>. This feature is used in the test |
never studied, independently of <b>-s</b>. This feature is used in the test |
files in a few cases where the output is different when the pattern is studied. |
files in a few cases where the output is different when the pattern is studied. |
</P> |
</P> |
<P> |
<P> |
If the <b>/S</b> modifier is immediately followed by a + character, the call to | If the <b>/S</b> modifier is followed by a + character, the call to |
<b>pcre[16]_study()</b> is made with all the JIT study options, requesting | <b>pcre[16|32]_study()</b> is made with all the JIT study options, requesting |
just-in-time optimization support if it is available, for both normal and |
just-in-time optimization support if it is available, for both normal and |
partial matching. If you want to restrict the JIT compiling modes, you can |
partial matching. If you want to restrict the JIT compiling modes, you can |
follow <b>/S+</b> with a digit in the range 1 to 7: |
follow <b>/S+</b> with a digit in the range 1 to 7: |
Line 473 immediately after <b>/S</b> or <b>/S+</b> because this
|
Line 597 immediately after <b>/S</b> or <b>/S+</b> because this
|
</P> |
</P> |
<P> |
<P> |
If JIT studying is successful, the compiled JIT code will automatically be used |
If JIT studying is successful, the compiled JIT code will automatically be used |
when <b>pcre[16]_exec()</b> is run, except when incompatible run-time options | when <b>pcre[16|32]_exec()</b> is run, except when incompatible run-time options |
are specified. For more details, see the |
are specified. For more details, see the |
<a href="pcrejit.html"><b>pcrejit</b></a> |
<a href="pcrejit.html"><b>pcrejit</b></a> |
documentation. See also the <b>\J</b> escape sequence below for a way of |
documentation. See also the <b>\J</b> escape sequence below for a way of |
setting the size of the JIT stack. |
setting the size of the JIT stack. |
</P> |
</P> |
<P> |
<P> |
|
Finally, if <b>/S</b> is followed by a minus character, JIT compilation is |
|
suppressed, even if it was requested externally by the <b>-s</b> command line |
|
option. This makes it possible to specify that JIT is never to be used for |
|
certain patterns. |
|
</P> |
|
<P> |
The <b>/T</b> modifier must be followed by a single digit. It causes a specific |
The <b>/T</b> modifier must be followed by a single digit. It causes a specific |
set of built-in character tables to be passed to <b>pcre[16]_compile()</b>. It | set of built-in character tables to be passed to <b>pcre[16|32]_compile()</b>. It |
is used in the standard PCRE tests to check behaviour with different character |
is used in the standard PCRE tests to check behaviour with different character |
tables. The digit specifies the tables as follows: |
tables. The digit specifies the tables as follows: |
<pre> |
<pre> |
Line 512 function:
|
Line 642 function:
|
The <b>/+</b> modifier works as described above. All other modifiers are |
The <b>/+</b> modifier works as described above. All other modifiers are |
ignored. |
ignored. |
</P> |
</P> |
<br><a name="SEC6" href="#TOC1">DATA LINES</a><br> | <br><a name="SEC7" href="#TOC1">DATA LINES</a><br> |
<P> |
<P> |
Before each data line is passed to <b>pcre[16]_exec()</b>, leading and trailing | Before each data line is passed to <b>pcre[16|32]_exec()</b>, leading and trailing |
white space is removed, and it is then scanned for \ escapes. Some of these |
white space is removed, and it is then scanned for \ escapes. Some of these |
are pretty esoteric features, intended for checking out some of the more |
are pretty esoteric features, intended for checking out some of the more |
complicated features of PCRE. If you are just testing "ordinary" regular |
complicated features of PCRE. If you are just testing "ordinary" regular |
Line 531 recognized:
|
Line 661 recognized:
|
\t tab (\x09) |
\t tab (\x09) |
\v vertical tab (\x0b) |
\v vertical tab (\x0b) |
\nnn octal character (up to 3 octal digits); always |
\nnn octal character (up to 3 octal digits); always |
a byte unless > 255 in UTF-8 or 16-bit mode | a byte unless > 255 in UTF-8 or 16-bit or 32-bit mode |
\xhh hexadecimal byte (up to 2 hex digits) |
\xhh hexadecimal byte (up to 2 hex digits) |
\x{hh...} hexadecimal character (any number of hex digits) |
\x{hh...} hexadecimal character (any number of hex digits) |
\A pass the PCRE_ANCHORED option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \A pass the PCRE_ANCHORED option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\B pass the PCRE_NOTBOL option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \B pass the PCRE_NOTBOL option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\Cdd call pcre[16]_copy_substring() for substring dd after a successful match (number less than 32) | \Cdd call pcre[16|32]_copy_substring() for substring dd after a successful match (number less than 32) |
\Cname call pcre[16]_copy_named_substring() for substring "name" after a successful match (name terminated | \Cname call pcre[16|32]_copy_named_substring() for substring "name" after a successful match (name terminated |
by next non-alphanumeric character) |
by next non-alphanumeric character) |
\C+ show the current captured substrings at callout time |
\C+ show the current captured substrings at callout time |
\C- do not supply a callout function |
\C- do not supply a callout function |
\C!n return 1 instead of 0 when callout number n is reached |
\C!n return 1 instead of 0 when callout number n is reached |
\C!n!m return 1 instead of 0 when callout number n is reached for the mth time |
\C!n!m return 1 instead of 0 when callout number n is reached for the mth time |
\C*n pass the number n (may be negative) as callout data; this is used as the callout return value |
\C*n pass the number n (may be negative) as callout data; this is used as the callout return value |
\D use the <b>pcre[16]_dfa_exec()</b> match function | \D use the <b>pcre[16|32]_dfa_exec()</b> match function |
\F only shortest match for <b>pcre[16]_dfa_exec()</b> | \F only shortest match for <b>pcre[16|32]_dfa_exec()</b> |
\Gdd call pcre[16]_get_substring() for substring dd after a successful match (number less than 32) | \Gdd call pcre[16|32]_get_substring() for substring dd after a successful match (number less than 32) |
\Gname call pcre[16]_get_named_substring() for substring "name" after a successful match (name terminated | \Gname call pcre[16|32]_get_named_substring() for substring "name" after a successful match (name terminated |
by next non-alphanumeric character) |
by next non-alphanumeric character) |
\Jdd set up a JIT stack of dd kilobytes maximum (any number of digits) |
\Jdd set up a JIT stack of dd kilobytes maximum (any number of digits) |
\L call pcre[16]_get_substringlist() after a successful match | \L call pcre[16|32]_get_substringlist() after a successful match |
\M discover the minimum MATCH_LIMIT and MATCH_LIMIT_RECURSION settings |
\M discover the minimum MATCH_LIMIT and MATCH_LIMIT_RECURSION settings |
\N pass the PCRE_NOTEMPTY option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b>; if used twice, pass the | \N pass the PCRE_NOTEMPTY option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b>; if used twice, pass the |
PCRE_NOTEMPTY_ATSTART option |
PCRE_NOTEMPTY_ATSTART option |
\Odd set the size of the output vector passed to <b>pcre[16]_exec()</b> to dd (any number of digits) | \Odd set the size of the output vector passed to <b>pcre[16|32]_exec()</b> to dd (any number of digits) |
\P pass the PCRE_PARTIAL_SOFT option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b>; if used twice, pass the | \P pass the PCRE_PARTIAL_SOFT option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b>; if used twice, pass the |
PCRE_PARTIAL_HARD option |
PCRE_PARTIAL_HARD option |
\Qdd set the PCRE_MATCH_LIMIT_RECURSION limit to dd (any number of digits) |
\Qdd set the PCRE_MATCH_LIMIT_RECURSION limit to dd (any number of digits) |
\R pass the PCRE_DFA_RESTART option to <b>pcre[16]_dfa_exec()</b> | \R pass the PCRE_DFA_RESTART option to <b>pcre[16|32]_dfa_exec()</b> |
\S output details of memory get/free calls during matching |
\S output details of memory get/free calls during matching |
\Y pass the PCRE_NO_START_OPTIMIZE option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \Y pass the PCRE_NO_START_OPTIMIZE option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\Z pass the PCRE_NOTEOL option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \Z pass the PCRE_NOTEOL option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\? pass the PCRE_NO_UTF[8|16]_CHECK option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \? pass the PCRE_NO_UTF[8|16|32]_CHECK option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\>dd start the match at offset dd (optional "-"; then any number of digits); this sets the <i>startoffset</i> |
\>dd start the match at offset dd (optional "-"; then any number of digits); this sets the <i>startoffset</i> |
argument for <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | argument for <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\<cr> pass the PCRE_NEWLINE_CR option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \<cr> pass the PCRE_NEWLINE_CR option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\<lf> pass the PCRE_NEWLINE_LF option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \<lf> pass the PCRE_NEWLINE_LF option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\<crlf> pass the PCRE_NEWLINE_CRLF option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \<crlf> pass the PCRE_NEWLINE_CRLF option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\<anycrlf> pass the PCRE_NEWLINE_ANYCRLF option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \<anycrlf> pass the PCRE_NEWLINE_ANYCRLF option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
\<any> pass the PCRE_NEWLINE_ANY option to <b>pcre[16]_exec()</b> or <b>pcre[16]_dfa_exec()</b> | \<any> pass the PCRE_NEWLINE_ANY option to <b>pcre[16|32]_exec()</b> or <b>pcre[16|32]_dfa_exec()</b> |
</pre> |
</pre> |
The use of \x{hh...} is not dependent on the use of the <b>/8</b> modifier on |
The use of \x{hh...} is not dependent on the use of the <b>/8</b> modifier on |
the pattern. It is always recognized. There may be any number of hexadecimal |
the pattern. It is always recognized. There may be any number of hexadecimal |
Line 588 In UTF-16 mode, all 4-digit \x{hhhh} values are accept
|
Line 718 In UTF-16 mode, all 4-digit \x{hhhh} values are accept
|
possible to construct invalid UTF-16 sequences for testing purposes. |
possible to construct invalid UTF-16 sequences for testing purposes. |
</P> |
</P> |
<P> |
<P> |
|
In UTF-32 mode, all 4- to 8-digit \x{...} values are accepted. This makes it |
|
possible to construct invalid UTF-32 sequences for testing purposes. |
|
</P> |
|
<P> |
The escapes that specify line ending sequences are literal strings, exactly as |
The escapes that specify line ending sequences are literal strings, exactly as |
shown. No more than one newline setting should be present in any data line. |
shown. No more than one newline setting should be present in any data line. |
</P> |
</P> |
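The character escapes listed in the table above (\t, \v, \nnn, \xhh and \x{hh...}) can be illustrated with a small Python sketch. This is not pcretest's code: the names <b>decode_data_line</b> and <b>ESCAPE</b> are hypothetical, the option-setting escapes such as \A or \B are deliberately left undecoded, and the sketch returns plain code points without modelling the byte versus wide-character distinction between the 8-bit, 16-bit and 32-bit modes.

```python
import re

# Only the character escapes from the table are recognized here (assumption:
# everything else, including option escapes like \A, is left as literal text).
ESCAPE = re.compile(r"""\\(?:
      (?P<simple>[tv])                 # \t and \v
    | (?P<octal>[0-7]{1,3})            # \nnn, up to 3 octal digits
    | x\{(?P<braced>[0-9A-Fa-f]+)\}    # \x{hh...}, any number of hex digits
    | x(?P<hex>[0-9A-Fa-f]{1,2})       # \xhh, up to 2 hex digits
)""", re.VERBOSE)

SIMPLE = {"t": 0x09, "v": 0x0B}


def decode_data_line(line):
    """Return the list of code points for one data line (simplified)."""
    text = line.strip()  # leading and trailing white space is removed first
    out, i = [], 0
    while i < len(text):
        m = ESCAPE.match(text, i)
        if m is None:
            # Ordinary character, or an escape this sketch does not handle.
            out.append(ord(text[i]))
            i += 1
            continue
        if m.group("simple"):
            out.append(SIMPLE[m.group("simple")])
        elif m.group("octal"):
            out.append(int(m.group("octal"), 8))
        else:
            out.append(int(m.group("braced") or m.group("hex"), 16))
        i = m.end()
    return out


print(decode_data_line(r"  a\tb\x41\101\x{100}  "))
```

The braced alternative is tried before the two-digit one so that \x{100} is read as a single value rather than as \x followed by literal text.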
Line 604 is not being used. Providing a stack that is larger th
|
Line 738 is not being used. Providing a stack that is larger th
|
necessary only for very complicated patterns. |
necessary only for very complicated patterns. |
</P> |
</P> |
<P> |
<P> |
If \M is present, <b>pcretest</b> calls <b>pcre[16]_exec()</b> several times, | If \M is present, <b>pcretest</b> calls <b>pcre[16|32]_exec()</b> several times, |
with different values in the <i>match_limit</i> and <i>match_limit_recursion</i> |
with different values in the <i>match_limit</i> and <i>match_limit_recursion</i> |
fields of the <b>pcre[16]_extra</b> data structure, until it finds the minimum | fields of the <b>pcre[16|32]_extra</b> data structure, until it finds the minimum |
numbers for each parameter that allow <b>pcre[16]_exec()</b> to complete without | numbers for each parameter that allow <b>pcre[16|32]_exec()</b> to complete without |
error. Because this is testing a specific feature of the normal interpretive |
error. Because this is testing a specific feature of the normal interpretive |
<b>pcre[16]_exec()</b> execution, the use of any JIT optimization that might | <b>pcre[16|32]_exec()</b> execution, the use of any JIT optimization that might |
have been set up by the <b>/S+</b> qualifier or the <b>-s+</b> option is disabled. |
have been set up by the <b>/S+</b> qualifier or the <b>-s+</b> option is disabled. |
</P> |
</P> |
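The text above says only that the match function is called several times with different limit values until the minimum is found. One plausible way to organize such a search, shown here as a Python sketch rather than as pcretest's actual procedure, is to double the limit until a run succeeds and then bisect. The callable <b>succeeds</b> is a hypothetical stand-in for "run the match with this limit and report whether it completed without error", and the sketch assumes that any limit larger than a successful one also succeeds.

```python
def minimum_limit(succeeds, start=64, ceiling=2**31 - 1):
    """Smallest limit in [1, ceiling] for which succeeds(limit) is true.

    Returns None if even the ceiling is not enough. `start` and `ceiling`
    are illustrative defaults, not values taken from the documentation.
    """
    if not succeeds(ceiling):
        return None
    lo, hi = 0, min(start, ceiling)   # lo is known to fail (or is 0)
    while not succeeds(hi):           # grow until a limit is large enough
        lo, hi = hi, min(hi * 2, ceiling)
    while hi - lo > 1:                # bisect between failing lo and passing hi
        mid = (lo + hi) // 2
        if succeeds(mid):
            hi = mid
        else:
            lo = mid
    return hi


# Stand-in for a match that needs a limit of at least 1000 to complete.
print(minimum_limit(lambda limit: limit >= 1000))
```

The same routine would be run once per parameter, for example first for <i>match_limit</i> and then for <i>match_limit_recursion</i>.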
<P> |
<P> |
Line 624 needed to complete the match attempt.
|
Line 758 needed to complete the match attempt.
|
<P> |
<P> |
When \O is used, the value specified may be higher or lower than the size set |
When \O is used, the value specified may be higher or lower than the size set |
by the <b>-O</b> command line option (or defaulted to 45); \O applies only to |
by the <b>-O</b> command line option (or defaulted to 45); \O applies only to |
the call of <b>pcre[16]_exec()</b> for the line in which it appears. | the call of <b>pcre[16|32]_exec()</b> for the line in which it appears. |
</P> |
</P> |
<P> |
<P> |
If the <b>/P</b> modifier was present on the pattern, causing the POSIX wrapper |
If the <b>/P</b> modifier was present on the pattern, causing the POSIX wrapper |
Line 632 API to be used, the only option-setting sequences that
|
Line 766 API to be used, the only option-setting sequences that
|
\N, and \Z, causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, |
\N, and \Z, causing REG_NOTBOL, REG_NOTEMPTY, and REG_NOTEOL, respectively, |
to be passed to <b>regexec()</b>. |
to be passed to <b>regexec()</b>. |
</P> |
</P> |
<br><a name="SEC7" href="#TOC1">THE ALTERNATIVE MATCHING FUNCTION</a><br> | <br><a name="SEC8" href="#TOC1">THE ALTERNATIVE MATCHING FUNCTION</a><br> |
<P> |
<P> |
By default, <b>pcretest</b> uses the standard PCRE matching function, |
By default, <b>pcretest</b> uses the standard PCRE matching function, |
<b>pcre[16]_exec()</b> to match each data line. PCRE also supports an | <b>pcre[16|32]_exec()</b> to match each data line. PCRE also supports an |
alternative matching function, <b>pcre[16]_dfa_exec()</b>, which operates in a | alternative matching function, <b>pcre[16|32]_dfa_exec()</b>, which operates in a |
different way, and has some restrictions. The differences between the two |
different way, and has some restrictions. The differences between the two |
functions are described in the |
functions are described in the |
<a href="pcrematching.html"><b>pcrematching</b></a> |
<a href="pcrematching.html"><b>pcrematching</b></a> |
Line 649 This function finds all possible matches at a given po
|
Line 783 This function finds all possible matches at a given po
|
escape sequence is present in the data line, it stops after the first match is |
escape sequence is present in the data line, it stops after the first match is |
found. This is always the shortest possible match. |
found. This is always the shortest possible match. |
</P> |
</P> |
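The difference between reporting every match at one starting point and stopping at the shortest one can be imitated in Python. The sketch below uses Python's <b>re</b> module, which is not PCRE and has no DFA matcher, so it simply tests every possible end position; the function name <b>matches_at_first_point</b> is hypothetical, and assertions that look past the end of a candidate match would behave differently from the real function.

```python
import re


def matches_at_first_point(pattern, subject, shortest_only=False):
    """All matches starting at the first position where any match exists.

    Results are listed longest first, as pcretest lists them; with
    shortest_only=True only the shortest match is returned.
    """
    compiled = re.compile(pattern)
    for start in range(len(subject) + 1):
        # Every end position for which subject[start:end] matches exactly.
        ends = [end for end in range(start, len(subject) + 1)
                if compiled.fullmatch(subject, start, end)]
        if ends:
            found = [subject[start:end] for end in ends]  # shortest first
            return found[:1] if shortest_only else found[::-1]
    return []


print(matches_at_first_point(r"tang|tangerine|tan", "yellow tangerine"))
print(matches_at_first_point(r"tang|tangerine|tan", "yellow tangerine",
                             shortest_only=True))
```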
<br><a name="SEC8" href="#TOC1">DEFAULT OUTPUT FROM PCRETEST</a><br> | <br><a name="SEC9" href="#TOC1">DEFAULT OUTPUT FROM PCRETEST</a><br> |
<P> |
<P> |
This section describes the output when the normal matching function, |
This section describes the output when the normal matching function, |
<b>pcre[16]_exec()</b>, is being used. | <b>pcre[16|32]_exec()</b>, is being used. |
</P> |
</P> |
<P> |
<P> |
When a match succeeds, <b>pcretest</b> outputs the list of captured substrings |
When a match succeeds, <b>pcretest</b> outputs the list of captured substrings |
that <b>pcre[16]_exec()</b> returns, starting with number 0 for the string that | that <b>pcre[16|32]_exec()</b> returns, starting with number 0 for the string that |
matched the whole pattern. Otherwise, it outputs "No match" when the return is |
matched the whole pattern. Otherwise, it outputs "No match" when the return is |
PCRE_ERROR_NOMATCH, and "Partial match:" followed by the partially matching |
PCRE_ERROR_NOMATCH, and "Partial match:" followed by the partially matching |
substring when <b>pcre[16]_exec()</b> returns PCRE_ERROR_PARTIAL. (Note that | substring when <b>pcre[16|32]_exec()</b> returns PCRE_ERROR_PARTIAL. (Note that |
this is the entire substring that was inspected during the partial match; it |
this is the entire substring that was inspected during the partial match; it |
may include characters before the actual match start if a lookbehind assertion, |
may include characters before the actual match start if a lookbehind assertion, |
\K, \b, or \B was involved.) For any other return, <b>pcretest</b> outputs |
\K, \b, or \B was involved.) For any other return, <b>pcretest</b> outputs |
Line 679 at least two. Here is an example of an interactive <b>
|
Line 813 at least two. Here is an example of an interactive <b>
|
No match |
No match |
</pre> |
</pre> |
Unset capturing substrings that are not followed by one that is set are not |
Unset capturing substrings that are not followed by one that is set are not |
returned by <b>pcre[16]_exec()</b>, and are not shown by <b>pcretest</b>. In the | returned by <b>pcre[16|32]_exec()</b>, and are not shown by <b>pcretest</b>. In the |
following example, there are two capturing substrings, but when the first data |
following example, there are two capturing substrings, but when the first data |
line is matched, the second, unset substring is not shown. An "internal" unset |
line is matched, the second, unset substring is not shown. An "internal" unset |
substring is shown as "<unset>", as for the second data line. |
substring is shown as "<unset>", as for the second data line. |
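The rule for unset substrings can be sketched in Python as follows. The function <b>format_captures</b> is hypothetical and takes a list in which an unset group is <b>None</b>; the two-column numbering is an approximation of the layout pcretest prints.

```python
def format_captures(captures):
    """Format captured substrings, one per line.

    captures[i] is the text of substring i, or None if it is unset.
    Trailing unset substrings are dropped; internal ones show as <unset>.
    """
    last_set = max((i for i, c in enumerate(captures) if c is not None),
                   default=-1)
    lines = []
    for i in range(last_set + 1):
        shown = captures[i] if captures[i] is not None else "<unset>"
        lines.append(f"{i:2d}: {shown}")
    return lines


# Two capturing substrings; the second is unset for the first data line.
print("\n".join(format_captures(["a", "a", None])))
# Here the first capturing substring is unset but followed by a set one.
print("\n".join(format_captures(["b", None, "b"])))
```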
Line 740 prompt is used for continuations), data lines may not.
|
Line 874 prompt is used for continuations), data lines may not.
|
included in data by means of the \n escape (or \r, \r\n, etc., depending on |
included in data by means of the \n escape (or \r, \r\n, etc., depending on |
the newline sequence setting). |
the newline sequence setting). |
</P> |
</P> |
<br><a name="SEC9" href="#TOC1">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a><br> | <br><a name="SEC10" href="#TOC1">OUTPUT FROM THE ALTERNATIVE MATCHING FUNCTION</a><br> |
<P> |
<P> |
When the alternative matching function, <b>pcre[16]_dfa_exec()</b>, is used (by | When the alternative matching function, <b>pcre[16|32]_dfa_exec()</b>, is used (by |
means of the \D escape sequence or the <b>-dfa</b> command line option), the |
means of the \D escape sequence or the <b>-dfa</b> command line option), the |
output consists of a list of all the matches that start at the first point in |
output consists of a list of all the matches that start at the first point in |
the subject where there is at least one match. For example: |
the subject where there is at least one match. For example: |
Line 776 at the end of the longest match. For example:
|
Line 910 at the end of the longest match. For example:
|
Since the matching function does not support substring capture, the escape |
Since the matching function does not support substring capture, the escape |
sequences that are concerned with captured substrings are not relevant. |
sequences that are concerned with captured substrings are not relevant. |
</P> |
</P> |
<br><a name="SEC10" href="#TOC1">RESTARTING AFTER A PARTIAL MATCH</a><br> | <br><a name="SEC11" href="#TOC1">RESTARTING AFTER A PARTIAL MATCH</a><br> |
<P> |
<P> |
When the alternative matching function has given the PCRE_ERROR_PARTIAL return, |
When the alternative matching function has given the PCRE_ERROR_PARTIAL return, |
indicating that the subject partially matched the pattern, you can restart the |
indicating that the subject partially matched the pattern, you can restart the |
Line 793 For further information about partial matching, see th
|
Line 927 For further information about partial matching, see th
|
<a href="pcrepartial.html"><b>pcrepartial</b></a> |
<a href="pcrepartial.html"><b>pcrepartial</b></a> |
documentation. |
documentation. |
</P> |
</P> |
<br><a name="SEC11" href="#TOC1">CALLOUTS</a><br> | <br><a name="SEC12" href="#TOC1">CALLOUTS</a><br> |
<P> |
<P> |
If the pattern contains any callout requests, <b>pcretest</b>'s callout function |
If the pattern contains any callout requests, <b>pcretest</b>'s callout function |
is called during matching. This works with both matching functions. By default, |
is called during matching. This works with both matching functions. By default, |
Line 854 the
|
Line 988 the
|
<a href="pcrecallout.html"><b>pcrecallout</b></a> |
<a href="pcrecallout.html"><b>pcrecallout</b></a> |
documentation. |
documentation. |
</P> |
</P> |
<br><a name="SEC12" href="#TOC1">NON-PRINTING CHARACTERS</a><br> | <br><a name="SEC13" href="#TOC1">NON-PRINTING CHARACTERS</a><br> |
<P> |
<P> |
When <b>pcretest</b> is outputting text in the compiled version of a pattern, |
When <b>pcretest</b> is outputting text in the compiled version of a pattern, |
bytes other than 32-126 are always treated as non-printing characters and are |
bytes other than 32-126 are always treated as non-printing characters and are |
Line 866 string, it behaves in the same way, unless a different
|
Line 1000 string, it behaves in the same way, unless a different
|
the pattern (using the <b>/L</b> modifier). In this case, the <b>isprint()</b> |
the pattern (using the <b>/L</b> modifier). In this case, the <b>isprint()</b> |
function is used to distinguish printing and non-printing characters. |
function is used to distinguish printing and non-printing characters. |
</P> |
</P> |
<br><a name="SEC13" href="#TOC1">SAVING AND RELOADING COMPILED PATTERNS</a><br> | <br><a name="SEC14" href="#TOC1">SAVING AND RELOADING COMPILED PATTERNS</a><br> |
<P> |
<P> |
The facilities described in this section are not available when the POSIX |
The facilities described in this section are not available when the POSIX |
interface to PCRE is being used, that is, when the <b>/P</b> pattern modifier is |
interface to PCRE is being used, that is, when the <b>/P</b> pattern modifier is |
Line 939 string using a reloaded pattern is likely to cause <b>
|
Line 1073 string using a reloaded pattern is likely to cause <b>
|
Finally, if you attempt to load a file that is not in the correct format, the |
Finally, if you attempt to load a file that is not in the correct format, the |
result is undefined. |
result is undefined. |
</P> |
</P> |
<br><a name="SEC14" href="#TOC1">SEE ALSO</a><br> | <br><a name="SEC15" href="#TOC1">SEE ALSO</a><br> |
<P> |
<P> |
<b>pcre</b>(3), <b>pcre16</b>(3), <b>pcreapi</b>(3), <b>pcrecallout</b>(3), | <b>pcre</b>(3), <b>pcre16</b>(3), <b>pcre32</b>(3), <b>pcreapi</b>(3), |
| <b>pcrecallout</b>(3), |
<b>pcrejit</b>(3), <b>pcrematching</b>(3), <b>pcrepartial</b>(3), |
<b>pcrejit</b>(3), <b>pcrematching</b>(3), <b>pcrepartial</b>(3), |
<b>pcrepattern</b>(3), <b>pcreprecompile</b>(3). |
<b>pcrepattern</b>(3), <b>pcreprecompile</b>(3). |
</P> |
</P> |
<br><a name="SEC15" href="#TOC1">AUTHOR</a><br> | <br><a name="SEC16" href="#TOC1">AUTHOR</a><br> |
<P> |
<P> |
Philip Hazel |
Philip Hazel |
<br> |
<br> |
Line 954 University Computing Service
|
Line 1089 University Computing Service
|
Cambridge CB2 3QH, England. |
Cambridge CB2 3QH, England. |
<br> |
<br> |
</P> |
</P> |
<br><a name="SEC16" href="#TOC1">REVISION</a><br> | <br><a name="SEC17" href="#TOC1">REVISION</a><br> |
<P> |
<P> |
Last updated: 21 February 2012 | Last updated: 26 April 2013 |
<br> |
<br> |
Copyright © 1997-2012 University of Cambridge. | Copyright © 1997-2013 University of Cambridge. |
<br> |
<br> |
<p> |
<p> |
Return to the <a href="index.html">PCRE index page</a>. |
Return to the <a href="index.html">PCRE index page</a>. |