Annotation of embedaddon/pcre/doc/html/pcrecallout.html, revision 1.1.1.3

1.1       misho       1: <html>
                      2: <head>
                      3: <title>pcrecallout specification</title>
                      4: </head>
                      5: <body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
                      6: <h1>pcrecallout man page</h1>
                      7: <p>
                      8: Return to the <a href="index.html">PCRE index page</a>.
                      9: </p>
                     10: <p>
                     11: This page is part of the PCRE HTML documentation. It was generated automatically
                     12: from the original man page. If there is any nonsense in it, please consult the
                     13: man page, in case the conversion went wrong.
                     14: <br>
                     15: <ul>
1.1.1.3 ! misho      16: <li><a name="TOC1" href="#SEC1">SYNOPSIS</a>
        !            17: <li><a name="TOC2" href="#SEC2">DESCRIPTION</a>
        !            18: <li><a name="TOC3" href="#SEC3">MISSING CALLOUTS</a>
        !            19: <li><a name="TOC4" href="#SEC4">THE CALLOUT INTERFACE</a>
        !            20: <li><a name="TOC5" href="#SEC5">RETURN VALUES</a>
        !            21: <li><a name="TOC6" href="#SEC6">AUTHOR</a>
        !            22: <li><a name="TOC7" href="#SEC7">REVISION</a>
1.1       misho      23: </ul>
1.1.1.3 ! misho      24: <br><a name="SEC1" href="#TOC1">SYNOPSIS</a><br>
        !            25: <P>
        !            26: <b>#include &#60;pcre.h&#62;</b>
        !            27: </P>
1.1       misho      28: <P>
                     29: <b>int (*pcre_callout)(pcre_callout_block *);</b>
                     30: </P>
                     31: <P>
1.1.1.2   misho      32: <b>int (*pcre16_callout)(pcre16_callout_block *);</b>
                     33: </P>
                     34: <P>
1.1.1.3 ! misho      35: <b>int (*pcre32_callout)(pcre32_callout_block *);</b>
        !            36: </P>
        !            37: <br><a name="SEC2" href="#TOC1">DESCRIPTION</a><br>
        !            38: <P>
1.1       misho      39: PCRE provides a feature called "callout", which is a means of temporarily
                     40: passing control to the caller of PCRE in the middle of pattern matching. The
                     41: caller of PCRE provides an external function by putting its entry point in the
1.1.1.2   misho      42: global variable <i>pcre_callout</i> (<i>pcre16_callout</i> for the 16-bit
1.1.1.3 ! misho      43: library, <i>pcre32_callout</i> for the 32-bit library). By default, this
        !            44: variable contains NULL, which disables all calling out.
1.1       misho      45: </P>
                     46: <P>
                     47: Within a regular expression, (?C) indicates the points at which the external
                     48: function is to be called. Different callout points can be identified by putting
                     49: a number less than 256 after the letter C. The default value is zero.
                     50: For example, this pattern has two callout points:
                     51: <pre>
                     52:   (?C1)abc(?C2)def
                     53: </pre>
1.1.1.2   misho      54: If the PCRE_AUTO_CALLOUT option bit is set when a pattern is compiled, PCRE
                     55: automatically inserts callouts, all with number 255, before each item in the
                     56: pattern. For example, if PCRE_AUTO_CALLOUT is used with the pattern
1.1       misho      57: <pre>
                     58:   A(\d{2}|--)
                     59: </pre>
                     60: it is processed as if it were
                     61: <br>
                     62: <br>
                     63: (?C255)A(?C255)((?C255)\d{2}(?C255)|(?C255)-(?C255)-(?C255))(?C255)
                     64: <br>
                     65: <br>
                     66: Notice that there is a callout before and after each parenthesis and
1.1.1.3 ! misho      67: alternation bar. If the pattern contains a conditional group whose condition is
        !            68: an assertion, an automatic callout is inserted immediately before the
        !            69: condition. Such a callout may also be inserted explicitly, for example:
        !            70: <pre>
        !            71:   (?(?C9)(?=a)ab|de)
        !            72: </pre>
        !            73: This applies only to assertion conditions (because they are themselves
        !            74: independent groups).
        !            75: </P>
        !            76: <P>
        !            77: Automatic callouts can be used for tracking the progress of pattern matching.
        !            78: The
1.1       misho      79: <a href="pcretest.html"><b>pcretest</b></a>
                     80: command has an option that sets automatic callouts; when it is used, the output
                     81: indicates how the pattern is matched. This is useful information when you are
                     82: trying to optimize the performance of a particular pattern.
                     83: </P>
1.1.1.3 ! misho      84: <br><a name="SEC3" href="#TOC1">MISSING CALLOUTS</a><br>
1.1       misho      85: <P>
                     86: You should be aware that, because of optimizations in the way PCRE matches
                     87: patterns by default, callouts sometimes do not happen. For example, if the
                     88: pattern is
                     89: <pre>
                     90:   ab(?C4)cd
                     91: </pre>
                     92: PCRE knows that any matching string must contain the letter "d". If the subject
                     93: string is "abyz", the lack of "d" means that matching doesn't ever start, and
                     94: the callout is never reached. However, with "abyd", though the result is still
                     95: no match, the callout is obeyed.
                     96: </P>
                     97: <P>
                     98: If the pattern is studied, PCRE knows the minimum length of a matching string,
                     99: and immediately returns "no match" without actually running a match if the
                    100: subject is too short or, for unanchored patterns, once the scan has advanced
                    101: so far that too little of the subject remains.
                    102: </P>
                    103: <P>
                    104: You can disable these optimizations by passing the PCRE_NO_START_OPTIMIZE
1.1.1.2   misho     105: option to the matching function, or by starting the pattern with
                    106: (*NO_START_OPT). This slows down the matching process, but does ensure that
                    107: callouts such as the example above are obeyed.
1.1       misho     108: </P>
1.1.1.3 ! misho     109: <br><a name="SEC4" href="#TOC1">THE CALLOUT INTERFACE</a><br>
1.1       misho     110: <P>
                    111: During matching, when PCRE reaches a callout point, the external function
1.1.1.3 ! misho     112: defined by <i>pcre_callout</i> or <i>pcre[16|32]_callout</i> is called
        !           113: (if it is set). This applies to both normal and DFA matching. The only
        !           114: argument to the callout function is a pointer to a <b>pcre_callout</b>
        !           115: or <b>pcre[16|32]_callout</b> block.
1.1.1.2   misho     116: These structures contain the following fields:
1.1       misho     117: <pre>
1.1.1.2   misho     118:   int           <i>version</i>;
                    119:   int           <i>callout_number</i>;
                    120:   int          *<i>offset_vector</i>;
                    121:   const char   *<i>subject</i>;           (8-bit version)
                    122:   PCRE_SPTR16   <i>subject</i>;           (16-bit version)
1.1.1.3 ! misho     123:   PCRE_SPTR32   <i>subject</i>;           (32-bit version)
1.1.1.2   misho     124:   int           <i>subject_length</i>;
                    125:   int           <i>start_match</i>;
                    126:   int           <i>current_position</i>;
                    127:   int           <i>capture_top</i>;
                    128:   int           <i>capture_last</i>;
                    129:   void         *<i>callout_data</i>;
                    130:   int           <i>pattern_position</i>;
                    131:   int           <i>next_item_length</i>;
                    132:   const unsigned char *<i>mark</i>;       (8-bit version)
                    133:   const PCRE_UCHAR16  *<i>mark</i>;       (16-bit version)
1.1.1.3 ! misho     134:   const PCRE_UCHAR32  *<i>mark</i>;       (32-bit version)
1.1       misho     135: </pre>
                    136: The <i>version</i> field is an integer containing the version number of the
                    137: block format. The initial version was 0; the current version is 2. The version
                    138: number will change again in future if additional fields are added, but the
                    139: intention is never to remove any of the existing fields.
                    140: </P>
                    141: <P>
                    142: The <i>callout_number</i> field contains the number of the callout, as compiled
                    143: into the pattern (that is, the number after ?C for manual callouts, and 255 for
                    144: automatically generated callouts).
                    145: </P>
                    146: <P>
                    147: The <i>offset_vector</i> field is a pointer to the vector of offsets that was
1.1.1.2   misho     148: passed by the caller to the matching function. When <b>pcre_exec()</b> or
1.1.1.3 ! misho     149: <b>pcre[16|32]_exec()</b> is used, the contents can be inspected, in order to
        !           150: extract substrings that have been matched so far, in the same way as for
        !           151: extracting substrings after a match has completed. For the DFA matching
        !           152: functions, this field is not useful.
1.1       misho     153: </P>
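<P>
As an illustration of inspecting <i>offset_vector</i> during a callout, the
following is a minimal sketch, not part of PCRE. It assumes the usual pairing
of start and end offsets per group, as used after a completed match. The type
<i>demo_callout_block</i> and the function <i>demo_copy_group</i> are
hypothetical names; the structure is a stand-in holding a subset of the fields
listed above so that the sketch compiles without the library. Real code would
include pcre.h and receive a <b>pcre_callout_block</b>. The sketch deliberately
handles only groups numbered 1 and above.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with a subset of the callout block fields (8-bit
   version), so the sketch compiles without PCRE. Real code would include
   <pcre.h> and use pcre_callout_block instead. */
typedef struct {
    int         callout_number;
    int        *offset_vector;
    const char *subject;
    int         subject_length;
    int         capture_top;
} demo_callout_block;

/* Copy the text captured so far by group `group` into buf, NUL-terminated.
   Returns the number of bytes copied, or -1 if the group has not been
   captured, is unset, or does not fit into buf. */
static int demo_copy_group(const demo_callout_block *cb, int group,
                           char *buf, size_t bufsize)
{
    assert(cb != NULL && buf != NULL);

    /* capture_top is one more than the highest captured group number. */
    if (group < 1 || group >= cb->capture_top)
        return -1;

    int start = cb->offset_vector[2 * group];
    int end   = cb->offset_vector[2 * group + 1];
    if (start < 0 || end < start || end > cb->subject_length)
        return -1;                      /* unset or inconsistent entry */

    size_t len = (size_t)(end - start);
    if (len + 1 > bufsize)
        return -1;                      /* no room for text plus NUL */

    memcpy(buf, cb->subject + start, len);
    buf[len] = '\0';
    return (int)len;
}
```

<P>
A callout function could call such a helper with the block it receives and a
small stack buffer, for example to log what group 1 holds at each callout point.
</P>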
                    154: <P>
                    155: The <i>subject</i> and <i>subject_length</i> fields contain copies of the values
1.1.1.2   misho     156: that were passed to the matching function.
1.1       misho     157: </P>
                    158: <P>
                    159: The <i>start_match</i> field normally contains the offset within the subject at
                    160: which the current match attempt started. However, if the escape sequence \K
                    161: has been encountered, this value is changed to reflect the modified starting
                    162: point. If the pattern is not anchored, the callout function may be called
                    163: several times from the same point in the pattern for different starting points
                    164: in the subject.
                    165: </P>
                    166: <P>
                    167: The <i>current_position</i> field contains the offset within the subject of the
                    168: current match pointer.
                    169: </P>
                    170: <P>
1.1.1.3 ! misho     171: When <b>pcre_exec()</b> or <b>pcre[16|32]_exec()</b> is used, the
1.1.1.2   misho     172: <i>capture_top</i> field contains one more than the number of the highest
                    173: numbered captured substring so far. If no substrings have been captured, the
                    174: value of <i>capture_top</i> is one. This is always the case when the DFA
                    175: functions are used, because they do not support captured substrings.
1.1       misho     176: </P>
                    177: <P>
                    178: The <i>capture_last</i> field contains the number of the most recently captured
1.1.1.3 ! misho     179: substring. However, when a recursion exits, the value reverts to what it was
        !           180: outside the recursion, as do the values of all captured substrings. If no
        !           181: substrings have been captured, the value of <i>capture_last</i> is -1. This is
        !           182: always the case for the DFA matching functions.
1.1       misho     183: </P>
                    184: <P>
1.1.1.2   misho     185: The <i>callout_data</i> field contains a value that is passed to a matching
                    186: function specifically so that it can be passed back in callouts. It is passed
1.1.1.3 ! misho     187: in the <i>callout_data</i> field of a <b>pcre_extra</b> or <b>pcre[16|32]_extra</b>
1.1.1.2   misho     188: data structure. If no such data was passed, the value of <i>callout_data</i> in
                    189: a callout block is NULL. There is a description of the <b>pcre_extra</b>
                    190: structure in the
1.1       misho     191: <a href="pcreapi.html"><b>pcreapi</b></a>
                    192: documentation.
                    193: </P>
                    194: <P>
1.1.1.2   misho     195: The <i>pattern_position</i> field is present from version 1 of the callout
                    196: structure. It contains the offset to the next item to be matched in the pattern
                    197: string.
1.1       misho     198: </P>
                    199: <P>
1.1.1.2   misho     200: The <i>next_item_length</i> field is present from version 1 of the callout
                    201: structure. It contains the length of the next item to be matched in the pattern
                    202: string. When the callout immediately precedes an alternation bar, a closing
                    203: parenthesis, or the end of the pattern, the length is zero. When the callout
                    204: precedes an opening parenthesis, the length is that of the entire subpattern.
1.1       misho     205: </P>
                    206: <P>
                    207: The <i>pattern_position</i> and <i>next_item_length</i> fields are intended to
                    208: help in distinguishing between different automatic callouts, which all have the
                    209: same callout number. However, they are set for all callouts.
                    210: </P>
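<P>
One way to tell automatic callouts apart is to recover the text of the next
pattern item from these two fields. The sketch below is an assumption-laden
illustration: the callout block does not carry the pattern string itself, so
the caller is assumed to supply it (for example through <i>callout_data</i>).
The names <i>demo_callout_block</i> and <i>demo_next_item</i> are hypothetical,
and the structure is a stand-in for the real <b>pcre_callout_block</b>.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with only the fields this sketch needs. Real code
   would include <pcre.h> and use pcre_callout_block instead. */
typedef struct {
    int callout_number;
    int pattern_position;
    int next_item_length;
} demo_callout_block;

/* Copy the next pattern item into buf, NUL-terminated. `pattern` must be the
   same string that was compiled; the caller has to supply it. Returns the
   item length, or -1 if the fields are out of range or buf is too small. */
static int demo_next_item(const demo_callout_block *cb, const char *pattern,
                          char *buf, size_t bufsize)
{
    assert(cb != NULL && pattern != NULL && buf != NULL);

    if (cb->pattern_position < 0 || cb->next_item_length < 0)
        return -1;

    size_t plen = strlen(pattern);
    size_t pos  = (size_t)cb->pattern_position;
    size_t len  = (size_t)cb->next_item_length;
    if (pos > plen || len > plen - pos || len + 1 > bufsize)
        return -1;

    memcpy(buf, pattern + pos, len);
    buf[len] = '\0';
    return (int)len;    /* 0 before '|', ')' or the end of the pattern */
}
```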
                    211: <P>
1.1.1.2   misho     212: The <i>mark</i> field is present from version 2 of the callout structure. In
1.1.1.3 ! misho     213: callouts from <b>pcre_exec()</b> or <b>pcre[16|32]_exec()</b> it contains a
        !           214: pointer to the zero-terminated name of the most recently passed (*MARK),
        !           215: (*PRUNE), or (*THEN) item in the match, or NULL if no such items have been
        !           216: passed. Instances of (*PRUNE) or (*THEN) without a name do not obliterate a
        !           217: previous (*MARK). In callouts from the DFA matching functions this field always
        !           218: contains NULL.
1.1       misho     219: </P>
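<P>
Because <i>mark</i> may be NULL, a callout function should check it before
comparing names. A minimal sketch for the 8-bit library follows;
<i>demo_callout_block</i> and <i>demo_mark_is</i> are hypothetical names, and
the structure is a stand-in for the real <b>pcre_callout_block</b>.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in with only the fields this sketch needs (8-bit
   version). Real code would include <pcre.h> and use pcre_callout_block. */
typedef struct {
    int                  callout_number;
    const unsigned char *mark;   /* NULL if no named item has been passed */
} demo_callout_block;

/* Return 1 if the most recently passed named (*MARK), (*PRUNE) or (*THEN)
   item has the given name, otherwise 0. */
static int demo_mark_is(const demo_callout_block *cb, const char *name)
{
    assert(cb != NULL && name != NULL);
    if (cb->mark == NULL)
        return 0;
    return strcmp((const char *)cb->mark, name) == 0;
}
```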
1.1.1.3 ! misho     220: <br><a name="SEC5" href="#TOC1">RETURN VALUES</a><br>
1.1       misho     221: <P>
                    222: The external callout function returns an integer to PCRE. If the value is zero,
                    223: matching proceeds as normal. If the value is greater than zero, matching fails
                    224: at the current point, but the testing of other matching possibilities goes
                    225: ahead, just as if a lookahead assertion had failed. If the value is less than
1.1.1.2   misho     226: zero, the match is abandoned and the matching function returns that negative value.
1.1       misho     227: </P>
                    228: <P>
                    229: Negative values should normally be chosen from the set of PCRE_ERROR_xxx
                    230: values. In particular, PCRE_ERROR_NOMATCH forces a standard "no match" failure.
                    231: The error number PCRE_ERROR_CALLOUT is reserved for use by callout functions;
                    232: it will never be used by PCRE itself.
                    233: </P>
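<P>
The three kinds of return value can be sketched with a callout function that
enforces a simple budget passed in through <i>callout_data</i>. Everything
named <i>demo_</i> here is hypothetical: the block structure is a stand-in for
<b>pcre_callout_block</b>, the budget structure is invented for illustration,
and DEMO_ERROR_CALLOUT is a placeholder for PCRE_ERROR_CALLOUT, which real
code should take from pcre.h.
</P>

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder value; real code should use PCRE_ERROR_CALLOUT from <pcre.h>. */
#define DEMO_ERROR_CALLOUT (-9)

/* Hypothetical stand-in with only the fields this sketch needs. */
typedef struct {
    int   callout_number;
    int   current_position;
    void *callout_data;
} demo_callout_block;

/* Invented for this sketch: limits supplied by the caller via callout_data. */
typedef struct {
    int calls;         /* callouts seen so far */
    int max_calls;     /* abandon the whole match beyond this many */
    int max_position;  /* reject match paths beyond this subject offset */
} demo_budget;

static int demo_callout(demo_callout_block *cb)
{
    assert(cb != NULL);
    demo_budget *b = (demo_budget *)cb->callout_data;

    if (b == NULL)
        return 0;                     /* no data passed: proceed as normal */

    if (++b->calls > b->max_calls)
        return DEMO_ERROR_CALLOUT;    /* negative: abandon the match */

    if (cb->current_position > b->max_position)
        return 1;                     /* positive: fail here, try alternatives */

    return 0;                         /* zero: matching proceeds as normal */
}
```

<P>
With the real library, a function of this shape (taking a
<b>pcre_callout_block</b> pointer) would be assigned to <i>pcre_callout</i>,
and the budget would be passed in the <i>callout_data</i> field of a
<b>pcre_extra</b> block.
</P>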
1.1.1.3 ! misho     234: <br><a name="SEC6" href="#TOC1">AUTHOR</a><br>
1.1       misho     235: <P>
                    236: Philip Hazel
                    237: <br>
                    238: University Computing Service
                    239: <br>
                    240: Cambridge CB2 3QH, England.
                    241: <br>
                    242: </P>
1.1.1.3 ! misho     243: <br><a name="SEC7" href="#TOC1">REVISION</a><br>
1.1       misho     244: <P>
1.1.1.3 ! misho     245: Last updated: 03 March 2013
1.1       misho     246: <br>
1.1.1.3 ! misho     247: Copyright &copy; 1997-2013 University of Cambridge.
1.1       misho     248: <br>
                    249: <p>
                    250: Return to the <a href="index.html">PCRE index page</a>.
                    251: </p>
