<!-- ELWIX CVS: embedaddon/pcre/doc/html/pcreprecompile.html, revision 1.1.1.4 (vendor branch), tags v8_34 and HEAD; committed Sun Jun 15 19:46:05 2014 UTC by misho; pcre 8.34 -->
<html>
<head>
<title>pcreprecompile specification</title>
</head>
<body bgcolor="#FFFFFF" text="#00005A" link="#0066FF" alink="#3399FF" vlink="#2222BB">
<h1>pcreprecompile man page</h1>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
<p>
This page is part of the PCRE HTML documentation. It was generated automatically
from the original man page. If there is any nonsense in it, please consult the
man page, in case the conversion went wrong.
<br>
<ul>
<li><a name="TOC1" href="#SEC1">SAVING AND RE-USING PRECOMPILED PCRE PATTERNS</a>
<li><a name="TOC2" href="#SEC2">SAVING A COMPILED PATTERN</a>
<li><a name="TOC3" href="#SEC3">RE-USING A PRECOMPILED PATTERN</a>
<li><a name="TOC4" href="#SEC4">COMPATIBILITY WITH DIFFERENT PCRE RELEASES</a>
<li><a name="TOC5" href="#SEC5">AUTHOR</a>
<li><a name="TOC6" href="#SEC6">REVISION</a>
</ul>
<br><a name="SEC1" href="#TOC1">SAVING AND RE-USING PRECOMPILED PCRE PATTERNS</a><br>
<P>
If you are running an application that uses a large number of regular
expression patterns, it may be useful to store them in a precompiled form
instead of having to compile them every time the application is run.
If you are not using any private character tables (see the
<a href="pcre_maketables.html"><b>pcre_maketables()</b></a>
documentation), this is relatively straightforward. If you are using private
tables, it is a little bit more complicated. However, if you are using the
just-in-time optimization feature, it is not possible to save and reload the
JIT data.
</P>
<P>
If you save compiled patterns to a file, you can copy them to a different host
and run them there. If the two hosts have different endianness (byte order),
you should run the <b>pcre[16|32]_pattern_to_host_byte_order()</b> function on the
new host before trying to match the pattern. The matching functions return
PCRE_ERROR_BADENDIANNESS if they detect a pattern with the wrong endianness.
</P>
<P>
Compiling regular expressions with one version of PCRE for use with a different
version is not guaranteed to work and may cause crashes. Saving and restoring
a compiled pattern also loses any JIT optimization data.
</P>
<br><a name="SEC2" href="#TOC1">SAVING A COMPILED PATTERN</a><br>
<P>
The value returned by <b>pcre[16|32]_compile()</b> points to a single block of
memory that holds the compiled pattern and associated data. You can find the
length of this block in bytes by calling <b>pcre[16|32]_fullinfo()</b> with an
argument of PCRE_INFO_SIZE. You can then save the data in any appropriate
manner. Here is sample code for the 8-bit library that compiles a pattern and
writes it to a file. It assumes that the variable <i>fd</i> is a <b>FILE *</b>
stream that is open for output:
<pre>
  int erroroffset, rc;
  size_t size, written;     /* PCRE_INFO_SIZE yields a size_t */
  const char *error;        /* pcre_compile() sets a const char * */
  pcre *re;

  re = pcre_compile("my pattern", 0, &amp;error, &amp;erroroffset, NULL);
  if (re == NULL) { ... handle errors ... }
  rc = pcre_fullinfo(re, NULL, PCRE_INFO_SIZE, &amp;size);
  if (rc &#60; 0) { ... handle errors ... }
  written = fwrite(re, 1, size, fd);
  if (written != size) { ... handle errors ... }
</pre>
In this example, the bytes that comprise the compiled pattern are copied
exactly. Note that this is binary data that may contain any of the 256 possible
byte values. On systems that make a distinction between binary and non-binary
data, be sure that the file is opened for binary output.
</P>
<P>
If you want to write more than one pattern to a file, you will have to devise a
way of separating them. For binary data, preceding each pattern with its length
is probably the most straightforward approach. Another possibility is to write
out the data in hexadecimal instead of binary, one pattern to a line.
</P>
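<P>
As an illustration of the length-prefix approach, here is a minimal sketch that
uses only the standard C library. The function names <b>write_record()</b> and
<b>read_record()</b>, and the choice of a 4-byte big-endian length, are
assumptions made for this example; they are not part of PCRE. The length is
written in a fixed byte order so that the framing itself can be read on any
host, although the pattern bytes inside each record are still in the byte order
of the host that compiled them.
</P>

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Write one record: a 4-byte big-endian length followed by the raw bytes.
   Returns 0 on success, -1 on failure. (Hypothetical helper, not PCRE API.) */
int write_record(FILE *f, const void *data, uint32_t len)
{
  unsigned char hdr[4];

  hdr[0] = (unsigned char)(len >> 24);
  hdr[1] = (unsigned char)(len >> 16);
  hdr[2] = (unsigned char)(len >> 8);
  hdr[3] = (unsigned char)len;

  if (fwrite(hdr, 1, sizeof hdr, f) != sizeof hdr) return -1;
  if (len > 0 && fwrite(data, 1, len, f) != len) return -1;
  return 0;
}

/* Read the next record into a malloc()ed buffer and store its length in
   *len_out. Returns NULL at end of file or on error. The caller must
   free() the returned buffer. (Hypothetical helper, not PCRE API.) */
void *read_record(FILE *f, uint32_t *len_out)
{
  unsigned char hdr[4];
  uint32_t len;
  void *buf;

  if (fread(hdr, 1, sizeof hdr, f) != sizeof hdr) return NULL;
  len = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
        ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];

  buf = malloc(len > 0 ? len : 1);   /* never request zero bytes */
  if (buf == NULL) return NULL;
  if (len > 0 && fread(buf, 1, len, f) != len) {
    free(buf);
    return NULL;
  }
  *len_out = len;
  return buf;
}
```

<P>
Each compiled pattern would be passed to <b>write_record()</b> with the size
obtained from PCRE_INFO_SIZE, and read back with repeated calls to
<b>read_record()</b> until it returns NULL.
</P>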
<P>
Saving compiled patterns in a file is only one possible way of storing them for
later use. They could equally well be saved in a database, or in the memory of
some daemon process that passes them via sockets to the processes that want
them.
</P>
<P>
If the pattern has been studied, it is also possible to save the normal study
data in a similar way to the compiled pattern itself. However, if the
PCRE_STUDY_JIT_COMPILE option was used, the just-in-time data that is created
cannot be saved because it is too dependent on the current environment. When
studying generates additional information, <b>pcre[16|32]_study()</b> returns a
pointer to a <b>pcre[16|32]_extra</b> data block. Its format is defined in the
<a href="pcreapi.html#extradata">section on matching a pattern</a>
in the
<a href="pcreapi.html"><b>pcreapi</b></a>
documentation. The <i>study_data</i> field points to the binary study data, and
this is what you must save (not the <b>pcre[16|32]_extra</b> block itself). The
length of the study data can be obtained by calling <b>pcre[16|32]_fullinfo()</b>
with an argument of PCRE_INFO_STUDYSIZE. Remember to check that
<b>pcre[16|32]_study()</b> did return a non-NULL value before trying to save the
study data.
</P>
<br><a name="SEC3" href="#TOC1">RE-USING A PRECOMPILED PATTERN</a><br>
<P>
Re-using a precompiled pattern is straightforward. Having reloaded it into main
memory, and called <b>pcre[16|32]_pattern_to_host_byte_order()</b> if necessary,
you pass its pointer to <b>pcre[16|32]_exec()</b> or
<b>pcre[16|32]_dfa_exec()</b> in the usual way.
</P>
<P>
However, if you passed a pointer to custom character tables when the pattern
was compiled (the <i>tableptr</i> argument of <b>pcre[16|32]_compile()</b>), you
must now pass a similar pointer to <b>pcre[16|32]_exec()</b> or
<b>pcre[16|32]_dfa_exec()</b>, because the pointer value saved with the compiled
pattern is meaningless after a reload. The <i>tables</i> field in a
<b>pcre[16|32]_extra</b> block, enabled by the PCRE_EXTRA_TABLES bit in the
<i>flags</i> field, is used to pass this data, as described in the
<a href="pcreapi.html#extradata">section on matching a pattern</a>
in the
<a href="pcreapi.html"><b>pcreapi</b></a>
documentation.
</P>
<P>
<b>Warning:</b> The tables that <b>pcre[16|32]_exec()</b> and
<b>pcre[16|32]_dfa_exec()</b> use must be the same as those that were used when
the pattern was compiled. If this is not the case, the behaviour is undefined.
</P>
<P>
If you did not provide custom character tables when the pattern was compiled,
the pointer in the compiled pattern is NULL, which causes the matching
functions to use PCRE's internal tables. Thus, you do not need to take any
special action at run time in this case.
</P>
<P>
If you saved study data with the compiled pattern, you need to create your own
<b>pcre[16|32]_extra</b> data block and set the <i>study_data</i> field to point
to the reloaded study data. You must also set the PCRE_EXTRA_STUDY_DATA bit in
the <i>flags</i> field to indicate that study data is present. Then pass the
<b>pcre[16|32]_extra</b> block to the matching function in the usual way. If the
pattern was studied for just-in-time optimization, that data cannot be saved,
and so is lost by a save/restore cycle.
</P>
<br><a name="SEC4" href="#TOC1">COMPATIBILITY WITH DIFFERENT PCRE RELEASES</a><br>
<P>
In general, it is safest to recompile all saved patterns when you update to a
new PCRE release, though not all updates actually require this.
</P>
<br><a name="SEC5" href="#TOC1">AUTHOR</a><br>
<P>
Philip Hazel
<br>
University Computing Service
<br>
Cambridge CB2 3QH, England.
<br>
</P>
<br><a name="SEC6" href="#TOC1">REVISION</a><br>
<P>
Last updated: 12 November 2013
<br>
Copyright &copy; 1997-2013 University of Cambridge.
<br>
<p>
Return to the <a href="index.html">PCRE index page</a>.
</p>
