Diff for /embedaddon/pcre/doc/html/pcrebuild.html between versions 1.1.1.1 and 1.1.1.3

version 1.1.1.1, 2012/02/21 23:05:52 version 1.1.1.3, 2012/10/09 09:19:18
Line 14  man page, in case the conversion went wrong. Line 14  man page, in case the conversion went wrong.
 <br>  <br>
 <ul>  <ul>
 <li><a name="TOC1" href="#SEC1">PCRE BUILD-TIME OPTIONS</a>  <li><a name="TOC1" href="#SEC1">PCRE BUILD-TIME OPTIONS</a>
<li><a name="TOC2" href="#SEC2">BUILDING SHARED AND STATIC LIBRARIES</a><li><a name="TOC2" href="#SEC2">BUILDING 8-BIT and 16-BIT LIBRARIES</a>
<li><a name="TOC3" href="#SEC3">C++ SUPPORT</a><li><a name="TOC3" href="#SEC3">BUILDING SHARED AND STATIC LIBRARIES</a>
<li><a name="TOC4" href="#SEC4">UTF-8 SUPPORT</a><li><a name="TOC4" href="#SEC4">C++ SUPPORT</a>
<li><a name="TOC5" href="#SEC5">UNICODE CHARACTER PROPERTY SUPPORT</a><li><a name="TOC5" href="#SEC5">UTF-8 and UTF-16 SUPPORT</a>
<li><a name="TOC6" href="#SEC6">JUST-IN-TIME COMPILER SUPPORT</a><li><a name="TOC6" href="#SEC6">UNICODE CHARACTER PROPERTY SUPPORT</a>
<li><a name="TOC7" href="#SEC7">CODE VALUE OF NEWLINE</a><li><a name="TOC7" href="#SEC7">JUST-IN-TIME COMPILER SUPPORT</a>
<li><a name="TOC8" href="#SEC8">WHAT \R MATCHES</a><li><a name="TOC8" href="#SEC8">CODE VALUE OF NEWLINE</a>
<li><a name="TOC9" href="#SEC9">POSIX MALLOC USAGE</a><li><a name="TOC9" href="#SEC9">WHAT \R MATCHES</a>
<li><a name="TOC10" href="#SEC10">HANDLING VERY LARGE PATTERNS</a><li><a name="TOC10" href="#SEC10">POSIX MALLOC USAGE</a>
<li><a name="TOC11" href="#SEC11">AVOIDING EXCESSIVE STACK USAGE</a><li><a name="TOC11" href="#SEC11">HANDLING VERY LARGE PATTERNS</a>
<li><a name="TOC12" href="#SEC12">LIMITING PCRE RESOURCE USAGE</a><li><a name="TOC12" href="#SEC12">AVOIDING EXCESSIVE STACK USAGE</a>
<li><a name="TOC13" href="#SEC13">CREATING CHARACTER TABLES AT BUILD TIME</a><li><a name="TOC13" href="#SEC13">LIMITING PCRE RESOURCE USAGE</a>
<li><a name="TOC14" href="#SEC14">USING EBCDIC CODE</a><li><a name="TOC14" href="#SEC14">CREATING CHARACTER TABLES AT BUILD TIME</a>
<li><a name="TOC15" href="#SEC15">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><li><a name="TOC15" href="#SEC15">USING EBCDIC CODE</a>
<li><a name="TOC16" href="#SEC16">PCREGREP BUFFER SIZE</a><li><a name="TOC16" href="#SEC16">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a>
<li><a name="TOC17" href="#SEC17">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a><li><a name="TOC17" href="#SEC17">PCREGREP BUFFER SIZE</a>
<li><a name="TOC18" href="#SEC18">SEE ALSO</a><li><a name="TOC18" href="#SEC18">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a>
<li><a name="TOC19" href="#SEC19">AUTHOR</a><li><a name="TOC19" href="#SEC19">SEE ALSO</a>
<li><a name="TOC20" href="#SEC20">REVISION</a><li><a name="TOC20" href="#SEC20">AUTHOR</a>
 <li><a name="TOC21" href="#SEC21">REVISION</a>
 </ul>  </ul>
 <br><a name="SEC1" href="#TOC1">PCRE BUILD-TIME OPTIONS</a><br>  <br><a name="SEC1" href="#TOC1">PCRE BUILD-TIME OPTIONS</a><br>
 <P>  <P>
Line 63  The following sections include descriptions of options Line 64  The following sections include descriptions of options
 --enable and --disable always come in pairs, so the complementary option always  --enable and --disable always come in pairs, so the complementary option always
 exists as well, but as it specifies the default, it is not described.  exists as well, but as it specifies the default, it is not described.
 </P>  </P>
<br><a name="SEC2" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br><br><a name="SEC2" href="#TOC1">BUILDING 8-BIT and 16-BIT LIBRARIES</a><br>
 <P>  <P>
   By default, a library called <b>libpcre</b> is built, containing functions that
   take string arguments contained in vectors of bytes, either as single-byte
   characters, or interpreted as UTF-8 strings. You can also build a separate
   library, called <b>libpcre16</b>, in which strings are contained in vectors of
   16-bit data units and interpreted either as single-unit characters or UTF-16
   strings, by adding
   <pre>
     --enable-pcre16
   </pre>
   to the <b>configure</b> command. If you do not want the 8-bit library, add
   <pre>
     --disable-pcre8
   </pre>
   as well. At least one of the two libraries must be built. Note that the C++ and
   POSIX wrappers are for the 8-bit library only, and that <b>pcregrep</b> is an
   8-bit program. None of these are built if you select only the 16-bit library.
   </P>
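<P>
The difference between the two libraries' data units can be illustrated with a
minimal sketch, which is not part of PCRE. It assumes standard UTF-8 and UTF-16
encoding rules; the function names <b>encode_utf8()</b> and
<b>encode_utf16()</b> are hypothetical. It shows how many 8-bit or 16-bit units
a single code point occupies.
</P>

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative helpers only; these names are not part of the PCRE API. */

/* Encode one code point as UTF-8. Returns the number of 8-bit units written. */
size_t encode_utf8(uint32_t cp, uint8_t out[4])
{
    /* Valid code points are at most 0x10FFFF and exclude surrogates. */
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x80) {
        out[0] = (uint8_t)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (uint8_t)(0xC0 | (cp >> 6));
        out[1] = (uint8_t)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        out[0] = (uint8_t)(0xE0 | (cp >> 12));
        out[1] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (uint8_t)(0x80 | (cp & 0x3F));
        return 3;
    }
    out[0] = (uint8_t)(0xF0 | (cp >> 18));
    out[1] = (uint8_t)(0x80 | ((cp >> 12) & 0x3F));
    out[2] = (uint8_t)(0x80 | ((cp >> 6) & 0x3F));
    out[3] = (uint8_t)(0x80 | (cp & 0x3F));
    return 4;
}

/* Encode one code point as UTF-16. Returns the number of 16-bit units written. */
size_t encode_utf16(uint32_t cp, uint16_t out[2])
{
    assert(cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF));
    if (cp < 0x10000) {
        out[0] = (uint16_t)cp;
        return 1;
    }
    cp -= 0x10000;                              /* 20 bits remain */
    out[0] = (uint16_t)(0xD800 | (cp >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xDC00 | (cp & 0x3FF)); /* low surrogate */
    return 2;
}
```

<P>
For example, U+20AC occupies three units in a byte vector but a single unit in
a 16-bit vector, while a code point above 0xFFFF needs four bytes or two 16-bit
units.
</P>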
   <br><a name="SEC3" href="#TOC1">BUILDING SHARED AND STATIC LIBRARIES</a><br>
   <P>
 The PCRE building process uses <b>libtool</b> to build both shared and static  The PCRE building process uses <b>libtool</b> to build both shared and static
 Unix libraries by default. You can suppress one of these by adding one of  Unix libraries by default. You can suppress one of these by adding one of
 <pre>  <pre>
Line 73  Unix libraries by default. You can suppress one of the Line 93  Unix libraries by default. You can suppress one of the
 </pre>  </pre>
 to the <b>configure</b> command, as required.  to the <b>configure</b> command, as required.
 </P>  </P>
<br><a name="SEC3" href="#TOC1">C++ SUPPORT</a><br><br><a name="SEC4" href="#TOC1">C++ SUPPORT</a><br>
 <P>  <P>
By default, the <b>configure</b> script will search for a C++ compiler and C++By default, if the 8-bit library is being built, the <b>configure</b> script
header files. If it finds them, it automatically builds the C++ wrapper librarywill search for a C++ compiler and C++ header files. If it finds them, it
for PCRE. You can disable this by addingautomatically builds the C++ wrapper library (which supports only 8-bit
 strings). You can disable this by adding
 <pre>  <pre>
   --disable-cpp    --disable-cpp
 </pre>  </pre>
 to the <b>configure</b> command.  to the <b>configure</b> command.
 </P>  </P>
<br><a name="SEC4" href="#TOC1">UTF-8 SUPPORT</a><br><br><a name="SEC5" href="#TOC1">UTF-8 and UTF-16 SUPPORT</a><br>
 <P>  <P>
To build PCRE with support for UTF-8 Unicode character strings, addTo build PCRE with support for UTF Unicode character strings, add
 <pre>  <pre>
  --enable-utf8  --enable-utf
 </pre>  </pre>
to the <b>configure</b> command. Of itself, this does not make PCRE treatto the <b>configure</b> command. This setting applies to both libraries, adding
strings as UTF-8. As well as compiling PCRE with this option, you also havesupport for UTF-8 to the 8-bit library and support for UTF-16 to the 16-bit
have to set the PCRE_UTF8 option when you call the <b>pcre_compile()</b>library. There are no separate options for enabling UTF-8 and UTF-16
or <b>pcre_compile2()</b> functions.independently because that would allow ridiculous settings such as requesting
 UTF-16 support while building only the 8-bit library. It is not possible to
 build one library with UTF support and the other without in the same
 configuration. (For backwards compatibility, --enable-utf8 is a synonym of
 --enable-utf.)
 </P>  </P>
 <P>  <P>
If you set --enable-utf8 when compiling in an EBCDIC environment, PCRE expectsOf itself, this setting does not make PCRE treat strings as UTF-8 or UTF-16. As
its input to be either ASCII or UTF-8 (depending on the runtime option). It iswell as compiling PCRE with this option, you also have have to set the
 PCRE_UTF8 or PCRE_UTF16 option when you call one of the pattern compiling
 functions.
 </P>
 <P>
 If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects
 its input to be either ASCII or UTF-8 (depending on the run-time option). It is
 not possible to support both EBCDIC and UTF-8 codes in the same version of the  not possible to support both EBCDIC and UTF-8 codes in the same version of the
library. Consequently, --enable-utf8 and --enable-ebcdic are mutuallylibrary. Consequently, --enable-utf and --enable-ebcdic are mutually
 exclusive.  exclusive.
 </P>  </P>
<br><a name="SEC5" href="#TOC1">UNICODE CHARACTER PROPERTY SUPPORT</a><br><br><a name="SEC6" href="#TOC1">UNICODE CHARACTER PROPERTY SUPPORT</a><br>
 <P>  <P>
UTF-8 support allows PCRE to process character values greater than 255 in theUTF support allows the libraries to process character codepoints up to 0x10ffff
strings that it handles. On its own, however, it does not provide anyin the strings that they handle. On its own, however, it does not provide any
 facilities for accessing the properties of such characters. If you want to be  facilities for accessing the properties of such characters. If you want to be
 able to use the pattern escapes \P, \p, and \X, which refer to Unicode  able to use the pattern escapes \P, \p, and \X, which refer to Unicode
 character properties, you must add  character properties, you must add
 <pre>  <pre>
   --enable-unicode-properties    --enable-unicode-properties
 </pre>  </pre>
to the <b>configure</b> command. This implies UTF-8 support, even if you haveto the <b>configure</b> command. This implies UTF support, even if you have
 not explicitly requested it.  not explicitly requested it.
 </P>  </P>
 <P>  <P>
Line 121  supported. Details are given in the Line 152  supported. Details are given in the
 <a href="pcrepattern.html"><b>pcrepattern</b></a>  <a href="pcrepattern.html"><b>pcrepattern</b></a>
 documentation.  documentation.
 </P>  </P>
<br><a name="SEC6" href="#TOC1">JUST-IN-TIME COMPILER SUPPORT</a><br><br><a name="SEC7" href="#TOC1">JUST-IN-TIME COMPILER SUPPORT</a><br>
 <P>  <P>
 Just-in-time compiler support is included in the build by specifying  Just-in-time compiler support is included in the build by specifying
 <pre>  <pre>
Line 138  pcregrep automatically makes use of it, unless you add Line 169  pcregrep automatically makes use of it, unless you add
 </pre>  </pre>
 to the "configure" command.  to the "configure" command.
 </P>  </P>
<br><a name="SEC7" href="#TOC1">CODE VALUE OF NEWLINE</a><br><br><a name="SEC8" href="#TOC1">CODE VALUE OF NEWLINE</a><br>
 <P>  <P>
 By default, PCRE interprets the linefeed (LF) character as indicating the end  By default, PCRE interprets the linefeed (LF) character as indicating the end
 of a line. This is the normal newline character on Unix-like systems. You can  of a line. This is the normal newline character on Unix-like systems. You can
Line 171  Whatever line ending convention is selected when PCRE  Line 202  Whatever line ending convention is selected when PCRE 
 overridden when the library functions are called. At build time it is  overridden when the library functions are called. At build time it is
 conventional to use the standard for your operating system.  conventional to use the standard for your operating system.
 </P>  </P>
<br><a name="SEC8" href="#TOC1">WHAT \R MATCHES</a><br><br><a name="SEC9" href="#TOC1">WHAT \R MATCHES</a><br>
 <P>  <P>
 By default, the sequence \R in a pattern matches any Unicode newline sequence,  By default, the sequence \R in a pattern matches any Unicode newline sequence,
 whatever has been selected as the line ending sequence. If you specify  whatever has been selected as the line ending sequence. If you specify
Line 182  the default is changed so that \R matches only CR, LF, Line 213  the default is changed so that \R matches only CR, LF,
 selected when PCRE is built can be overridden when the library functions are  selected when PCRE is built can be overridden when the library functions are
 called.  called.
 </P>  </P>
<br><a name="SEC9" href="#TOC1">POSIX MALLOC USAGE</a><br><br><a name="SEC10" href="#TOC1">POSIX MALLOC USAGE</a><br>
 <P>  <P>
When PCRE is called through the POSIX interface (see theWhen the 8-bit library is called through the POSIX interface (see the
 <a href="pcreposix.html"><b>pcreposix</b></a>  <a href="pcreposix.html"><b>pcreposix</b></a>
 documentation), additional working storage is required for holding the pointers  documentation), additional working storage is required for holding the pointers
 to capturing substrings, because PCRE requires three integers per substring,  to capturing substrings, because PCRE requires three integers per substring,
Line 198  such as Line 229  such as
 </pre>  </pre>
 to the <b>configure</b> command.  to the <b>configure</b> command.
 </P>  </P>
<br><a name="SEC10" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br><br><a name="SEC11" href="#TOC1">HANDLING VERY LARGE PATTERNS</a><br>
 <P>  <P>
 Within a compiled pattern, offset values are used to point from one part to  Within a compiled pattern, offset values are used to point from one part to
 another (for example, from an opening parenthesis to an alternation  another (for example, from an opening parenthesis to an alternation
 metacharacter). By default, two-byte values are used for these offsets, leading  metacharacter). By default, two-byte values are used for these offsets, leading
 to a maximum size for a compiled pattern of around 64K. This is sufficient to  to a maximum size for a compiled pattern of around 64K. This is sufficient to
 handle all but the most gigantic patterns. Nevertheless, some people do want to  handle all but the most gigantic patterns. Nevertheless, some people do want to
process truyl enormous patterns, so it is possible to compile PCRE to useprocess truly enormous patterns, so it is possible to compile PCRE to use
 three-byte or four-byte offsets by adding a setting such as  three-byte or four-byte offsets by adding a setting such as
 <pre>  <pre>
   --with-link-size=3    --with-link-size=3
 </pre>  </pre>
to the <b>configure</b> command. The value given must be 2, 3, or 4. Usingto the <b>configure</b> command. The value given must be 2, 3, or 4. For the
longer offsets slows down the operation of PCRE because it has to load16-bit library, a value of 3 is rounded up to 4. Using longer offsets slows
additional bytes when handling them.down the operation of PCRE because it has to load additional data when handling
 them.
 </P>  </P>
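<P>
The arithmetic behind the link size can be sketched as follows. This is an
illustration under stated assumptions, not PCRE source code: the helper names
<b>effective_link_size()</b> and <b>offset_range()</b> are hypothetical, and
the real limit on compiled pattern size may differ from the raw offset range
computed here.
</P>

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helpers only; these names are not part of the PCRE API. */

/* Link size that ends up in use for a requested --with-link-size value.
   unit_bits is 8 for the 8-bit library and 16 for the 16-bit library.
   As the text states, a value of 3 is rounded up to 4 for the 16-bit library. */
int effective_link_size(int requested, int unit_bits)
{
    assert(requested >= 2 && requested <= 4);
    assert(unit_bits == 8 || unit_bits == 16);
    if (unit_bits == 16 && requested == 3)
        return 4;
    return requested;
}

/* Number of distinct values that an offset of link_size bytes can hold. */
uint64_t offset_range(int link_size)
{
    assert(link_size >= 2 && link_size <= 4);
    return (uint64_t)1 << (8 * link_size);
}
```

<P>
With the default of 2, <b>offset_range(2)</b> is 65536, which corresponds to
the "around 64K" figure quoted above.
</P>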
<br><a name="SEC11" href="#TOC1">AVOIDING EXCESSIVE STACK USAGE</a><br><br><a name="SEC12" href="#TOC1">AVOIDING EXCESSIVE STACK USAGE</a><br>
 <P>  <P>
 When matching with the <b>pcre_exec()</b> function, PCRE implements backtracking  When matching with the <b>pcre_exec()</b> function, PCRE implements backtracking
 by making recursive calls to an internal function called <b>match()</b>. In  by making recursive calls to an internal function called <b>match()</b>. In
Line 245  perform better than <b>malloc()</b> and <b>free()</b>. Line 277  perform better than <b>malloc()</b> and <b>free()</b>.
 slowly when built in this way. This option affects only the <b>pcre_exec()</b>  slowly when built in this way. This option affects only the <b>pcre_exec()</b>
 function; it is not relevant for <b>pcre_dfa_exec()</b>.  function; it is not relevant for <b>pcre_dfa_exec()</b>.
 </P>  </P>
<br><a name="SEC12" href="#TOC1">LIMITING PCRE RESOURCE USAGE</a><br><br><a name="SEC13" href="#TOC1">LIMITING PCRE RESOURCE USAGE</a><br>
 <P>  <P>
 Internally, PCRE has a function called <b>match()</b>, which it calls repeatedly  Internally, PCRE has a function called <b>match()</b>, which it calls repeatedly
 (sometimes recursively) when matching a pattern with the <b>pcre_exec()</b>  (sometimes recursively) when matching a pattern with the <b>pcre_exec()</b>
Line 274  constraints. However, you can set a lower limit by add Line 306  constraints. However, you can set a lower limit by add
 </pre>  </pre>
 to the <b>configure</b> command. This value can also be overridden at run time.  to the <b>configure</b> command. This value can also be overridden at run time.
 </P>  </P>
<br><a name="SEC13" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br><br><a name="SEC14" href="#TOC1">CREATING CHARACTER TABLES AT BUILD TIME</a><br>
 <P>  <P>
 PCRE uses fixed tables for processing characters whose code values are less  PCRE uses fixed tables for processing characters whose code values are less
 than 256. By default, PCRE is built with a set of tables that are distributed  than 256. By default, PCRE is built with a set of tables that are distributed
Line 285  only. If you add Line 317  only. If you add
 </pre>  </pre>
 to the <b>configure</b> command, the distributed tables are no longer used.  to the <b>configure</b> command, the distributed tables are no longer used.
 Instead, a program called <b>dftables</b> is compiled and run. This outputs the  Instead, a program called <b>dftables</b> is compiled and run. This outputs the
source for new set of tables, created in the default locale of your C runtimesource for new set of tables, created in the default locale of your C run-time
 system. (This method of replacing the tables does not work if you are cross  system. (This method of replacing the tables does not work if you are cross
 compiling, because <b>dftables</b> is run on the local host. If you need to  compiling, because <b>dftables</b> is run on the local host. If you need to
 create alternative tables when cross compiling, you will have to do so "by  create alternative tables when cross compiling, you will have to do so "by
 hand".)  hand".)
 </P>  </P>
<br><a name="SEC14" href="#TOC1">USING EBCDIC CODE</a><br><br><a name="SEC15" href="#TOC1">USING EBCDIC CODE</a><br>
 <P>  <P>
 PCRE assumes by default that it will run in an environment where the character  PCRE assumes by default that it will run in an environment where the character
 code is ASCII (or Unicode, which is a superset of ASCII). This is the case for  code is ASCII (or Unicode, which is a superset of ASCII). This is the case for
Line 303  EBCDIC environment by adding Line 335  EBCDIC environment by adding
 to the <b>configure</b> command. This setting implies  to the <b>configure</b> command. This setting implies
 --enable-rebuild-chartables. You should only use it if you know that you are in  --enable-rebuild-chartables. You should only use it if you know that you are in
 an EBCDIC environment (for example, an IBM mainframe operating system). The  an EBCDIC environment (for example, an IBM mainframe operating system). The
--enable-ebcdic option is incompatible with --enable-utf8.--enable-ebcdic option is incompatible with --enable-utf.
 </P>  </P>
<br><a name="SEC15" href="#TOC1">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br><br><a name="SEC16" href="#TOC1">PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT</a><br>
 <P>  <P>
 By default, <b>pcregrep</b> reads all files as plain text. You can build it so  By default, <b>pcregrep</b> reads all files as plain text. You can build it so
 that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads  that it recognizes files whose names end in <b>.gz</b> or <b>.bz2</b>, and reads
Line 318  to the <b>configure</b> command. These options natural Line 350  to the <b>configure</b> command. These options natural
 relevant libraries are installed on your system. Configuration will fail if  relevant libraries are installed on your system. Configuration will fail if
 they are not.  they are not.
 </P>  </P>
<br><a name="SEC16" href="#TOC1">PCREGREP BUFFER SIZE</a><br><br><a name="SEC17" href="#TOC1">PCREGREP BUFFER SIZE</a><br>
 <P>  <P>
 <b>pcregrep</b> uses an internal buffer to hold a "window" on the file it is  <b>pcregrep</b> uses an internal buffer to hold a "window" on the file it is
 scanning, in order to be able to output "before" and "after" lines when it  scanning, in order to be able to output "before" and "after" lines when it
Line 333  parameter value by adding, for example, Line 365  parameter value by adding, for example,
 to the <b>configure</b> command. The caller of \fPpcregrep\fP can, however,  to the <b>configure</b> command. The caller of \fPpcregrep\fP can, however,
 override this value by specifying a run-time option.  override this value by specifying a run-time option.
 </P>  </P>
<br><a name="SEC17" href="#TOC1">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a><br><br><a name="SEC18" href="#TOC1">PCRETEST OPTION FOR LIBREADLINE SUPPORT</a><br>
 <P>  <P>
 If you add  If you add
 <pre>  <pre>
Line 364  automatically included, you may need to add something  Line 396  automatically included, you may need to add something 
 </pre>  </pre>
 immediately before the <b>configure</b> command.  immediately before the <b>configure</b> command.
 </P>  </P>
<br><a name="SEC18" href="#TOC1">SEE ALSO</a><br><br><a name="SEC19" href="#TOC1">SEE ALSO</a><br>
 <P>  <P>
<b>pcreapi</b>(3), <b>pcre_config</b>(3).<b>pcreapi</b>(3), <b>pcre16</b>, <b>pcre_config</b>(3).
 </P>  </P>
<br><a name="SEC19" href="#TOC1">AUTHOR</a><br><br><a name="SEC20" href="#TOC1">AUTHOR</a><br>
 <P>  <P>
 Philip Hazel  Philip Hazel
 <br>  <br>
Line 377  University Computing Service Line 409  University Computing Service
 Cambridge CB2 3QH, England.  Cambridge CB2 3QH, England.
 <br>  <br>
 </P>  </P>
<br><a name="SEC20" href="#TOC1">REVISION</a><br><br><a name="SEC21" href="#TOC1">REVISION</a><br>
 <P>  <P>
Last updated: 06 September 2011Last updated: 07 January 2012
 <br>  <br>
Copyright &copy; 1997-2011 University of Cambridge.Copyright &copy; 1997-2012 University of Cambridge.
 <br>  <br>
 <p>  <p>
 Return to the <a href="index.html">PCRE index page</a>.  Return to the <a href="index.html">PCRE index page</a>.
