version 1.1.1.2, 2012/02/21 23:50:25
|
version 1.1.1.4, 2013/07/22 08:25:56
|
Line 1
|
Line 1
|
.TH PCREBUILD 3 | .TH PCREBUILD 3 "12 May 2013" "PCRE 8.33" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
. |
. |
. |
. |
|
.SH "BUILDING PCRE" |
|
.rs |
|
.sp |
|
PCRE is distributed with a \fBconfigure\fP script that can be used to build the |
|
library in Unix-like environments using the applications known as Autotools. |
|
Also in the distribution are files to support building using \fBCMake\fP |
|
instead of \fBconfigure\fP. The text file |
|
.\" HTML <a href="README.txt"> |
|
.\" </a> |
|
\fBREADME\fP |
|
.\" |
|
contains general information about building with Autotools (some of which is |
|
repeated below), and also has some comments about building on various operating |
|
systems. There is a lot more information about building PCRE without using |
|
Autotools (including information about using \fBCMake\fP and building "by |
|
hand") in the text file called |
|
.\" HTML <a href="NON-AUTOTOOLS-BUILD.txt"> |
|
.\" </a> |
|
\fBNON-AUTOTOOLS-BUILD\fP. |
|
.\" |
|
You should consult this file as well as the |
|
.\" HTML <a href="README.txt"> |
|
.\" </a> |
|
\fBREADME\fP |
|
.\" |
|
file if you are building in a non-Unix-like environment. |
|
. |
|
. |
.SH "PCRE BUILD-TIME OPTIONS" |
.SH "PCRE BUILD-TIME OPTIONS" |
.rs |
.rs |
.sp |
.sp |
This document describes the optional features of PCRE that can be selected when | The rest of this document describes the optional features of PCRE that can be |
the library is compiled. It assumes use of the \fBconfigure\fP script, where | selected when the library is compiled. It assumes use of the \fBconfigure\fP |
the optional features are selected or deselected by providing options to | script, where the optional features are selected or deselected by providing |
\fBconfigure\fP before running the \fBmake\fP command. However, the same | options to \fBconfigure\fP before running the \fBmake\fP command. However, the |
options can be selected in both Unix-like and non-Unix-like environments using | same options can be selected in both Unix-like and non-Unix-like environments |
the GUI facility of \fBcmake-gui\fP if you are using \fBCMake\fP instead of | using the GUI facility of \fBcmake-gui\fP if you are using \fBCMake\fP instead |
\fBconfigure\fP to build PCRE. | of \fBconfigure\fP to build PCRE. |
.P |
.P |
There is a lot more information about building PCRE in non-Unix-like | If you are not using Autotools or \fBCMake\fP, option selection can be done by |
environments in the file called \fINON_UNIX_USE\fP, which is part of the PCRE | editing the \fBconfig.h\fP file, or by passing parameter settings to the |
distribution. You should consult this file as well as the \fIREADME\fP file if | compiler, as described in |
you are building in a non-Unix-like environment. | .\" HTML <a href="NON-AUTOTOOLS-BUILD.txt"> |
| .\" </a> |
| \fBNON-AUTOTOOLS-BUILD\fP. |
| .\" |
.P |
.P |
The complete list of options for \fBconfigure\fP (which includes the standard |
The complete list of options for \fBconfigure\fP (which includes the standard |
ones such as the selection of the installation directory) can be obtained by |
ones such as the selection of the installation directory) can be obtained by |
Line 32 The following sections include descriptions of options
|
Line 63 The following sections include descriptions of options
|
exists as well, but as it specifies the default, it is not described. |
exists as well, but as it specifies the default, it is not described. |
. |
. |
. |
. |
.SH "BUILDING 8-BIT and 16-BIT LIBRARIES" | .SH "BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES" |
.rs |
.rs |
.sp |
.sp |
By default, a library called \fBlibpcre\fP is built, containing functions that |
By default, a library called \fBlibpcre\fP is built, containing functions that |
Line 44 strings, by adding
|
Line 75 strings, by adding
|
.sp |
.sp |
--enable-pcre16 |
--enable-pcre16 |
.sp |
.sp |
|
to the \fBconfigure\fP command. You can also build yet another separate |
|
library, called \fBlibpcre32\fP, in which strings are contained in vectors of |
|
32-bit data units and interpreted either as single-unit characters or UTF-32 |
|
strings, by adding |
|
.sp |
|
--enable-pcre32 |
|
.sp |
to the \fBconfigure\fP command. If you do not want the 8-bit library, add |
to the \fBconfigure\fP command. If you do not want the 8-bit library, add |
.sp |
.sp |
--disable-pcre8 |
--disable-pcre8 |
.sp |
.sp |
as well. At least one of the two libraries must be built. Note that the C++ and | as well. At least one of the three libraries must be built. Note that the C++ |
POSIX wrappers are for the 8-bit library only, and that \fBpcregrep\fP is an | and POSIX wrappers are for the 8-bit library only, and that \fBpcregrep\fP is |
8-bit program. None of these are built if you select only the 16-bit library. | an 8-bit program. None of these are built if you select only the 16-bit or |
| 32-bit libraries. |
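The library-selection flags above can be combined in one \fBconfigure\fP invocation. The following is a minimal sketch, assuming a POSIX shell; the helper name pcre_lib_args is hypothetical and not part of PCRE. It only prints the command line, applying the two rules stated above: the 8-bit library is built unless disabled, and at least one library must be selected.

```shell
# pcre_lib_args is a hypothetical helper (not part of PCRE).
# Usage: pcre_lib_args 8 16 32   (any non-empty subset of widths)
pcre_lib_args() {
  # At least one of the three libraries must be built.
  [ $# -gt 0 ] || { echo "select at least one library" >&2; return 1; }
  want8=no
  args=""
  for width in "$@"; do
    case "$width" in
      8)  want8=yes ;;
      16) args="$args --enable-pcre16" ;;
      32) args="$args --enable-pcre32" ;;
      *)  echo "unknown width: $width" >&2; return 1 ;;
    esac
  done
  # The 8-bit library is the default, so a flag is needed only to drop it.
  [ "$want8" = yes ] || args="$args --disable-pcre8"
  echo "./configure${args}"
}

# Print the command for a build with only the 16-bit and 32-bit libraries.
pcre_lib_args 16 32
```

With this selection the C++ and POSIX wrappers and \fBpcregrep\fP are not built, as noted above.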
. |
. |
. |
. |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.rs |
.rs |
.sp |
.sp |
The PCRE building process uses \fBlibtool\fP to build both shared and static | The Autotools PCRE building process uses \fBlibtool\fP to build both shared and |
Unix libraries by default. You can suppress one of these by adding one of | static libraries by default. You can suppress one of these by adding one of |
.sp |
.sp |
--disable-shared |
--disable-shared |
--disable-static |
--disable-static |
Line 78 strings). You can disable this by adding
|
Line 117 strings). You can disable this by adding
|
to the \fBconfigure\fP command. |
to the \fBconfigure\fP command. |
. |
. |
. |
. |
.SH "UTF-8 and UTF-16 SUPPORT" | .SH "UTF-8, UTF-16 AND UTF-32 SUPPORT" |
.rs |
.rs |
.sp |
.sp |
To build PCRE with support for UTF Unicode character strings, add |
To build PCRE with support for UTF Unicode character strings, add |
.sp |
.sp |
--enable-utf |
--enable-utf |
.sp |
.sp |
to the \fBconfigure\fP command. This setting applies to both libraries, adding | to the \fBconfigure\fP command. This setting applies to all three libraries, |
support for UTF-8 to the 8-bit library and support for UTF-16 to the 16-bit | adding support for UTF-8 to the 8-bit library, support for UTF-16 to the 16-bit |
library. There are no separate options for enabling UTF-8 and UTF-16 | library, and support for UTF-32 to the 32-bit library. There are no |
independently because that would allow ridiculous settings such as requesting | separate options for enabling UTF-8, UTF-16 and UTF-32 independently because |
UTF-16 support while building only the 8-bit library. It is not possible to | that would allow ridiculous settings such as requesting UTF-16 support while |
build one library with UTF support and the other without in the same | building only the 8-bit library. It is not possible to build one library with |
configuration. (For backwards compatibility, --enable-utf8 is a synonym of | UTF support and another without in the same configuration. (For backwards |
--enable-utf.) | compatibility, --enable-utf8 is a synonym of --enable-utf.) |
.P |
.P |
Of itself, this setting does not make PCRE treat strings as UTF-8 or UTF-16. As | Of itself, this setting does not make PCRE treat strings as UTF-8, UTF-16 or |
well as compiling PCRE with this option, you also have to set the | UTF-32. As well as compiling PCRE with this option, you also have to set |
PCRE_UTF8 or PCRE_UTF16 option when you call one of the pattern compiling | the PCRE_UTF8, PCRE_UTF16 or PCRE_UTF32 option (as appropriate) when you call |
functions. | one of the pattern compiling functions. |
.P |
.P |
If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects |
If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects |
its input to be either ASCII or UTF-8 (depending on the runtime option). It is | its input to be either ASCII or UTF-8 (depending on the run-time option). It is |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
library. Consequently, --enable-utf and --enable-ebcdic are mutually |
library. Consequently, --enable-utf and --enable-ebcdic are mutually |
exclusive. |
exclusive. |
Line 221 to the \fBconfigure\fP command.
|
Line 260 to the \fBconfigure\fP command.
|
.sp |
.sp |
Within a compiled pattern, offset values are used to point from one part to |
Within a compiled pattern, offset values are used to point from one part to |
another (for example, from an opening parenthesis to an alternation |
another (for example, from an opening parenthesis to an alternation |
metacharacter). By default, two-byte values are used for these offsets, leading | metacharacter). By default, in the 8-bit and 16-bit libraries, two-byte values |
to a maximum size for a compiled pattern of around 64K. This is sufficient to | are used for these offsets, leading to a maximum size for a compiled pattern of |
handle all but the most gigantic patterns. Nevertheless, some people do want to | around 64K. This is sufficient to handle all but the most gigantic patterns. |
process truly enormous patterns, so it is possible to compile PCRE to use | Nevertheless, some people do want to process truly enormous patterns, so it is |
three-byte or four-byte offsets by adding a setting such as | possible to compile PCRE to use three-byte or four-byte offsets by adding a |
| setting such as |
.sp |
.sp |
--with-link-size=3 |
--with-link-size=3 |
.sp |
.sp |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. For the |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. For the |
16-bit library, a value of 3 is rounded up to 4. Using longer offsets slows | 16-bit library, a value of 3 is rounded up to 4. In these libraries, using |
down the operation of PCRE because it has to load additional data when handling | longer offsets slows down the operation of PCRE because it has to load |
them. | additional data when handling them. For the 32-bit library the value is always |
| 4 and cannot be overridden; the value of --with-link-size is ignored. |
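The relationship between link size and pattern size is simple arithmetic: an offset stored in n bytes can hold values up to 2^(8n) - 1. The sketch below, assuming a POSIX shell with 64-bit arithmetic, prints that range for each permitted value. The function name max_offset is illustrative, and the figures are the representable range only; the limits PCRE actually enforces may be lower.

```shell
# Largest value that fits in an n-byte offset: 2^(8*n) - 1.
# Plain arithmetic, not a statement of PCRE's internal limits.
max_offset() {
  echo $(( (1 << (8 * $1)) - 1 ))
}

for size in 2 3 4; do
  echo "link size $size: offsets up to $(max_offset "$size")"
done
```

For a link size of 2 this gives 65535, which matches the "around 64K" figure quoted above.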
. |
. |
. |
. |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
Line 313 only. If you add
|
Line 354 only. If you add
|
.sp |
.sp |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
source for a new set of tables, created in the default locale of your C runtime | source for a new set of tables, created in the default locale of your C run-time |
system. (This method of replacing the tables does not work if you are cross |
system. (This method of replacing the tables does not work if you are cross |
compiling, because \fBdftables\fP is run on the local host. If you need to |
compiling, because \fBdftables\fP is run on the local host. If you need to |
create alternative tables when cross compiling, you will have to do so "by |
create alternative tables when cross compiling, you will have to do so "by |
Line 334 to the \fBconfigure\fP command. This setting implies
|
Line 375 to the \fBconfigure\fP command. This setting implies
|
--enable-rebuild-chartables. You should only use it if you know that you are in |
--enable-rebuild-chartables. You should only use it if you know that you are in |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
--enable-ebcdic option is incompatible with --enable-utf. |
--enable-ebcdic option is incompatible with --enable-utf. |
|
.P |
|
The EBCDIC character that corresponds to an ASCII LF is assumed to have the |
|
value 0x15 by default. However, in some EBCDIC environments, 0x25 is used. In |
|
such an environment you should use |
|
.sp |
|
--enable-ebcdic-nl25 |
|
.sp |
|
as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR has the |
|
same value as in ASCII, namely, 0x0d. Whichever of 0x15 and 0x25 is \fInot\fP |
|
chosen as LF is made to correspond to the Unicode NEL character (which, in |
|
Unicode, is 0x85). |
|
.P |
|
The options that select newline behaviour, such as --enable-newline-is-cr, |
|
and equivalent run-time options, refer to these character values in an EBCDIC |
|
environment. |
. |
. |
. |
. |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
Line 400 automatically included, you may need to add something
|
Line 456 automatically included, you may need to add something
|
immediately before the \fBconfigure\fP command. |
immediately before the \fBconfigure\fP command. |
. |
. |
. |
. |
|
.SH "DEBUGGING WITH VALGRIND SUPPORT" |
|
.rs |
|
.sp |
|
By adding the |
|
.sp |
|
--enable-valgrind |
|
.sp |
|
option to the \fBconfigure\fP command, PCRE will use valgrind annotations |
|
to mark certain memory regions as unaddressable. This allows it to detect |
|
invalid memory accesses, and is mostly useful for debugging PCRE itself. |
|
. |
|
. |
|
.SH "CODE COVERAGE REPORTING" |
|
.rs |
|
.sp |
|
If your C compiler is gcc, you can build a version of PCRE that can generate a |
|
code coverage report for its test suite. To enable this, you must install |
|
\fBlcov\fP version 1.6 or above. Then specify |
|
.sp |
|
--enable-coverage |
|
.sp |
|
to the \fBconfigure\fP command and build PCRE in the usual way. |
|
.P |
|
Note that using \fBccache\fP (a caching C compiler) is incompatible with code |
|
coverage reporting. If you have configured \fBccache\fP to run automatically |
|
on your system, you must set the environment variable |
|
.sp |
|
CCACHE_DISABLE=1 |
|
.sp |
|
before running \fBmake\fP to build PCRE, so that \fBccache\fP is not used. |
|
.P |
|
When --enable-coverage is used, the following additional targets are added to the |
|
\fIMakefile\fP: |
|
.sp |
|
make coverage |
|
.sp |
|
This creates a fresh coverage report for the PCRE test suite. It is equivalent |
|
to running "make coverage-reset", "make coverage-baseline", "make check", and |
|
then "make coverage-report". |
|
.sp |
|
make coverage-reset |
|
.sp |
|
This zeroes the coverage counters, but does nothing else. |
|
.sp |
|
make coverage-baseline |
|
.sp |
|
This captures baseline coverage information. |
|
.sp |
|
make coverage-report |
|
.sp |
|
This creates the coverage report. |
|
.sp |
|
make coverage-clean-report |
|
.sp |
|
This removes the generated coverage report without cleaning the coverage data |
|
itself. |
|
.sp |
|
make coverage-clean-data |
|
.sp |
|
This removes the captured coverage data without removing the coverage files |
|
created at compile time (*.gcno). |
|
.sp |
|
make coverage-clean |
|
.sp |
|
This cleans all coverage data including the generated coverage report. For more |
|
information about code coverage, see the \fBgcov\fP and \fBlcov\fP |
|
documentation. |
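The coverage steps described in this section can be strung together as follows. This is a sketch, assuming a POSIX shell, gcc, and an installed \fBlcov\fP. The helper names run and coverage_build and the DRY_RUN variable are illustrative and not part of PCRE. Setting CCACHE_DISABLE=1 is needed only where \fBccache\fP runs automatically, and is harmless otherwise.

```shell
# run: execute a command, or just print it when DRY_RUN=1.
# DRY_RUN is an illustrative variable, not something PCRE defines.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# coverage_build: configure with coverage support, build, then produce
# a fresh report. Run it from the top of the PCRE source tree.
coverage_build() {
  run ./configure --enable-coverage &&
  run env CCACHE_DISABLE=1 make &&
  run env CCACHE_DISABLE=1 make coverage
}
```

Calling "DRY_RUN=1 coverage_build" prints the three commands without running them, which is a convenient way to check the sequence before starting a real build.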
|
. |
|
. |
.SH "SEE ALSO" |
.SH "SEE ALSO" |
.rs |
.rs |
.sp |
.sp |
\fBpcreapi\fP(3), \fBpcre16\fP, \fBpcre_config\fP(3). | \fBpcreapi\fP(3), \fBpcre16\fP, \fBpcre32\fP, \fBpcre_config\fP(3). |
. |
. |
. |
. |
.SH AUTHOR |
.SH AUTHOR |
Line 420 Cambridge CB2 3QH, England.
|
Line 545 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 07 January 2012 | Last updated: 12 May 2013 |
Copyright (c) 1997-2012 University of Cambridge. | Copyright (c) 1997-2013 University of Cambridge. |
.fi |
.fi |