version 1.1.1.1, 2012/02/21 23:05:51
|
version 1.1.1.4, 2013/07/22 08:25:56
|
Line 1
|
Line 1
|
.TH PCREBUILD 3 | .TH PCREBUILD 3 "12 May 2013" "PCRE 8.33" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
. |
. |
. |
. |
|
.SH "BUILDING PCRE" |
|
.rs |
|
.sp |
|
PCRE is distributed with a \fBconfigure\fP script that can be used to build the |
|
library in Unix-like environments using the applications known as Autotools. |
|
Also in the distribution are files to support building using \fBCMake\fP |
|
instead of \fBconfigure\fP. The text file |
|
.\" HTML <a href="README.txt"> |
|
.\" </a> |
|
\fBREADME\fP |
|
.\" |
|
contains general information about building with Autotools (some of which is |
|
repeated below), and also has some comments about building on various operating |
|
systems. There is a lot more information about building PCRE without using |
|
Autotools (including information about using \fBCMake\fP and building "by |
|
hand") in the text file called |
|
.\" HTML <a href="NON-AUTOTOOLS-BUILD.txt"> |
|
.\" </a> |
|
\fBNON-AUTOTOOLS-BUILD\fP. |
|
.\" |
|
You should consult this file as well as the |
|
.\" HTML <a href="README.txt"> |
|
.\" </a> |
|
\fBREADME\fP |
|
.\" |
|
file if you are building in a non-Unix-like environment. |
|
. |
|
. |
.SH "PCRE BUILD-TIME OPTIONS" |
.SH "PCRE BUILD-TIME OPTIONS" |
.rs |
.rs |
.sp |
.sp |
This document describes the optional features of PCRE that can be selected when | The rest of this document describes the optional features of PCRE that can be |
the library is compiled. It assumes use of the \fBconfigure\fP script, where | selected when the library is compiled. It assumes use of the \fBconfigure\fP |
the optional features are selected or deselected by providing options to | script, where the optional features are selected or deselected by providing |
\fBconfigure\fP before running the \fBmake\fP command. However, the same | options to \fBconfigure\fP before running the \fBmake\fP command. However, the |
options can be selected in both Unix-like and non-Unix-like environments using | same options can be selected in both Unix-like and non-Unix-like environments |
the GUI facility of \fBcmake-gui\fP if you are using \fBCMake\fP instead of | using the GUI facility of \fBcmake-gui\fP if you are using \fBCMake\fP instead |
\fBconfigure\fP to build PCRE. | of \fBconfigure\fP to build PCRE. |
.P |
.P |
There is a lot more information about building PCRE in non-Unix-like | If you are not using Autotools or \fBCMake\fP, option selection can be done by |
environments in the file called \fINON_UNIX_USE\fP, which is part of the PCRE | editing the \fBconfig.h\fP file, or by passing parameter settings to the |
distribution. You should consult this file as well as the \fIREADME\fP file if | compiler, as described in |
you are building in a non-Unix-like environment. | .\" HTML <a href="NON-AUTOTOOLS-BUILD.txt"> |
| .\" </a> |
| \fBNON-AUTOTOOLS-BUILD\fP. |
| .\" |
.P |
.P |
The complete list of options for \fBconfigure\fP (which includes the standard |
The complete list of options for \fBconfigure\fP (which includes the standard |
ones such as the selection of the installation directory) can be obtained by |
ones such as the selection of the installation directory) can be obtained by |
Line 32 The following sections include descriptions of options
|
Line 63 The following sections include descriptions of options
|
exists as well, but as it specifies the default, it is not described. |
exists as well, but as it specifies the default, it is not described. |
. |
. |
. |
. |
|
.SH "BUILDING 8-BIT, 16-BIT AND 32-BIT LIBRARIES" |
|
.rs |
|
.sp |
|
By default, a library called \fBlibpcre\fP is built, containing functions that |
|
take string arguments contained in vectors of bytes, either as single-byte |
|
characters, or interpreted as UTF-8 strings. You can also build a separate |
|
library, called \fBlibpcre16\fP, in which strings are contained in vectors of |
|
16-bit data units and interpreted either as single-unit characters or UTF-16 |
|
strings, by adding |
|
.sp |
|
--enable-pcre16 |
|
.sp |
|
to the \fBconfigure\fP command. You can also build yet another separate |
|
library, called \fBlibpcre32\fP, in which strings are contained in vectors of |
|
32-bit data units and interpreted either as single-unit characters or UTF-32 |
|
strings, by adding |
|
.sp |
|
--enable-pcre32 |
|
.sp |
|
to the \fBconfigure\fP command. If you do not want the 8-bit library, add |
|
.sp |
|
--disable-pcre8 |
|
.sp |
|
as well. At least one of the three libraries must be built. Note that the C++ |
|
and POSIX wrappers are for the 8-bit library only, and that \fBpcregrep\fP is |
|
an 8-bit program. None of these are built if you select only the 16-bit or |
|
32-bit libraries. |
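The selection rules in this section can be summarised in a small sketch. This is only an illustration of the text above, not part of the PCRE build system: the function name `libraries_built` and its keyword arguments are hypothetical, and the "extras" list shows only which 8-bit-only components are eligible to be built (the C++ wrapper still depends on a C++ compiler being found).

```python
def libraries_built(pcre8=True, pcre16=False, pcre32=False):
    """Model which libraries a given width selection would produce.

    pcre8=False stands for --disable-pcre8; pcre16/pcre32=True stand for
    --enable-pcre16 and --enable-pcre32. Hypothetical helper for illustration.
    """
    libs = []
    if pcre8:
        libs.append("libpcre")
    if pcre16:
        libs.append("libpcre16")
    if pcre32:
        libs.append("libpcre32")
    if not libs:
        # The text requires at least one of the three libraries.
        raise ValueError("at least one of the three libraries must be built")
    # The C++ and POSIX wrappers and pcregrep are 8-bit only.
    extras = ["C++ wrapper", "POSIX wrapper", "pcregrep"] if pcre8 else []
    return libs, extras
```

For example, `libraries_built(pcre8=False, pcre16=True)` models `--disable-pcre8 --enable-pcre16` and reports no 8-bit extras.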
|
. |
|
. |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.rs |
.rs |
.sp |
.sp |
The PCRE building process uses \fBlibtool\fP to build both shared and static | The Autotools PCRE building process uses \fBlibtool\fP to build both shared and |
Unix libraries by default. You can suppress one of these by adding one of | static libraries by default. You can suppress one of these by adding one of |
.sp |
.sp |
--disable-shared |
--disable-shared |
--disable-static |
--disable-static |
Line 47 to the \fBconfigure\fP command, as required.
|
Line 107 to the \fBconfigure\fP command, as required.
|
.SH "C++ SUPPORT" |
.SH "C++ SUPPORT" |
.rs |
.rs |
.sp |
.sp |
By default, the \fBconfigure\fP script will search for a C++ compiler and C++ | By default, if the 8-bit library is being built, the \fBconfigure\fP script |
header files. If it finds them, it automatically builds the C++ wrapper library | will search for a C++ compiler and C++ header files. If it finds them, it |
for PCRE. You can disable this by adding | automatically builds the C++ wrapper library (which supports only 8-bit |
| strings). You can disable this by adding |
.sp |
.sp |
--disable-cpp |
--disable-cpp |
.sp |
.sp |
to the \fBconfigure\fP command. |
to the \fBconfigure\fP command. |
. |
. |
. |
. |
.SH "UTF-8 SUPPORT" | .SH "UTF-8, UTF-16 AND UTF-32 SUPPORT" |
.rs |
.rs |
.sp |
.sp |
To build PCRE with support for UTF-8 Unicode character strings, add | To build PCRE with support for UTF Unicode character strings, add |
.sp |
.sp |
--enable-utf8 | --enable-utf |
.sp |
.sp |
to the \fBconfigure\fP command. Of itself, this does not make PCRE treat | to the \fBconfigure\fP command. This setting applies to all three libraries, |
strings as UTF-8. As well as compiling PCRE with this option, you also have | adding support for UTF-8 to the 8-bit library, support for UTF-16 to the 16-bit |
to set the PCRE_UTF8 option when you call the \fBpcre_compile()\fP | library, and support for UTF-32 to the 32-bit library. There are no |
or \fBpcre_compile2()\fP functions. | separate options for enabling UTF-8, UTF-16 and UTF-32 independently because |
| that would allow ridiculous settings such as requesting UTF-16 support while |
| building only the 8-bit library. It is not possible to build one library with |
| UTF support and another without in the same configuration. (For backwards |
| compatibility, --enable-utf8 is a synonym of --enable-utf.) |
.P |
.P |
If you set --enable-utf8 when compiling in an EBCDIC environment, PCRE expects | Of itself, this setting does not make PCRE treat strings as UTF-8, UTF-16 or |
its input to be either ASCII or UTF-8 (depending on the runtime option). It is | UTF-32. As well as compiling PCRE with this option, you also have to set |
| the PCRE_UTF8, PCRE_UTF16 or PCRE_UTF32 option (as appropriate) when you call |
| one of the pattern compiling functions. |
| .P |
| If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects |
| its input to be either ASCII or UTF-8 (depending on the run-time option). It is |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
library. Consequently, --enable-utf8 and --enable-ebcdic are mutually | library. Consequently, --enable-utf and --enable-ebcdic are mutually |
exclusive. |
exclusive. |
. |
. |
. |
. |
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
.rs |
.rs |
.sp |
.sp |
UTF-8 support allows PCRE to process character values greater than 255 in the | UTF support allows the libraries to process character codepoints up to 0x10ffff |
strings that it handles. On its own, however, it does not provide any | in the strings that they handle. On its own, however, it does not provide any |
facilities for accessing the properties of such characters. If you want to be |
facilities for accessing the properties of such characters. If you want to be |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
character properties, you must add |
character properties, you must add |
.sp |
.sp |
--enable-unicode-properties |
--enable-unicode-properties |
.sp |
.sp |
to the \fBconfigure\fP command. This implies UTF-8 support, even if you have | to the \fBconfigure\fP command. This implies UTF support, even if you have |
not explicitly requested it. |
not explicitly requested it. |
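The interaction between the UTF, Unicode property, and EBCDIC options described in the revised text can be sketched as follows. The function `resolve_utf_options` and its parameter names are hypothetical, chosen only to mirror the option names; the sketch assumes nothing beyond the rules stated in this document.

```python
def resolve_utf_options(utf=False, unicode_properties=False, ebcdic=False):
    """Return the effective (utf, unicode_properties) pair.

    Hypothetical model of the documented rules:
    --enable-unicode-properties implies UTF support, and
    --enable-utf is incompatible with --enable-ebcdic.
    """
    # Requesting Unicode properties turns on UTF support implicitly.
    effective_utf = utf or unicode_properties
    if effective_utf and ebcdic:
        raise ValueError("--enable-utf and --enable-ebcdic are mutually exclusive")
    return effective_utf, unicode_properties
```

Treating an implied UTF setting as conflicting with EBCDIC is an assumption made here for consistency; the text states the exclusion only for the explicit options.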
.P |
.P |
Including Unicode property support adds around 30K of tables to the PCRE |
Including Unicode property support adds around 30K of tables to the PCRE |
Line 168 called.
|
Line 238 called.
|
.SH "POSIX MALLOC USAGE" |
.SH "POSIX MALLOC USAGE" |
.rs |
.rs |
.sp |
.sp |
When PCRE is called through the POSIX interface (see the | When the 8-bit library is called through the POSIX interface (see the |
.\" HREF |
.\" HREF |
\fBpcreposix\fP |
\fBpcreposix\fP |
.\" |
.\" |
Line 190 to the \fBconfigure\fP command.
|
Line 260 to the \fBconfigure\fP command.
|
.sp |
.sp |
Within a compiled pattern, offset values are used to point from one part to |
Within a compiled pattern, offset values are used to point from one part to |
another (for example, from an opening parenthesis to an alternation |
another (for example, from an opening parenthesis to an alternation |
metacharacter). By default, two-byte values are used for these offsets, leading | metacharacter). By default, in the 8-bit and 16-bit libraries, two-byte values |
to a maximum size for a compiled pattern of around 64K. This is sufficient to | are used for these offsets, leading to a maximum size for a compiled pattern of |
handle all but the most gigantic patterns. Nevertheless, some people do want to | around 64K. This is sufficient to handle all but the most gigantic patterns. |
process truyl enormous patterns, so it is possible to compile PCRE to use | Nevertheless, some people do want to process truly enormous patterns, so it is |
three-byte or four-byte offsets by adding a setting such as | possible to compile PCRE to use three-byte or four-byte offsets by adding a |
| setting such as |
.sp |
.sp |
--with-link-size=3 |
--with-link-size=3 |
.sp |
.sp |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. Using | to the \fBconfigure\fP command. The value given must be 2, 3, or 4. For the |
| 16-bit library, a value of 3 is rounded up to 4. In these libraries, using |
longer offsets slows down the operation of PCRE because it has to load |
longer offsets slows down the operation of PCRE because it has to load |
additional bytes when handling them. | additional data when handling them. For the 32-bit library the value is always |
| 4 and cannot be overridden; the value of --with-link-size is ignored. |
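The link-size rules in the revised paragraph depend on the library width, which a short sketch may make easier to follow. The helper `effective_link_size` is hypothetical and simply restates the text; checking the requested value even for the 32-bit library is an assumption, since the document says only that the value is ignored there.

```python
def effective_link_size(width, requested=2):
    """Return the offset size used for a library of the given data-unit width.

    width is 8, 16 or 32; requested models --with-link-size (default 2).
    Hypothetical helper for illustration only.
    """
    if width not in (8, 16, 32):
        raise ValueError("width must be 8, 16 or 32")
    if requested not in (2, 3, 4):
        raise ValueError("link size must be 2, 3, or 4")
    if width == 32:
        # Always 4 for the 32-bit library; the requested value is ignored.
        return 4
    if width == 16 and requested == 3:
        # For the 16-bit library, a value of 3 is rounded up to 4.
        return 4
    return requested
```

So `--with-link-size=3` yields 3 for the 8-bit library but 4 for both the 16-bit and 32-bit libraries.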
. |
. |
. |
. |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
Line 281 only. If you add
|
Line 354 only. If you add
|
.sp |
.sp |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
source for a new set of tables, created in the default locale of your C runtime | source for a new set of tables, created in the default locale of your C run-time |
system. (This method of replacing the tables does not work if you are cross |
system. (This method of replacing the tables does not work if you are cross |
compiling, because \fBdftables\fP is run on the local host. If you need to |
compiling, because \fBdftables\fP is run on the local host. If you need to |
create alternative tables when cross compiling, you will have to do so "by |
create alternative tables when cross compiling, you will have to do so "by |
Line 301 EBCDIC environment by adding
|
Line 374 EBCDIC environment by adding
|
to the \fBconfigure\fP command. This setting implies |
to the \fBconfigure\fP command. This setting implies |
--enable-rebuild-chartables. You should only use it if you know that you are in |
--enable-rebuild-chartables. You should only use it if you know that you are in |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
--enable-ebcdic option is incompatible with --enable-utf8. | --enable-ebcdic option is incompatible with --enable-utf. |
| .P |
| The EBCDIC character that corresponds to an ASCII LF is assumed to have the |
| value 0x15 by default. However, in some EBCDIC environments, 0x25 is used. In |
| such an environment you should use |
| .sp |
| --enable-ebcdic-nl25 |
| .sp |
| as well as, or instead of, --enable-ebcdic. The EBCDIC character for CR has the |
| same value as in ASCII, namely, 0x0d. Whichever of 0x15 and 0x25 is \fInot\fP |
| chosen as LF is made to correspond to the Unicode NEL character (which, in |
| Unicode, is 0x85). |
| .P |
| The options that select newline behaviour, such as --enable-newline-is-cr, |
| and equivalent run-time options, refer to these character values in an EBCDIC |
| environment. |
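The EBCDIC newline mapping added in this revision can be written out as a small table-building sketch. The function `ebcdic_newline_codes` is a hypothetical name; the code values are the ones given in the text above.

```python
def ebcdic_newline_codes(nl25=False):
    """Return the EBCDIC code values assumed for CR, LF and NEL.

    nl25=True models --enable-ebcdic-nl25. Hypothetical helper for
    illustration; it is not part of PCRE.
    """
    # LF is 0x15 by default, or 0x25 with --enable-ebcdic-nl25.
    # Whichever of the two is not chosen as LF corresponds to NEL.
    lf, nel = (0x25, 0x15) if nl25 else (0x15, 0x25)
    # CR has the same value as in ASCII.
    return {"CR": 0x0D, "LF": lf, "NEL": nel}
```

With the default setting the mapping is CR 0x0d, LF 0x15, NEL 0x25; with `nl25=True` the LF and NEL values are swapped.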
. |
. |
. |
. |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
Line 368 automatically included, you may need to add something
|
Line 456 automatically included, you may need to add something
|
immediately before the \fBconfigure\fP command. |
immediately before the \fBconfigure\fP command. |
. |
. |
. |
. |
|
.SH "DEBUGGING WITH VALGRIND SUPPORT" |
|
.rs |
|
.sp |
|
By adding the |
|
.sp |
|
--enable-valgrind |
|
.sp |
|
option to the \fBconfigure\fP command, PCRE will use valgrind annotations |
|
to mark certain memory regions as unaddressable. This allows it to detect |
|
invalid memory accesses, and is mostly useful for debugging PCRE itself. |
|
. |
|
. |
|
.SH "CODE COVERAGE REPORTING" |
|
.rs |
|
.sp |
|
If your C compiler is gcc, you can build a version of PCRE that can generate a |
|
code coverage report for its test suite. To enable this, you must install |
|
\fBlcov\fP version 1.6 or above. Then specify |
|
.sp |
|
--enable-coverage |
|
.sp |
|
to the \fBconfigure\fP command and build PCRE in the usual way. |
|
.P |
|
Note that using \fBccache\fP (a caching C compiler) is incompatible with code |
|
coverage reporting. If you have configured \fBccache\fP to run automatically |
|
on your system, you must set the environment variable |
|
.sp |
|
CCACHE_DISABLE=1 |
|
.sp |
|
before running \fBmake\fP to build PCRE, so that \fBccache\fP is not used. |
|
.P |
|
When --enable-coverage is used, the following additional targets are added to the |
|
\fIMakefile\fP: |
|
.sp |
|
make coverage |
|
.sp |
|
This creates a fresh coverage report for the PCRE test suite. It is equivalent |
|
to running "make coverage-reset", "make coverage-baseline", "make check", and |
|
then "make coverage-report". |
|
.sp |
|
make coverage-reset |
|
.sp |
|
This zeroes the coverage counters, but does nothing else. |
|
.sp |
|
make coverage-baseline |
|
.sp |
|
This captures baseline coverage information. |
|
.sp |
|
make coverage-report |
|
.sp |
|
This creates the coverage report. |
|
.sp |
|
make coverage-clean-report |
|
.sp |
|
This removes the generated coverage report without cleaning the coverage data |
|
itself. |
|
.sp |
|
make coverage-clean-data |
|
.sp |
|
This removes the captured coverage data without removing the coverage files |
|
created at compile time (*.gcno). |
|
.sp |
|
make coverage-clean |
|
.sp |
|
This cleans all coverage data including the generated coverage report. For more |
|
information about code coverage, see the \fBgcov\fP and \fBlcov\fP |
|
documentation. |
|
. |
|
. |
.SH "SEE ALSO" |
.SH "SEE ALSO" |
.rs |
.rs |
.sp |
.sp |
\fBpcreapi\fP(3), \fBpcre_config\fP(3). | \fBpcreapi\fP(3), \fBpcre16\fP, \fBpcre32\fP, \fBpcre_config\fP(3). |
. |
. |
. |
. |
.SH AUTHOR |
.SH AUTHOR |
Line 388 Cambridge CB2 3QH, England.
|
Line 545 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 06 September 2011 | Last updated: 12 May 2013 |
Copyright (c) 1997-2011 University of Cambridge. | Copyright (c) 1997-2013 University of Cambridge. |
.fi |
.fi |