version 1.1.1.1, 2012/02/21 23:05:51
|
version 1.1.1.3, 2012/10/09 09:19:17
|
Line 1
|
Line 1
|
.TH PCREBUILD 3 | .TH PCREBUILD 3 "07 January 2012" "PCRE 8.30" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
. |
. |
Line 32 The following sections include descriptions of options
|
Line 32 The following sections include descriptions of options
|
exists as well, but as it specifies the default, it is not described. |
exists as well, but as it specifies the default, it is not described. |
. |
. |
. |
. |
|
.SH "BUILDING 8-BIT and 16-BIT LIBRARIES" |
|
.rs |
|
.sp |
|
By default, a library called \fBlibpcre\fP is built, containing functions that |
|
take string arguments contained in vectors of bytes, either as single-byte |
|
characters, or interpreted as UTF-8 strings. You can also build a separate |
|
library, called \fBlibpcre16\fP, in which strings are contained in vectors of |
|
16-bit data units and interpreted either as single-unit characters or UTF-16 |
|
strings, by adding |
|
.sp |
|
--enable-pcre16 |
|
.sp |
|
to the \fBconfigure\fP command. If you do not want the 8-bit library, add |
|
.sp |
|
--disable-pcre8 |
|
.sp |
|
as well. At least one of the two libraries must be built. Note that the C++ and |
|
POSIX wrappers are for the 8-bit library only, and that \fBpcregrep\fP is an |
|
8-bit program. None of these are built if you select only the 16-bit library. |
|
. |
|
. |
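As a rough illustration of the library selection described in this section, the shell sketch below assembles the matching \fBconfigure\fP options. The function name pcre_configure_args and its yes/no arguments are hypothetical and exist only for this example; the only options taken from the text are --enable-pcre16 and --disable-pcre8.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCRE): print the configure options
# for a chosen combination of libraries.
#   $1 = yes/no  build the 8-bit library (libpcre)
#   $2 = yes/no  build the 16-bit library (libpcre16)
pcre_configure_args() {
    want8=$1
    want16=$2
    # At least one of the two libraries must be built.
    if [ "$want8" = no ] && [ "$want16" = no ]; then
        echo "error: at least one of the 8-bit and 16-bit libraries must be built" >&2
        return 1
    fi
    args=""
    if [ "$want16" = yes ]; then
        args="$args --enable-pcre16"
    fi
    if [ "$want8" = no ]; then
        args="$args --disable-pcre8"
    fi
    # Strip the leading space, if any, before printing.
    echo "${args# }"
}
```

A 16-bit-only build might then be configured with something like ./configure $(pcre_configure_args no yes), keeping in mind that the C++ and POSIX wrappers and \fBpcregrep\fP are not built in that case.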
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.SH "BUILDING SHARED AND STATIC LIBRARIES" |
.rs |
.rs |
.sp |
.sp |
Line 47 to the \fBconfigure\fP command, as required.
|
Line 68 to the \fBconfigure\fP command, as required.
|
.SH "C++ SUPPORT" |
.SH "C++ SUPPORT" |
.rs |
.rs |
.sp |
.sp |
By default, the \fBconfigure\fP script will search for a C++ compiler and C++ | By default, if the 8-bit library is being built, the \fBconfigure\fP script |
header files. If it finds them, it automatically builds the C++ wrapper library | will search for a C++ compiler and C++ header files. If it finds them, it |
for PCRE. You can disable this by adding | automatically builds the C++ wrapper library (which supports only 8-bit |
| strings). You can disable this by adding |
.sp |
.sp |
--disable-cpp |
--disable-cpp |
.sp |
.sp |
to the \fBconfigure\fP command. |
to the \fBconfigure\fP command. |
. |
. |
. |
. |
.SH "UTF-8 SUPPORT" | .SH "UTF-8 and UTF-16 SUPPORT" |
.rs |
.rs |
.sp |
.sp |
To build PCRE with support for UTF-8 Unicode character strings, add | To build PCRE with support for UTF Unicode character strings, add |
.sp |
.sp |
--enable-utf8 | --enable-utf |
.sp |
.sp |
to the \fBconfigure\fP command. Of itself, this does not make PCRE treat | to the \fBconfigure\fP command. This setting applies to both libraries, adding |
strings as UTF-8. As well as compiling PCRE with this option, you also have | support for UTF-8 to the 8-bit library and support for UTF-16 to the 16-bit |
to set the PCRE_UTF8 option when you call the \fBpcre_compile()\fP | library. There are no separate options for enabling UTF-8 and UTF-16 |
or \fBpcre_compile2()\fP functions. | independently because that would allow ridiculous settings such as requesting |
| UTF-16 support while building only the 8-bit library. It is not possible to |
| build one library with UTF support and the other without in the same |
| configuration. (For backwards compatibility, --enable-utf8 is a synonym of |
| --enable-utf.) |
.P |
.P |
If you set --enable-utf8 when compiling in an EBCDIC environment, PCRE expects | Of itself, this setting does not make PCRE treat strings as UTF-8 or UTF-16. As |
its input to be either ASCII or UTF-8 (depending on the runtime option). It is | well as compiling PCRE with this option, you also have to set the |
| PCRE_UTF8 or PCRE_UTF16 option when you call one of the pattern compiling |
| functions. |
| .P |
| If you set --enable-utf when compiling in an EBCDIC environment, PCRE expects |
| its input to be either ASCII or UTF-8 (depending on the run-time option). It is |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
not possible to support both EBCDIC and UTF-8 codes in the same version of the |
library. Consequently, --enable-utf8 and --enable-ebcdic are mutually | library. Consequently, --enable-utf and --enable-ebcdic are mutually |
exclusive. |
exclusive. |
. |
. |
. |
. |
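The mutual exclusion stated above can be checked before \fBconfigure\fP is run. The following is a minimal sketch, assuming a hypothetical wrapper function named check_utf_ebcdic that receives the intended option list; it is not part of PCRE, and only the option names come from the text.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCRE): reject option lists that
# request both UTF support and EBCDIC support.
check_utf_ebcdic() {
    utf=no
    ebcdic=no
    for opt in "$@"; do
        case $opt in
            # --enable-utf8 is accepted as a synonym of --enable-utf.
            --enable-utf|--enable-utf8) utf=yes ;;
            --enable-ebcdic) ebcdic=yes ;;
        esac
    done
    if [ "$utf" = yes ] && [ "$ebcdic" = yes ]; then
        echo "error: --enable-utf and --enable-ebcdic are mutually exclusive" >&2
        return 1
    fi
    return 0
}
```

For example, check_utf_ebcdic --enable-utf --enable-pcre16 succeeds, whereas check_utf_ebcdic --enable-utf8 --enable-ebcdic returns a non-zero status.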
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
.SH "UNICODE CHARACTER PROPERTY SUPPORT" |
.rs |
.rs |
.sp |
.sp |
UTF-8 support allows PCRE to process character values greater than 255 in the | UTF support allows the libraries to process character codepoints up to 0x10ffff |
strings that it handles. On its own, however, it does not provide any | in the strings that they handle. On its own, however, it does not provide any |
facilities for accessing the properties of such characters. If you want to be |
facilities for accessing the properties of such characters. If you want to be |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
able to use the pattern escapes \eP, \ep, and \eX, which refer to Unicode |
character properties, you must add |
character properties, you must add |
.sp |
.sp |
--enable-unicode-properties |
--enable-unicode-properties |
.sp |
.sp |
to the \fBconfigure\fP command. This implies UTF-8 support, even if you have | to the \fBconfigure\fP command. This implies UTF support, even if you have |
not explicitly requested it. |
not explicitly requested it. |
.P |
.P |
Including Unicode property support adds around 30K of tables to the PCRE |
Including Unicode property support adds around 30K of tables to the PCRE |
Line 168 called.
|
Line 199 called.
|
.SH "POSIX MALLOC USAGE" |
.SH "POSIX MALLOC USAGE" |
.rs |
.rs |
.sp |
.sp |
When PCRE is called through the POSIX interface (see the | When the 8-bit library is called through the POSIX interface (see the |
.\" HREF |
.\" HREF |
\fBpcreposix\fP |
\fBpcreposix\fP |
.\" |
.\" |
Line 193 another (for example, from an opening parenthesis to a
|
Line 224 another (for example, from an opening parenthesis to a
|
metacharacter). By default, two-byte values are used for these offsets, leading |
metacharacter). By default, two-byte values are used for these offsets, leading |
to a maximum size for a compiled pattern of around 64K. This is sufficient to |
to a maximum size for a compiled pattern of around 64K. This is sufficient to |
handle all but the most gigantic patterns. Nevertheless, some people do want to |
handle all but the most gigantic patterns. Nevertheless, some people do want to |
process truyl enormous patterns, so it is possible to compile PCRE to use | process truly enormous patterns, so it is possible to compile PCRE to use |
three-byte or four-byte offsets by adding a setting such as |
three-byte or four-byte offsets by adding a setting such as |
.sp |
.sp |
--with-link-size=3 |
--with-link-size=3 |
.sp |
.sp |
to the \fBconfigure\fP command. The value given must be 2, 3, or 4. Using | to the \fBconfigure\fP command. The value given must be 2, 3, or 4. For the |
longer offsets slows down the operation of PCRE because it has to load | 16-bit library, a value of 3 is rounded up to 4. Using longer offsets slows |
additional bytes when handling them. | down the operation of PCRE because it has to load additional data when handling |
| them. |
. |
. |
. |
. |
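The link-size rules in this section can be summarised in a few lines of shell. This is only a sketch: the function name effective_link_size and its second argument (the library width, 8 or 16) are hypothetical, and the rounding shown simply restates the rule in the text that a value of 3 becomes 4 for the 16-bit library.

```shell
#!/bin/sh
# Hypothetical helper (not part of PCRE): print the link size that would
# be in effect for a requested --with-link-size value.
#   $1 = requested link size (must be 2, 3, or 4)
#   $2 = library width: 8 or 16
effective_link_size() {
    size=$1
    width=$2
    case $size in
        2|3|4) ;;
        *)
            echo "error: link size must be 2, 3, or 4" >&2
            return 1
            ;;
    esac
    # For the 16-bit library, a value of 3 is rounded up to 4.
    if [ "$width" = 16 ] && [ "$size" = 3 ]; then
        size=4
    fi
    echo "$size"
}
```

Calling effective_link_size 3 16 prints 4, while effective_link_size 3 8 prints 3; any value outside 2 to 4 is rejected.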
.SH "AVOIDING EXCESSIVE STACK USAGE" |
.SH "AVOIDING EXCESSIVE STACK USAGE" |
Line 281 only. If you add
|
Line 313 only. If you add
|
.sp |
.sp |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
to the \fBconfigure\fP command, the distributed tables are no longer used. |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
Instead, a program called \fBdftables\fP is compiled and run. This outputs the |
source for a new set of tables, created in the default locale of your C runtime | source for a new set of tables, created in the default locale of your C run-time |
system. (This method of replacing the tables does not work if you are cross |
system. (This method of replacing the tables does not work if you are cross |
compiling, because \fBdftables\fP is run on the local host. If you need to |
compiling, because \fBdftables\fP is run on the local host. If you need to |
create alternative tables when cross compiling, you will have to do so "by |
create alternative tables when cross compiling, you will have to do so "by |
Line 301 EBCDIC environment by adding
|
Line 333 EBCDIC environment by adding
|
to the \fBconfigure\fP command. This setting implies |
to the \fBconfigure\fP command. This setting implies |
--enable-rebuild-chartables. You should only use it if you know that you are in |
--enable-rebuild-chartables. You should only use it if you know that you are in |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
an EBCDIC environment (for example, an IBM mainframe operating system). The |
--enable-ebcdic option is incompatible with --enable-utf8. | --enable-ebcdic option is incompatible with --enable-utf. |
. |
. |
. |
. |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
.SH "PCREGREP OPTIONS FOR COMPRESSED FILE SUPPORT" |
Line 371 immediately before the \fBconfigure\fP command.
|
Line 403 immediately before the \fBconfigure\fP command.
|
.SH "SEE ALSO" |
.SH "SEE ALSO" |
.rs |
.rs |
.sp |
.sp |
\fBpcreapi\fP(3), \fBpcre_config\fP(3). | \fBpcreapi\fP(3), \fBpcre16\fP, \fBpcre_config\fP(3). |
. |
. |
. |
. |
.SH AUTHOR |
.SH AUTHOR |
Line 388 Cambridge CB2 3QH, England.
|
Line 420 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 06 September 2011 | Last updated: 07 January 2012 |
Copyright (c) 1997-2011 University of Cambridge. | Copyright (c) 1997-2012 University of Cambridge. |
.fi |
.fi |