version 1.1.1.1, 2012/02/21 23:50:25
|
version 1.1.1.3, 2013/07/22 08:25:56
|
Line 1
|
Line 1
|
.TH PCRE 3 | .TH PCRE 3 "12 May 2013" "PCRE 8.33" |
.SH NAME |
.SH NAME |
PCRE - Perl-compatible regular expressions |
PCRE - Perl-compatible regular expressions |
.sp |
.sp |
Line 170 library. For example, if you want to study a pattern t
|
Line 170 library. For example, if you want to study a pattern t
|
.rs |
.rs |
.sp |
.sp |
There is only one header file, \fBpcre.h\fP. It contains prototypes for all the |
There is only one header file, \fBpcre.h\fP. It contains prototypes for all the |
functions in both libraries, as well as definitions of flags, structures, error | functions in all libraries, as well as definitions of flags, structures, error |
codes, etc. |
codes, etc. |
. |
. |
. |
. |
Line 190 of bytes with the C type "char *". In the 16-bit libra
|
Line 190 of bytes with the C type "char *". In the 16-bit libra
|
vectors of unsigned 16-bit quantities. The macro PCRE_UCHAR16 specifies an |
vectors of unsigned 16-bit quantities. The macro PCRE_UCHAR16 specifies an |
appropriate data type, and PCRE_SPTR16 is defined as "const PCRE_UCHAR16 *". In |
appropriate data type, and PCRE_SPTR16 is defined as "const PCRE_UCHAR16 *". In |
very many environments, "short int" is a 16-bit data type. When PCRE is built, |
very many environments, "short int" is a 16-bit data type. When PCRE is built, |
it defines PCRE_UCHAR16 as "short int", but checks that it really is a 16-bit | it defines PCRE_UCHAR16 as "unsigned short int", but checks that it really is a |
data type. If it is not, the build fails with an error message telling the | 16-bit data type. If it is not, the build fails with an error message telling |
maintainer to modify the definition appropriately. | the maintainer to modify the definition appropriately. |
. |
. |
. |
. |
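The changed paragraph above says that the build checks that the type chosen for PCRE_UCHAR16 really is 16 bits wide. A minimal sketch of how such a check can be expressed in portable C follows. The names my_uchar16, my_uchar16_must_be_16_bits and my_uchar16_is_16_bit are hypothetical and are not taken from the PCRE sources; the real build may perform the check differently.

```c
#include <limits.h>

/* Hypothetical stand-in for PCRE_UCHAR16; the real macro comes from pcre.h. */
typedef unsigned short int my_uchar16;

/* Compile-time check (pre-C11 idiom): the array size becomes -1, and the
   compilation fails, unless unsigned short holds exactly the values
   0..0xffff, that is, unless it is a 16-bit type. */
typedef char my_uchar16_must_be_16_bits[(USHRT_MAX == 0xffffu) ? 1 : -1];

/* Run-time view of the same fact, convenient for a self-test. */
static int my_uchar16_is_16_bit(void)
{
  return USHRT_MAX == 0xffffu;
}
```

On a platform where "unsigned short int" is wider than 16 bits, the typedef line stops the compilation, which mirrors the "build fails with an error message" behaviour described above.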
.SH "STRUCTURE TYPES" |
.SH "STRUCTURE TYPES" |
Line 246 buffer, including the zero terminator if the string wa
|
Line 246 buffer, including the zero terminator if the string wa
|
.SH "SUBJECT STRING OFFSETS" |
.SH "SUBJECT STRING OFFSETS" |
.rs |
.rs |
.sp |
.sp |
The offsets within subject strings that are returned by the matching functions | The lengths and starting offsets of subject strings must be specified in 16-bit |
are in 16-bit units rather than bytes. | data units, and the offsets within subject strings that are returned by the |
| matching functions are also in 16-bit units rather than bytes. |
. |
. |
. |
. |
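Because lengths and offsets are counted in 16-bit units, a caller that thinks in bytes has to convert. The sketch below illustrates the difference; my_uchar16, my_strlen16 and my_units_to_bytes are hypothetical helper names, not part of the PCRE API.

```c
#include <stddef.h>

/* Hypothetical stand-in for PCRE_UCHAR16. */
typedef unsigned short int my_uchar16;

/* Length of a zero-terminated 16-bit string, counted in 16-bit units.
   A character encoded as a surrogate pair contributes two units. */
static size_t my_strlen16(const my_uchar16 *s)
{
  size_t n = 0;
  while (s[n] != 0) n++;
  return n;
}

/* Convert an offset expressed in 16-bit units into a byte offset. */
static size_t my_units_to_bytes(size_t units)
{
  return units * sizeof(my_uchar16);
}
```

For a subject holding "a", one surrogate pair and "b", the length is 4 units, and an offset of 3 units (the position of "b") corresponds to byte offset 3 * sizeof(my_uchar16).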
.SH "NAMED SUBPATTERNS" |
.SH "NAMED SUBPATTERNS" |
Line 264 units.
|
Line 265 units.
|
.sp |
.sp |
There are two new general option names, PCRE_UTF16 and PCRE_NO_UTF16_CHECK, |
There are two new general option names, PCRE_UTF16 and PCRE_NO_UTF16_CHECK, |
which correspond to PCRE_UTF8 and PCRE_NO_UTF8_CHECK in the 8-bit library. In |
which correspond to PCRE_UTF8 and PCRE_NO_UTF8_CHECK in the 8-bit library. In |
fact, these new options define the same bits in the options word. | fact, these new options define the same bits in the options word. There is a |
| discussion about the |
| .\" HTML <a href="pcreunicode.html#utf16strings"> |
| .\" </a> |
| validity of UTF-16 strings |
| .\" |
| in the |
| .\" HREF |
| \fBpcreunicode\fP |
| .\" |
| page. |
.P |
.P |
For the \fBpcre16_config()\fP function there is an option PCRE_CONFIG_UTF16 |
For the \fBpcre16_config()\fP function there is an option PCRE_CONFIG_UTF16 |
that returns 1 if UTF-16 support is configured, otherwise 0. If this option is |
that returns 1 if UTF-16 support is configured, otherwise 0. If this option is |
given to \fBpcre_config()\fP, or if the PCRE_CONFIG_UTF8 option is given to | given to \fBpcre_config()\fP or \fBpcre32_config()\fP, or if the |
\fBpcre16_config()\fP, the result is the PCRE_ERROR_BADOPTION error. | PCRE_CONFIG_UTF8 or PCRE_CONFIG_UTF32 option is given to \fBpcre16_config()\fP, |
| the result is the PCRE_ERROR_BADOPTION error. |
. |
. |
. |
. |
.SH "CHARACTER CODES" |
.SH "CHARACTER CODES" |
Line 318 page. The UTF-16 errors are:
|
Line 330 page. The UTF-16 errors are:
|
PCRE_UTF16_ERR1 Missing low surrogate at end of string |
PCRE_UTF16_ERR1 Missing low surrogate at end of string |
PCRE_UTF16_ERR2 Invalid low surrogate follows high surrogate |
PCRE_UTF16_ERR2 Invalid low surrogate follows high surrogate |
PCRE_UTF16_ERR3 Isolated low surrogate |
PCRE_UTF16_ERR3 Isolated low surrogate |
PCRE_UTF16_ERR4 Invalid character 0xfffe | PCRE_UTF16_ERR4 Non-character |
. |
. |
. |
. |
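The four UTF-16 errors listed above can be illustrated with a small stand-alone validity check. This is a sketch only, not PCRE's own validation code: the names my_uchar16, my_utf16_err, my_is_nonchar and my_valid_utf16 are hypothetical, and the use of Unicode's general definition of noncharacters (U+FDD0 to U+FDEF, and code points whose low 16 bits are FFFE or FFFF) for the fourth error is an assumption, since the text above does not say which code points PCRE rejects.

```c
#include <stddef.h>

/* Hypothetical stand-in for PCRE_UCHAR16. */
typedef unsigned short int my_uchar16;

/* Hypothetical codes that mirror the four errors in the list above. */
enum my_utf16_err { MY_OK = 0, MY_ERR1, MY_ERR2, MY_ERR3, MY_ERR4 };

/* Assumption: Unicode's definition of a noncharacter. */
static int my_is_nonchar(unsigned long c)
{
  return (c >= 0xFDD0 && c <= 0xFDEF) || (c & 0xFFFE) == 0xFFFE;
}

/* Check len 16-bit units; on error, *erroffset is the offset (in units)
   of the character that failed. It is left untouched on success. */
static enum my_utf16_err my_valid_utf16(const my_uchar16 *s, size_t len,
                                        size_t *erroffset)
{
  size_t i = 0;
  while (i < len) {
    size_t start = i;
    unsigned long c = s[i++] & 0xFFFFu;
    enum my_utf16_err err = MY_OK;

    if (c >= 0xDC00 && c <= 0xDFFF) {
      err = MY_ERR3;                      /* isolated low surrogate */
    } else if (c >= 0xD800 && c <= 0xDBFF) {
      if (i >= len) {
        err = MY_ERR1;                    /* missing low surrogate at end */
      } else {
        unsigned long d = s[i] & 0xFFFFu;
        if (d < 0xDC00 || d > 0xDFFF) {
          err = MY_ERR2;                  /* invalid low surrogate */
        } else {
          c = 0x10000 + ((c - 0xD800) << 10) + (d - 0xDC00);
          i++;                            /* consume the low surrogate */
        }
      }
    }
    if (err == MY_OK && my_is_nonchar(c)) err = MY_ERR4;  /* non-character */
    if (err != MY_OK) {
      *erroffset = start;
      return err;
    }
  }
  return MY_OK;
}
```

The function reports only the first problem it meets, scanning from the start of the string.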
.SH "ERROR TEXTS" |
.SH "ERROR TEXTS" |
Line 344 files, but it can be used for testing the 16-bit libra
|
Line 356 files, but it can be used for testing the 16-bit libra
|
command line option \fB-16\fP, patterns and subject strings are converted from |
command line option \fB-16\fP, patterns and subject strings are converted from |
8-bit to 16-bit before being passed to PCRE, and the 16-bit library functions |
8-bit to 16-bit before being passed to PCRE, and the 16-bit library functions |
are used instead of the 8-bit ones. Returned 16-bit strings are converted to |
are used instead of the 8-bit ones. Returned 16-bit strings are converted to |
8-bit for output. If the 8-bit library was not compiled, \fBpcretest\fP | 8-bit for output. If neither the 8-bit nor the 32-bit library was compiled, |
defaults to 16-bit and the \fB-16\fP option is ignored. | \fBpcretest\fP defaults to 16-bit and the \fB-16\fP option is ignored. |
.P |
.P |
When PCRE is being built, the \fBRunTest\fP script that is called by "make |
When PCRE is being built, the \fBRunTest\fP script that is called by "make |
check" uses the \fBpcretest\fP \fB-C\fP option to discover which of the 8-bit | check" uses the \fBpcretest\fP \fB-C\fP option to discover which of the 8-bit, |
and 16-bit libraries has been built, and runs the tests appropriately. | 16-bit and 32-bit libraries has been built, and runs the tests appropriately. |
. |
. |
. |
. |
.SH "NOT SUPPORTED IN 16-BIT MODE" |
.SH "NOT SUPPORTED IN 16-BIT MODE" |
Line 374 Cambridge CB2 3QH, England.
|
Line 386 Cambridge CB2 3QH, England.
|
.rs |
.rs |
.sp |
.sp |
.nf |
.nf |
Last updated: 08 January 2012 | Last updated: 12 May 2013 |
Copyright (c) 1997-2012 University of Cambridge. | Copyright (c) 1997-2013 University of Cambridge. |
.fi |
.fi |