--- embedaddon/pcre/doc/html/pcresyntax.html 2013/07/22 08:25:57 1.1.1.4
+++ embedaddon/pcre/doc/html/pcresyntax.html 2014/06/15 19:46:05 1.1.1.5
@@ -65,10 +65,14 @@ documentation. This document contains a quick-referenc
   \n         newline (hex 0A)
   \r         carriage return (hex 0D)
   \t         tab (hex 09)
+  \0dd       character with octal code 0dd
   \ddd       character with octal code ddd, or backreference
+  \o{ddd..}  character with octal code ddd..
   \xhh       character with hex code hh
   \x{hhh..}  character with hex code hhh..
-
+
+Note that \0dd is always an octal code, and that \8 and \9 are the literal
+characters "8" and "9".


CHARACTER TYPES

@@ -92,9 +96,11 @@ documentation. This document contains a quick-referenc
   \W         a "non-word" character
   \X         a Unicode extended grapheme cluster
-In PCRE, by default, \d, \D, \s, \S, \w, and \W recognize only ASCII
-characters, even in a UTF mode. However, this can be changed by setting the
-PCRE_UCP option.
+By default, \d, \s, and \w match only ASCII characters, even in UTF-8 mode
+or in the 16- bit and 32-bit libraries. However, if locale-specific matching is
+happening, \s and \w may also match characters with code points in the range
+128-255. If the PCRE_UCP option is set, the behaviour of these escape sequences
+is changed to use Unicode properties and they match many more characters.


GENERAL CATEGORY PROPERTIES FOR \p and \P

@@ -150,11 +156,13 @@ PCRE_UCP option.

   Xan        Alphanumeric: union of properties L and N
   Xps        POSIX space: property Z or tab, NL, VT, FF, CR
-  Xsp        Perl space: property Z or tab, NL, FF, CR
+  Xsp        Perl space: property Z or tab, NL, VT, FF, CR
   Xuc        Univerally-named character: one that can be
                represented by a Universal Character Name
   Xwd        Perl word: property Xan or underscore
-
+
+Perl and POSIX space are now the same. Perl added VT to its space character set
+at release 5.18 and PCRE changed at release 8.34.


SCRIPT NAMES FOR \p AND \P

@@ -385,7 +393,9 @@ newline-setting options with similar syntax:
   (*UTF32)        set UTF-32 mode: 32-bit library (PCRE_UTF32)
   (*UTF)          set appropriate UTF mode for the library in use
   (*UCP)          set PCRE_UCP (use Unicode properties for \d etc)
-
+
+Note that LIMIT_MATCH and LIMIT_RECURSION can only reduce the value of the
+limits set by the caller of pcre_exec(), not increase them.


LOOKAHEAD AND LOOKBEHIND ASSERTIONS

@@ -516,7 +526,7 @@ Cambridge CB2 3QH, England.


REVISION

-Last updated: 26 April 2013
+Last updated: 12 November 2013
Copyright © 1997-2013 University of Cambridge.
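
Reviewer note on the hunk at -385,7: the added sentence says that LIMIT_MATCH and LIMIT_RECURSION inside a pattern can only lower the limits set by the caller of pcre_exec(), never raise them. A minimal Python sketch of that rule follows. It is a model of the documented behaviour only, not PCRE's code; the function name `effective_limit` and its parameters are hypothetical and do not exist in the PCRE API.

```python
from typing import Optional


def effective_limit(caller_limit: int, pattern_limit: Optional[int]) -> int:
    """Model the rule stated in the patch: a limit requested inside the
    pattern may reduce the caller's limit but can never increase it.

    caller_limit  -- limit supplied by the caller of pcre_exec() (assumed int)
    pattern_limit -- value requested by the pattern, or None if the pattern
                     does not set one (hypothetical representation)
    """
    if pattern_limit is None:
        # The pattern sets no limit, so the caller's value stands.
        return caller_limit
    # Taking the smaller value means the pattern can only tighten the limit.
    return min(caller_limit, pattern_limit)
```

Under this model, a pattern asking for a limit of 5000 against a caller limit of 1000 still runs with 1000, while the same request against a caller limit of 10000000 runs with 5000.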