Diff for /embedaddon/pcre/doc/pcretest.txt between versions 1.1.1.2 and 1.1.1.3

version 1.1.1.2, 2012/02/21 23:50:25 version 1.1.1.3, 2012/10/09 09:19:17
Line 111  COMMAND LINE OPTIONS Line 111  COMMAND LINE OPTIONS
                  size megabytes.                   size megabytes.
   
        -s or -s+ Behave as if each pattern  has  the  /S  modifier;  in  other         -s or -s+ Behave as if each pattern  has  the  /S  modifier;  in  other
                 words,  force each pattern to be studied. If -s+ is used, the                 words,  force each pattern to be studied. If -s+ is used, all
                 PCRE_STUDY_JIT_COMPILE flag is  passed  to  pcre[16]_study(),                 the JIT compile options are passed to pcre[16]_study(), caus-
                 causing  just-in-time  optimization  to  be  set  up if it is                 ing  just-in-time  optimization  to be set up if it is avail-
                 available. If the /I or /D option is  present  on  a  pattern                 able, for both full and partial matching. Specific  JIT  com-
                 (requesting  output  about the compiled pattern), information                 pile options can be selected by following -s+ with a digit in
                 about the result of studying is not included when studying is                 the range 1 to 7, which selects the JIT compile modes as fol-
                 caused  only  by  -s  and neither -i nor -d is present on the                 lows:
                 command line. This behaviour means that the output from tests 
                 that  are run with and without -s should be identical, except 
                 when options that output information about the actual running 
                 of a match are set. 
   
                 The  -M,  -t,  and  -tm options, which give information about                   1  normal match only
                 resources used, are likely to produce different  output  with                   2  soft partial match only
                 and  without  -s.  Output may also differ if the /C option is                   3  normal match and soft partial match
                 present on an individual pattern. This uses callouts to trace                   4  hard partial match only
                 the  matching  process,  and  this  may  be  different between                   6  soft and hard partial match
                 studied and non-studied patterns.  If  the  pattern  contains                   7  all three modes (default)
                 (*MARK)  items  there  may  also be differences, for the same 
                 reason. The -s command line option can be overridden for spe- 
                 cific  patterns that should never be studied (see the /S pat- 
                 tern modifier below). 
   
       -t        Run each compile, study, and match many times with  a  timer,                 If  -s++  is used instead of -s+ (with or without a following
                 and  output resulting time per compile or match (in millisec-                 digit), the text "(JIT)" is added to the  first  output  line
                 onds). Do not set -m with -t, because you will then  get  the                 after a match or no match when JIT-compiled code was actually
                 size  output  a  zillion  times,  and the timing will be dis-                 used.
                 torted. You can control the number  of  iterations  that  are
                 used  for timing by following -t with a number (as a separate       If the /I or /D option is present on a pattern (requesting output about
        the  compiled pattern), information about the result of studying is not
        included when studying is caused only by -s and neither -i  nor  -d  is
        present  on the command line. This behaviour means that the output from
        tests that are run with and without -s should be identical, except when
        options that output information about the actual running of a match are
        set.
 
        The -M, -t, and -tm options, which  give  information  about  resources
        used,  are likely to produce different output with and without -s. Out-
        put may also differ if the /C option is present on an  individual  pat-
        tern.  This  uses  callouts  to  trace  the  matching process, and this
        may be different between studied and non-studied patterns. If the  pat-
        tern contains (*MARK) items there may also be differences, for the same
        reason. The -s command line option can be overridden for specific  pat-
        terns that should never be studied (see the /S pattern modifier below).
 
        -t        Run  each  compile, study, and match many times with a timer,
                  and output resulting time per compile or match (in  millisec-
                  onds).  Do  not set -m with -t, because you will then get the
                  size output a zillion times, and  the  timing  will  be  dis-
                  torted.  You  can  control  the number of iterations that are
                  used for timing by following -t with a number (as a  separate
                  item on the command line). For example, "-t 1000" would iter-                   item on the command line). For example, "-t 1000" would iter-
                  ate 1000 times. The default is to iterate 500000 times.                   ate 1000 times. The default is to iterate 500000 times.
   
Line 149  COMMAND LINE OPTIONS Line 163  COMMAND LINE OPTIONS
   
 DESCRIPTION  DESCRIPTION
   
       If pcretest is given two filename arguments, it reads  from  the  first       If  pcretest  is  given two filename arguments, it reads from the first
        and writes to the second. If it is given only one filename argument, it         and writes to the second. If it is given only one filename argument, it
       reads from that file and writes to stdout.  Otherwise,  it  reads  from       reads  from  that  file  and writes to stdout. Otherwise, it reads from
       stdin  and  writes to stdout, and prompts for each line of input, using       stdin and writes to stdout, and prompts for each line of  input,  using
        "re>" to prompt for regular expressions, and "data>" to prompt for data         "re>" to prompt for regular expressions, and "data>" to prompt for data
        lines.         lines.
   
       When  pcretest  is  built,  a  configuration option can specify that it       When pcretest is built, a configuration  option  can  specify  that  it
       should be linked with the libreadline library. When this  is  done,  if       should  be  linked  with the libreadline library. When this is done, if
        the input is from a terminal, it is read using the readline() function.         the input is from a terminal, it is read using the readline() function.
       This provides line-editing and history facilities. The output from  the       This  provides line-editing and history facilities. The output from the
        -help option states whether or not readline() will be used.         -help option states whether or not readline() will be used.
   
        The program handles any number of sets of input on a single input file.         The program handles any number of sets of input on a single input file.
       Each set starts with a regular expression, and continues with any  num-       Each  set starts with a regular expression, and continues with any num-
        ber of data lines to be matched against the pattern.         ber of data lines to be matched against the pattern.
   
       Each  data line is matched separately and independently. If you want to       Each data line is matched separately and independently. If you want  to
        do multi-line matches, you have to use the \n escape sequence (or \r or         do multi-line matches, you have to use the \n escape sequence (or \r or
        \r\n, etc., depending on the newline setting) in a single line of input         \r\n, etc., depending on the newline setting) in a single line of input
       to encode the newline sequences. There is no limit  on  the  length  of       to  encode  the  newline  sequences. There is no limit on the length of
       data  lines;  the  input  buffer is automatically extended if it is too       data lines; the input buffer is automatically extended  if  it  is  too
        small.         small.
   
       An empty line signals the end of the data lines, at which point  a  new       An  empty  line signals the end of the data lines, at which point a new
       regular  expression is read. The regular expressions are given enclosed       regular expression is read. The regular expressions are given  enclosed
        in any non-alphanumeric delimiters other than backslash, for example:         in any non-alphanumeric delimiters other than backslash, for example:
   
          /(a|bc)x+yz/           /(a|bc)x+yz/
   
       White space before the initial delimiter is ignored. A regular  expres-       White  space before the initial delimiter is ignored. A regular expres-
       sion  may be continued over several input lines, in which case the new-       sion may be continued over several input lines, in which case the  new-
       line characters are included within it. It is possible to  include  the       line  characters  are included within it. It is possible to include the
        delimiter within the pattern by escaping it, for example         delimiter within the pattern by escaping it, for example
   
          /abc\/def/           /abc\/def/
   
       If  you  do  so, the escape and the delimiter form part of the pattern,       If you do so, the escape and the delimiter form part  of  the  pattern,
       but since delimiters are always non-alphanumeric, this does not  affect       but  since delimiters are always non-alphanumeric, this does not affect
       its  interpretation.   If the terminating delimiter is immediately fol-       its interpretation.  If the terminating delimiter is  immediately  fol-
        lowed by a backslash, for example,         lowed by a backslash, for example,
   
          /abc/\           /abc/\
   
       then a backslash is added to the end of the pattern. This  is  done  to       then  a  backslash  is added to the end of the pattern. This is done to
       provide  a  way of testing the error condition that arises if a pattern       provide a way of testing the error condition that arises if  a  pattern
        finishes with a backslash, because         finishes with a backslash, because
   
          /abc\/           /abc\/
   
       is interpreted as the first line of a pattern that starts with  "abc/",       is  interpreted as the first line of a pattern that starts with "abc/",
        causing pcretest to read the next line as a continuation of the regular         causing pcretest to read the next line as a continuation of the regular
        expression.         expression.
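
The delimiter rules described above can be sketched in a few lines of Python. This is an illustration only, not pcretest's actual parser: the function name split_pattern_line is hypothetical, and the sketch assumes that a backslash always protects the character after it while scanning for the closing delimiter.

```python
def split_pattern_line(line):
    """Split one pcretest-style input line into (pattern, modifiers).

    Returns None when no terminating delimiter is found, meaning the
    pattern would continue on the next input line.
    Hypothetical helper: a sketch of the documented rules only.
    """
    # White space before the initial delimiter is ignored.
    text = line.rstrip("\n").lstrip()
    if not text:
        raise ValueError("empty line: no pattern present")
    delim = text[0]
    # Delimiters are non-alphanumeric and may not be a backslash.
    if delim.isalnum() or delim == "\\":
        raise ValueError("invalid delimiter: %r" % delim)

    i = 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            i += 2          # assumption: the escaped character is skipped
            continue
        if ch == delim:
            break           # found the terminating delimiter
        i += 1
    else:
        return None         # unterminated: pattern continues on next line

    pattern = text[1:i]     # escapes stay part of the pattern
    rest = text[i + 1:]
    # A backslash immediately after the terminating delimiter is
    # appended to the pattern (used to test the trailing-backslash error).
    if rest.startswith("\\"):
        pattern += "\\"
        rest = rest[1:]
    return pattern, rest.strip()
```

With this sketch, "/abc\/def/i" yields the pattern abc\/def with modifier i, "/abc/\" yields the pattern abc\ with no modifiers, and "/abc\/" returns None because the escaped slash does not terminate the pattern.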
   
   
 PATTERN MODIFIERS  PATTERN MODIFIERS
   
       A pattern may be followed by any number of modifiers, which are  mostly       A  pattern may be followed by any number of modifiers, which are mostly
       single  characters.  Following  Perl usage, these are referred to below       single characters. Following Perl usage, these are  referred  to  below
       as, for example, "the /i modifier", even though the  delimiter  of  the       as,  for  example,  "the /i modifier", even though the delimiter of the
       pattern  need  not always be a slash, and no slash is used when writing       pattern need not always be a slash, and no slash is used  when  writing
       modifiers. White space may appear between the final  pattern  delimiter       modifiers.  White  space may appear between the final pattern delimiter
        and the first modifier, and between the modifiers themselves.         and the first modifier, and between the modifiers themselves.
   
        The /i, /m, /s, and /x modifiers set the PCRE_CASELESS, PCRE_MULTILINE,         The /i, /m, /s, and /x modifiers set the PCRE_CASELESS, PCRE_MULTILINE,
        PCRE_DOTALL, or PCRE_EXTENDED options, respectively, when pcre[16]_com-         PCRE_DOTALL, or PCRE_EXTENDED options, respectively, when pcre[16]_com-
       pile()  is  called. These four modifier letters have the same effect as       pile() is called. These four modifier letters have the same  effect  as
        they do in Perl. For example:         they do in Perl. For example:
   
          /caseless/i           /caseless/i
   
       The following table shows additional modifiers for  setting  PCRE  com-       The  following  table  shows additional modifiers for setting PCRE com-
        pile-time options that do not correspond to anything in Perl:         pile-time options that do not correspond to anything in Perl:
   
          /8              PCRE_UTF8           ) when using the 8-bit           /8              PCRE_UTF8           ) when using the 8-bit
Line 248  PATTERN MODIFIERS Line 262  PATTERN MODIFIERS
          /<bsr_anycrlf>  PCRE_BSR_ANYCRLF           /<bsr_anycrlf>  PCRE_BSR_ANYCRLF
          /<bsr_unicode>  PCRE_BSR_UNICODE           /<bsr_unicode>  PCRE_BSR_UNICODE
   
       The  modifiers  that are enclosed in angle brackets are literal strings       The modifiers that are enclosed in angle brackets are  literal  strings
       as shown, including the angle brackets, but the letters within  can  be       as  shown,  including the angle brackets, but the letters within can be
       in  either case.  This example sets multiline matching with CRLF as the       in either case.  This example sets multiline matching with CRLF as  the
        line ending sequence:         line ending sequence:
   
          /^abc/m<CRLF>           /^abc/m<CRLF>
   
       As well as turning on the PCRE_UTF8/16 option, the /8  modifier  causes       As  well  as turning on the PCRE_UTF8/16 option, the /8 modifier causes
       all  non-printing  characters in output strings to be printed using the       all non-printing characters in output strings to be printed  using  the
       \x{hh...} notation. Otherwise, those less than 0x100 are output in  hex       \x{hh...}  notation. Otherwise, those less than 0x100 are output in hex
        without the curly brackets.         without the curly brackets.
   
       Full  details  of  the PCRE options are given in the pcreapi documenta-       Full details of the PCRE options are given in  the  pcreapi  documenta-
        tion.         tion.
   
    Finding all matches in a string     Finding all matches in a string
   
       Searching for all possible matches within each subject  string  can  be       Searching  for  all  possible matches within each subject string can be
       requested  by  the  /g  or  /G modifier. After finding a match, PCRE is       requested by the /g or /G modifier. After  finding  a  match,  PCRE  is
        called again to search the remainder of the subject string. The differ-         called again to search the remainder of the subject string. The differ-
        ence between /g and /G is that the former uses the startoffset argument         ence between /g and /G is that the former uses the startoffset argument
       to pcre[16]_exec() to start searching at a new point within the  entire       to  pcre[16]_exec() to start searching at a new point within the entire
       string  (which  is in effect what Perl does), whereas the latter passes       string (which is in effect what Perl does), whereas the  latter  passes
       over a shortened substring. This makes a  difference  to  the  matching       over  a  shortened  substring.  This makes a difference to the matching
        process if the pattern begins with a lookbehind assertion (including \b         process if the pattern begins with a lookbehind assertion (including \b
        or \B).         or \B).
   
       If any call to pcre[16]_exec() in a /g or /G sequence matches an  empty       If  any call to pcre[16]_exec() in a /g or /G sequence matches an empty
       string,  the  next  call  is  done  with  the PCRE_NOTEMPTY_ATSTART and       string, the next  call  is  done  with  the  PCRE_NOTEMPTY_ATSTART  and
       PCRE_ANCHORED flags set in order  to  search  for  another,  non-empty,       PCRE_ANCHORED  flags  set  in  order  to search for another, non-empty,
       match  at  the same point. If this second match fails, the start offset       match at the same point. If this second match fails, the  start  offset
       is advanced, and the normal match is retried.  This  imitates  the  way       is  advanced,  and  the  normal match is retried. This imitates the way
        Perl handles such cases when using the /g modifier or the split() func-         Perl handles such cases when using the /g modifier or the split() func-
       tion. Normally, the start offset is advanced by one character,  but  if       tion.  Normally,  the start offset is advanced by one character, but if
       the  newline  convention  recognizes CRLF as a newline, and the current       the newline convention recognizes CRLF as a newline,  and  the  current
        character is CR followed by LF, an advance of two is used.         character is CR followed by LF, an advance of two is used.
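
The offset-advance rule in the paragraph above can be sketched as follows. This is a minimal illustration, not code from pcretest: the name next_start_offset and its parameters are hypothetical, and the subject is assumed to be a Python string indexed by character.

```python
def next_start_offset(subject, offset, crlf_is_newline):
    """Return the start offset for the retried normal match.

    Used after an empty match at `offset` whose anchored, non-empty
    retry at the same point has failed. Hypothetical helper.
    """
    # If the newline convention recognizes CRLF and the current
    # character is CR followed by LF, advance by two.
    if crlf_is_newline and subject[offset:offset + 2] == "\r\n":
        return offset + 2
    # Otherwise advance by one character.
    return offset + 1
```

For the subject "a\r\nb" with an empty match at offset 1, the sketch returns 3 when CRLF is a recognized newline and 2 otherwise.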
   
    Other modifiers     Other modifiers
   
        There are yet more modifiers for controlling the way pcretest operates.         There are yet more modifiers for controlling the way pcretest operates.
   
       The /+ modifier requests that as well as outputting the substring  that       The  /+ modifier requests that as well as outputting the substring that
       matched  the  entire  pattern,  pcretest  should in addition output the       matched the entire pattern, pcretest  should  in  addition  output  the
       remainder of the subject string. This is useful  for  tests  where  the       remainder  of  the  subject  string. This is useful for tests where the
       subject  contains multiple copies of the same substring. If the + modi-       subject contains multiple copies of the same substring. If the +  modi-
       fier appears twice, the same action is taken for  captured  substrings.       fier  appears  twice, the same action is taken for captured substrings.
       In  each case the remainder is output on the following line with a plus       In each case the remainder is output on the following line with a  plus
       character following the capture number. Note that  this  modifier  must       character  following  the  capture number. Note that this modifier must
       not immediately follow the /S modifier because /S+ has another meaning.       not immediately follow the /S modifier because /S+ and /S++ have  other
        meanings.
   
        The  /=  modifier  requests  that  the values of all potential captured         The  /=  modifier  requests  that  the values of all potential captured
        parentheses be output after a match. By default, only those up  to  the         parentheses be output after a match. By default, only those up  to  the
Line 368  PATTERN MODIFIERS Line 383  PATTERN MODIFIERS
        different when the pattern is studied.         different when the pattern is studied.
   
        If the /S modifier is immediately followed by a + character,  the  call         If the /S modifier is immediately followed by a + character,  the  call
       to  pcre[16]_study()  is  made  with the PCRE_STUDY_JIT_COMPILE option,       to  pcre[16]_study() is made with all the JIT study options, requesting
       requesting just-in-time optimization support if it is  available.  Note       just-in-time optimization support if it is available, for  both  normal
       that  there  is  also  a  /+ modifier; it must not be given immediately       and  partial matching. If you want to restrict the JIT compiling modes,
       after /S because this will be misinterpreted. If JIT studying  is  suc-       you can follow /S+ with a digit in the range 1 to 7:
       cessful,  it  will  automatically  be used when pcre[16]_exec() is run, 
       except when incompatible run-time options are specified. These  include 
       the  partial  matching options; a complete list is given in the pcrejit 
       documentation. See also the \J escape sequence below for a way of  set- 
       ting the size of the JIT stack. 
   
            1  normal match only
            2  soft partial match only
            3  normal match and soft partial match
            4  hard partial match only
            6  soft and hard partial match
            7  all three modes (default)
   
          If /S++ is used instead of /S+ (with or without a following digit), the
          text  "(JIT)"  is  added  to  the first output line after a match or no
          match when JIT-compiled code was actually used.
   
          Note that there is also an independent /+  modifier;  it  must  not  be
          given immediately after /S or /S+ because this will be misinterpreted.
   
          If JIT studying is successful, the compiled JIT code will automatically
          be used when pcre[16]_exec() is run, except when incompatible  run-time
          options are specified. For more details, see the pcrejit documentation.
          See also the \J escape sequence below for a way of setting the size  of
          the JIT stack.
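
The digit that may follow /S+ (or -s+) reads naturally as a bit mask, with 1, 2, and 4 selecting normal, soft partial, and hard partial matching. The Python sketch below decodes it on that assumption; the names JIT_MODES and jit_modes are hypothetical, and the value 5, which the table does not list, is decoded the same way purely as an inference.

```python
# Assumed bit values, inferred from the table: 1, 2 and 4 name single modes.
JIT_MODES = {
    1: "normal match",
    2: "soft partial match",
    4: "hard partial match",
}

def jit_modes(digit=7):
    """Return the JIT compile modes selected by a digit from 1 to 7.

    Hypothetical helper; the default of 7 selects all three modes.
    """
    if not 1 <= digit <= 7:
        raise ValueError("digit must be in the range 1 to 7")
    # Keep every mode whose bit is set in the digit.
    return [name for bit, name in JIT_MODES.items() if digit & bit]
```

Under this reading, jit_modes(3) gives the normal and soft partial modes, and jit_modes(6) gives the soft and hard partial modes, matching the corresponding table rows.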
   
        The  /T  modifier  must be followed by a single digit. It causes a spe-         The  /T  modifier  must be followed by a single digit. It causes a spe-
        cific set of built-in character tables to be  passed  to  pcre[16]_com-         cific set of built-in character tables to be  passed  to  pcre[16]_com-
        pile().  It  is used in the standard PCRE tests to check behaviour with         pile().  It  is used in the standard PCRE tests to check behaviour with
Line 869  AUTHOR Line 899  AUTHOR
   
 REVISION  REVISION
   
       Last updated: 14 January 2012       Last updated: 21 February 2012
        Copyright (c) 1997-2012 University of Cambridge.         Copyright (c) 1997-2012 University of Cambridge.


