embedaddon/pcre/doc/pcregrep.txt - annotate

Annotation of embedaddon/pcre/doc/pcregrep.txt, revision 1.1.1.2

1.1       misho       1: PCREGREP(1)                                                        PCREGREP(1)
                      2: 
                      3: 
                      4: NAME
                      5:        pcregrep - a grep with Perl-compatible regular expressions.
                      6: 
                      7: 
                      8: SYNOPSIS
                      9:        pcregrep [options] [long options] [pattern] [path1 path2 ...]
                     10: 
                     11: 
                     12: DESCRIPTION
                     13: 
                     14:        pcregrep  searches  files  for  character  patterns, in the same way as
                     15:        other grep commands do, but it uses the PCRE regular expression library
                     16:        to support patterns that are compatible with the regular expressions of
                     17:        Perl 5. See pcrepattern(3) for a full description of syntax and  seman-
                     18:        tics of the regular expressions that PCRE supports.
                     19: 
                     20:        Patterns,  whether  supplied on the command line or in a separate file,
                     21:        are given without delimiters. For example:
                     22: 
                     23:          pcregrep Thursday /etc/motd
                     24: 
                     25:        If you attempt to use delimiters (for example, by surrounding a pattern
                     26:        with  slashes,  as  is common in Perl scripts), they are interpreted as
                     27:        part of the pattern. Quotes can of course be used to  delimit  patterns
                     28:        on  the  command  line  because  they are interpreted by the shell, and
                     29:        indeed they are required if a pattern contains  white  space  or  shell
                     30:        metacharacters.
                     31: 
                     32:        The  first  argument that follows any option settings is treated as the
                     33:        single pattern to be matched when neither -e nor -f is  present.   Con-
                     34:        versely,  when  one  or  both of these options are used to specify pat-
                     35:        terns, all arguments are treated as path names. At least one of -e, -f,
                     36:        or an argument pattern must be provided.
                     37: 
                     38:        If no files are specified, pcregrep reads the standard input. The stan-
                     39:        dard input can also be referenced by a  name  consisting  of  a  single
                     40:        hyphen.  For example:
                     41: 
                     42:          pcregrep some-pattern /file1 - /file3
                     43: 
                     44:        By  default, each line that matches a pattern is copied to the standard
                     45:        output, and if there is more than one file, the file name is output  at
                     46:        the start of each line, followed by a colon. However, there are options
                     47:        that can change how pcregrep behaves.  In  particular,  the  -M  option
                     48:        makes  it  possible  to  search for patterns that span line boundaries.
                     49:        What defines a line  boundary  is  controlled  by  the  -N  (--newline)
                     50:        option.
                     51: 
                     52:        The amount of memory used for buffering files that are being scanned is
                     53:        controlled by a parameter that can be set by the --buffer-size  option.
                     54:        The  default  value  for  this  parameter is specified when pcregrep is
                     55:        built, with the built-in default being 20K.  A  block  of  memory  three
                     56:        times  this  size  is used (to allow for buffering "before" and "after"
                     57:        lines). An error occurs if a line overflows the buffer.
                     58: 
                     59:        Patterns are limited to 8K or BUFSIZ bytes, whichever is  the  greater.
                     60:        BUFSIZ  is  defined  in  <stdio.h>. When there is more than one pattern
                     61:        (specified by the use of -e and/or -f), each pattern is applied to each
                     62:        line  in  the  order  in which they are defined, except that all the -e
                     63:        patterns are tried before the -f patterns.
                     64: 
                     65:        By default, as soon as one pattern matches (or fails to match  when  -v
                     66:        is  used), no further patterns are considered. However, if --colour (or
                     67:        --color) is used to colour the matching substrings, or if --only-match-
                     68:        ing,  --file-offsets, or --line-offsets is used to output only the part
                     69:        of the line that matched (either shown literally,  or  as  an  offset),
                     70:        scanning  resumes  immediately  following  the  match,  so that further
                     71:        matches on the same line can be found. If there are multiple  patterns,
                     72:        they are all tried on the remainder of the line, but patterns that fol-
                     73:        low the one that matched are not tried on the earlier part of the line.
                     74: 
                     75:        This is the same behaviour as GNU grep, but it does mean that the order
                     76:        in which multiple patterns are specified can affect the output when one
                     77:        of the above options is used.
                     78: 
                     79:        Patterns that can match an empty string are accepted, but empty  string
                     80:        matches   are   never   recognized.   An   example   is   the   pattern
                     81:        "(super)?(man)?", in which all components are  optional.  This  pattern
                     82:        finds  all  occurrences  of  both "super" and "man"; the output differs
                     83:        from matching with "super|man" when only the  matching  substrings  are
                     84:        being shown.
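
The difference between the two patterns can be sketched in Python. This is only an illustration: it uses Python's re module rather than the PCRE library (the two agree for these simple patterns), and the helper name nonempty_matches is hypothetical.

```python
import re

def nonempty_matches(pattern, line):
    """Return the matched substrings, left to right, ignoring empty matches."""
    return [m.group(0) for m in re.finditer(pattern, line) if m.group(0)]

line = "superman and man"

# All components optional: "superman" is found as one match.
optional = nonempty_matches(r"(super)?(man)?", line)

# Plain alternation: "super" and "man" are found as separate matches.
alternation = nonempty_matches(r"super|man", line)

print(optional)
print(alternation)
```

Both patterns select the same lines; the lists differ only in how the matching substrings are split up.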
                     85: 
                     86:        If  the  LC_ALL  or LC_CTYPE environment variable is set, pcregrep uses
                     87:        the value to set a locale when calling the PCRE library.  The  --locale
                     88:        option can be used to override this.
                     89: 
                     90: 
                     91: SUPPORT FOR COMPRESSED FILES
                     92: 
                     93:        It  is  possible  to compile pcregrep so that it uses libz or libbz2 to
                     94:        read files whose names end in .gz or .bz2, respectively. You  can  find
                     95:        out whether your binary has support for one or both of these file types
                     96:        by running it with the --help option. If the appropriate support is not
                     97:        present,  files are treated as plain text. The standard input is always
                     98:        so treated.
                     99: 
                    100: 
1.1.1.2 ! misho     101: BINARY FILES
        !           102: 
        !           103:        By default, a file that contains a binary zero byte  within  the  first
        !           104:        1024  bytes is identified as a binary file, and is processed specially.
        !           105:        (GNU grep also  identifies  binary  files  in  this  manner.)  See  the
        !           106:        --binary-files  option for a means of changing the way binary files are
        !           107:        handled.
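
The detection rule described above can be sketched as follows. This is a minimal Python illustration of the stated rule, not pcregrep's own code; the function name looks_binary is hypothetical.

```python
def looks_binary(data: bytes, window: int = 1024) -> bool:
    """Return True if a zero byte occurs within the first `window` bytes."""
    return b"\x00" in data[:window]

print(looks_binary(b"plain text\n"))            # no zero byte at all
print(looks_binary(b"ELF\x00\x01\x02"))         # zero byte near the start
print(looks_binary(b"a" * 1024 + b"\x00"))      # zero byte just past the window
```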
        !           108: 
        !           109: 
1.1       misho     110: OPTIONS
                    111: 
                    112:        The order in which some of the options appear can  affect  the  output.
                    113:        For  example,  both  the  -h and -l options affect the printing of file
                    114:        names. Whichever comes later in the command line will be the  one  that
                    115:        takes  effect.  Numerical values for options may be followed by K or M,
                    116:        to signify multiplication by 1024 or 1024*1024 respectively.
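
As an illustration of the K and M suffixes, the following Python sketch converts an option value into a plain number. The function name parse_number is hypothetical, and only the upper-case suffixes named above are handled.

```python
def parse_number(text: str) -> int:
    """Convert an option value such as "20K" or "2M" into an integer."""
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = text[-1:]
    if suffix in multipliers:
        return int(text[:-1]) * multipliers[suffix]
    return int(text)

buffer_size = parse_number("20K")
print(buffer_size)        # the value given to --buffer-size, in bytes
print(3 * buffer_size)    # size of the memory block, three times the parameter
```

With the default parameter of 20K, the block of memory used for buffering is therefore 61440 bytes.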
                    117: 
                    118:        --        This terminates the list of options. It is useful if the next
                    119:                  item  on  the command line starts with a hyphen but is not an
                    120:                  option. This allows for the processing of patterns and  file-
                    121:                  names that start with hyphens.
                    122: 
                    123:        -A number, --after-context=number
                    124:                  Output  number  lines of context after each matching line. If
                    125:                  filenames and/or line numbers are being output, a hyphen sep-
                    126:                  arator  is  used  instead of a colon for the context lines. A
                    127:                  line containing "--" is output between each group  of  lines,
                    128:                  unless  they  are  in  fact contiguous in the input file. The
                    129:                  value of number is expected to be relatively small.  However,
                    130:                  pcregrep guarantees to have up to 8K of following text avail-
                    131:                  able for context output.
                    132: 
1.1.1.2 ! misho     133:        -a, --text
        !           134:                  Treat binary files as text. This is equivalent  to  --binary-
        !           135:                  files=text.
        !           136: 
1.1       misho     137:        -B number, --before-context=number
1.1.1.2 ! misho     138:                  Output  number lines of context before each matching line. If
1.1       misho     139:                  filenames and/or line numbers are being output, a hyphen sep-
1.1.1.2 ! misho     140:                  arator  is  used  instead of a colon for the context lines. A
        !           141:                  line containing "--" is output between each group  of  lines,
        !           142:                  unless  they  are  in  fact contiguous in the input file. The
        !           143:                  value of number is expected to be relatively small.  However,
1.1       misho     144:                  pcregrep guarantees to have up to 8K of preceding text avail-
                    145:                  able for context output.
                    146: 
1.1.1.2 ! misho     147:        --binary-files=word
        !           148:                  Specify how binary files are to be processed. If the word  is
        !           149:                  "binary"  (the  default),  pattern  matching  is performed on
        !           150:                  binary files, but the only  output  is  "Binary  file  <name>
        !           151:                  matches"  when a match succeeds. If the word is "text", which
        !           152:                  is equivalent to the -a or --text option,  binary  files  are
        !           153:                  processed  in  the  same way as any other file. In this case,
        !           154:                  when a match succeeds, the  output  may  be  binary  garbage,
        !           155:                  which  can  have  nasty effects if sent to a terminal. If the
        !           156:                  word is  "without-match",  which  is  equivalent  to  the  -I
        !           157:                  option,  binary  files  are  not  processed  at all; they are
        !           158:                  assumed not to be of interest.
        !           159: 
1.1       misho     160:        --buffer-size=number
1.1.1.2 ! misho     161:                  Set the parameter that controls how much memory is  used  for
1.1       misho     162:                  buffering files that are being scanned.
                    163: 
                    164:        -C number, --context=number
1.1.1.2 ! misho     165:                  Output  number  lines  of  context both before and after each
        !           166:                  matching line.  This is equivalent to setting both -A and  -B
1.1       misho     167:                  to the same value.
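
The layout of context output can be sketched in Python as follows. This is an illustration under stated assumptions, not pcregrep's implementation: line numbers are assumed to be output (as with -n), matching uses Python's re module, and the function name grep_with_context is hypothetical.

```python
import re

def grep_with_context(pattern, lines, before=0, after=0):
    """Return numbered output lines: ':' marks matches, '-' marks context."""
    hits = [i for i, line in enumerate(lines) if re.search(pattern, line)]
    hit_set = set(hits)
    out = []
    last = -1  # index of the last input line already output
    for i in hits:
        start = max(i - before, last + 1)
        end = min(i + after, len(lines) - 1)
        # Groups that are not contiguous in the input are separated by "--".
        if out and start > last + 1:
            out.append("--")
        for j in range(start, end + 1):
            sep = ":" if j in hit_set else "-"
            out.append(f"{j + 1}{sep}{lines[j]}")
        last = max(last, end)
    return out

lines = ["a", "hit 1", "b", "c", "d", "e", "hit 2", "f"]
result = grep_with_context("hit", lines, before=1, after=1)
print("\n".join(result))
```

Passing the same value for before and after corresponds to -C; setting only one of them corresponds to -B or -A.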
                    168: 
                    169:        -c, --count
1.1.1.2 ! misho     170:                  Do  not output individual lines from the files that are being
1.1       misho     171:                  scanned; instead output the number of lines that would other-
1.1.1.2 ! misho     172:                  wise  have  been  shown. If no lines are selected, the number
        !           173:                  zero is output. If several files  are  being  scanned,  a
        !           174:                  count  is  output  for each of them. However, if the --files-
        !           175:                  with-matches option is also  used,  only  those  files  whose
1.1       misho     176:                  counts are greater than zero are listed. When -c is used, the
                    177:                  -A, -B, and -C options are ignored.
                    178: 
                    179:        --colour, --color
                    180:                  If this option is given without any data, it is equivalent to
1.1.1.2 ! misho     181:                  "--colour=auto".   If  data  is required, it must be given in
1.1       misho     182:                  the same shell item, separated by an equals sign.
                    183: 
                    184:        --colour=value, --color=value
                    185:                  This option specifies under what circumstances the parts of a
                    186:                  line that matched a pattern should be coloured in the output.
1.1.1.2 ! misho     187:                  By default, the output is not coloured. The value  (which  is
        !           188:                  optional,  see above) may be "never", "always", or "auto". In
        !           189:                  the latter case, colouring happens only if the standard  out-
        !           190:                  put  is connected to a terminal. More resources are used when
        !           191:                  colouring is enabled, because pcregrep has to search for  all
        !           192:                  possible  matches in a line, not just one, in order to colour
1.1       misho     193:                  them all.
                    194: 
                    195:                  The colour that is used can be specified by setting the envi-
                    196:                  ronment variable PCREGREP_COLOUR or PCREGREP_COLOR. The value
                    197:                  of this variable should be a string of two numbers, separated
1.1.1.2 ! misho     198:                  by  a  semicolon.  They  are copied directly into the control
        !           199:                  string for setting colour  on  a  terminal,  so  it  is  your
        !           200:                  responsibility  to ensure that they make sense. If neither of
        !           201:                  the environment variables is  set,  the  default  is  "1;31",
1.1       misho     202:                  which gives red.
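
How the two numbers end up in a terminal control string can be sketched in Python. Several details here are assumptions rather than statements about pcregrep: that PCREGREP_COLOUR is consulted before PCREGREP_COLOR, that the string is wrapped as ESC[...m, and that ESC[0m is used to reset. The function name colourise is hypothetical.

```python
import os
import re

def colourise(pattern, line, environ=os.environ):
    """Wrap each non-empty match in a terminal colour control sequence."""
    # Assumed lookup order; "1;31" is the documented default (red).
    colour = (environ.get("PCREGREP_COLOUR")
              or environ.get("PCREGREP_COLOR")
              or "1;31")

    def wrap(match):
        text = match.group(0)
        # The numbers are copied into the sequence without any checking.
        return f"\x1b[{colour}m{text}\x1b[0m" if text else ""

    return re.sub(pattern, wrap, line)

default_red = colourise("cat", "a cat", environ={})
custom = colourise("cat", "a cat", environ={"PCREGREP_COLOR": "4;32"})
print(repr(default_red))
print(repr(custom))
```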
                    203: 
                    204:        -D action, --devices=action
1.1.1.2 ! misho     205:                  If  an  input  path  is  not  a  regular file or a directory,
        !           206:                  "action" specifies how it is to be  processed.  Valid  values
1.1       misho     207:                  are "read" (the default) or "skip" (silently skip the path).
                    208: 
                    209:        -d action, --directories=action
                    210:                  If an input path is a directory, "action" specifies how it is
1.1.1.2 ! misho     211:                  to be processed.  Valid  values  are  "read"  (the  default),
        !           212:                  "recurse"  (equivalent to the -r option), or "skip" (silently
        !           213:                  skip the path). In the default case, directories are read  as
        !           214:                  if  they  were  ordinary files. In some operating systems the
        !           215:                  effect of reading a directory like this is an immediate  end-
1.1       misho     216:                  of-file.
                    217: 
                    218:        -e pattern, --regex=pattern, --regexp=pattern
                    219:                  Specify a pattern to be matched. This option can be used mul-
                    220:                  tiple times in order to specify several patterns. It can also
1.1.1.2 ! misho     221:                  be  used  as a way of specifying a single pattern that starts
        !           222:                  with a hyphen. When -e is used, no argument pattern is  taken
        !           223:                  from  the  command  line;  all  arguments are treated as file
        !           224:                  names. There is an overall maximum of 100 patterns. They  are
        !           225:                  applied  to  each line in the order in which they are defined
1.1       misho     226:                  until one matches (or fails to match if -v is used). If -f is
1.1.1.2 ! misho     227:                  used  with  -e,  the command line patterns are matched first,
        !           228:                  followed by the patterns from the file,  independent  of  the
        !           229:                  order  in which these options are specified. Note that multi-
1.1       misho     230:                  ple use of -e is not the same as a single pattern with alter-
                    231:                  natives. For example, X|Y finds the first character in a line
1.1.1.2 ! misho     232:                  that is X or Y, whereas if the two patterns are  given  sepa-
1.1       misho     233:                  rately, pcregrep finds X if it is present, even if it follows
1.1.1.2 ! misho     234:                  Y in the line. It finds Y only if there is no X in the  line.
        !           235:                  This  really  matters  only  if  you are using -o to show the
1.1       misho     236:                  part(s) of the line that matched.
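
The X|Y example can be sketched in Python. This illustrates only the first match found on a line, uses Python's re module instead of PCRE, and both function names are hypothetical.

```python
import re

def first_match_separate(patterns, line):
    """Try each pattern in order; return the first one that matches."""
    for pattern in patterns:
        m = re.search(pattern, line)
        if m:
            return m.group(0)
    return None

def first_match_alternation(pattern, line):
    """Return the leftmost match of a single pattern with alternatives."""
    m = re.search(pattern, line)
    return m.group(0) if m else None

line = "a Y then an X"
separate = first_match_separate(["X", "Y"], line)    # like -e X -e Y
combined = first_match_alternation("X|Y", line)      # like a single X|Y
print(separate, combined)
```

With separate patterns, X is reported even though Y comes first in the line; the single pattern reports whichever of the two occurs first.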
                    237: 
                    238:        --exclude=pattern
                    239:                  When pcregrep is searching the files in a directory as a con-
1.1.1.2 ! misho     240:                  sequence  of  the  -r  (recursive search) option, any regular
1.1       misho     241:                  files whose names match the pattern are excluded. Subdirecto-
1.1.1.2 ! misho     242:                  ries  are  not  excluded  by  this  option; they are searched
        !           243:                  recursively, subject to the --exclude-dir  and  --include-dir
        !           244:                  options.  The  pattern  is  a PCRE regular expression, and is
1.1       misho     245:                  matched against the final component of the file name (not the
1.1.1.2 ! misho     246:                  entire  path).  If  a  file  name  matches both --include and
        !           247:                  --exclude, it is excluded.  There is no short form  for  this
1.1       misho     248:                  option.
                    249: 
                    250:        --exclude-dir=pattern
1.1.1.2 ! misho     251:                  When  pcregrep  is searching the contents of a directory as a
        !           252:                  consequence of the -r (recursive search) option,  any  subdi-
        !           253:                  rectories  whose  names match the pattern are excluded. (Note
        !           254:                  that the --exclude option does  not  affect  subdirectories.)
        !           255:                  The  pattern  is  a  PCRE  regular expression, and is matched
        !           256:                  against the final component  of  the  name  (not  the  entire
        !           257:                  path).  If a subdirectory name matches both --include-dir and
        !           258:                  --exclude-dir, it is excluded. There is  no  short  form  for
1.1       misho     259:                  this option.
                    260: 
                    261:        -F, --fixed-strings
1.1.1.2 ! misho     262:                  Interpret  each pattern as a list of fixed strings, separated
        !           263:                  by newlines, instead of  as  a  regular  expression.  The  -w
        !           264:                  (match  as  a  word) and -x (match whole line) options can be
1.1       misho     265:                  used with -F. They apply to each of the fixed strings. A line
                    266:                  is selected if any of the fixed strings are found in it (sub-
                    267:                  ject to -w or -x, if present).
                    268: 
                    269:        -f filename, --file=filename
1.1.1.2 ! misho     270:                  Read a number of patterns from the file, one  per  line,  and
        !           271:                  match  them against each line of input. A data line is output
1.1       misho     272:                  if any of the patterns match it. The filename can be given as
                    273:                  "-" to refer to the standard input. When -f is used, patterns
1.1.1.2 ! misho     274:                  specified on the command line using -e may also  be  present;
1.1       misho     275:                  they are tested before the file's patterns. However, no other
1.1.1.2 ! misho     276:                  pattern is taken from the command  line;  all  arguments  are
        !           277:                  treated  as  the  names  of paths to be searched. There is an
        !           278:                  overall maximum of 100  patterns.  Trailing  white  space  is
        !           279:                  removed from each line, and blank lines are ignored. An empty
        !           280:                  file contains no patterns and therefore matches nothing.  See
        !           281:                  also  the  comments  about  multiple patterns versus a single
        !           282:                  pattern with alternatives in the description of -e above.
        !           283: 
        !           284:        --file-list=filename
        !           285:                  Read a list of files to be searched from the given file,  one
        !           286:                  per line. Trailing white space is removed from each line, and
        !           287:                  blank lines are ignored. These files are searched before  any
        !           288:                  others  that  may be listed on the command line. The filename
        !           289:                  can be given as "-" to refer to the standard input. If --file
        !           290:                  and  --file-list are both specified as "-", patterns are read
        !           291:                  first. This is useful only when the standard input is a  ter-
        !           292:                  minal,  from  which  further lines (the list of files) can be
        !           293:                  read after an end-of-file indication.
1.1       misho     294: 
                    295:        --file-offsets
1.1.1.2 ! misho     296:                  Instead of showing lines or parts of lines that  match,  show
        !           297:                  each  match  as  an  offset  from the start of the file and a
        !           298:                  length, separated by a comma. In this  mode,  no  context  is
        !           299:                  shown.  That  is,  the -A, -B, and -C options are ignored. If
1.1       misho     300:                  there is more than one match in a line, each of them is shown
1.1.1.2 ! misho     301:                  separately.  This  option  is mutually exclusive with --line-
1.1       misho     302:                  offsets and --only-matching.
                    303: 
                    304:        -H, --with-filename
1.1.1.2 ! misho     305:                  Force the inclusion of the filename at the  start  of  output
        !           306:                  lines  when searching a single file. By default, the filename
        !           307:                  is not shown in this case. For matching lines,  the  filename
1.1       misho     308:                  is followed by a colon; for context lines, a hyphen separator
1.1.1.2 ! misho     309:                  is used. If a line number is also being  output,  it  follows
1.1       misho     310:                  the file name.
                    311: 
                    312:        -h, --no-filename
1.1.1.2 ! misho     313:                  Suppress  the output filenames when searching multiple files.
        !           314:                  By default, filenames  are  shown  when  multiple  files  are
        !           315:                  searched.  For  matching lines, the filename is followed by a
        !           316:                  colon; for context lines, a hyphen separator is used.   If  a
1.1       misho     317:                  line number is also being output, it follows the file name.
                    318: 
1.1.1.2 ! misho     319:        --help    Output  a  help  message, giving brief details of the command
1.1       misho     320:                  options and file type support, and then exit.
                    321: 
1.1.1.2 ! misho     322:        -I        Treat binary files as never matching. This is  equivalent  to
        !           323:                  --binary-files=without-match.
        !           324: 
1.1       misho     325:        -i, --ignore-case
                    326:                  Ignore upper/lower case distinctions during comparisons.
                    327: 
                    328:        --include=pattern
                    329:                  When pcregrep is searching the files in a directory as a con-
                    330:                  sequence of the -r (recursive search) option, only those reg-
                    331:                  ular files whose names match the pattern are included. Subdi-
                    332:                  rectories are always included and searched recursively,  sub-
                    333:                  ject to the --include-dir and --exclude-dir options. The pat-
                    334:                  tern is a PCRE regular expression, and is matched against the
                    335:                  final  component of the file name (not the entire path). If a
                    336:                  file  name  matches  both  --include  and  --exclude,  it  is
                    337:                  excluded. There is no short form for this option.
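
The interaction of --include and --exclude for regular files can be sketched in Python. This is an illustration of the rules stated above, assuming an unanchored match against the final path component and using Python's re module instead of PCRE; the function name file_is_searched is hypothetical.

```python
import os
import re

def file_is_searched(path, include=None, exclude=None):
    """Decide whether a regular file found during a recursive search is scanned."""
    name = os.path.basename(path)  # only the final component is tested
    # Exclusion wins when a name matches both patterns.
    if exclude is not None and re.search(exclude, name):
        return False
    if include is not None and not re.search(include, name):
        return False
    return True

print(file_is_searched("src/lib/main.c", include=r"\.c$"))
print(file_is_searched("src/test_main.c", include=r"\.c$", exclude=r"^test_"))
print(file_is_searched("src.c/README", include=r"\.c$"))
```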
                    338: 
                    339:        --include-dir=pattern
                    340:                  When  pcregrep  is searching the contents of a directory as a
                    341:                  consequence of the -r (recursive search) option,  only  those
                    342:                  subdirectories  whose  names  match the pattern are included.
                    343:                  (Note that the --include option does not  affect  subdirecto-
                    344:                  ries.)  The  pattern  is  a  PCRE  regular expression, and is
                    345:                  matched against the final component  of  the  name  (not  the
                    346:                  entire  path). If a subdirectory name matches both --include-
                    347:                  dir and --exclude-dir, it is excluded. There is no short form
                    348:                  for this option.
                    349: 
                    350:        -L, --files-without-match
                    351:                  Instead  of  outputting lines from the files, just output the
                    352:                  names of the files that do not contain any lines  that  would
                    353:                  have  been  output. Each file name is output once, on a sepa-
                    354:                  rate line.
                    355: 
                    356:        -l, --files-with-matches
                    357:                  Instead of outputting lines from the files, just  output  the
                    358:                  names of the files containing lines that would have been out-
                    359:                  put. Each file name is  output  once,  on  a  separate  line.
                    360:                  Searching  normally stops as soon as a matching line is found
                    361:                  in a file. However, if the -c (count) option  is  also  used,
                    362:                  matching  continues in order to obtain the correct count, and
                    363:                  those files that have at least one  match  are  listed  along
                    364:                  with their counts. Using this option with -c is a way of sup-
                    365:                  pressing the listing of files with no matches.
                    366: 
                    367:        --label=name
                    368:                  This option supplies a name to be used for the standard input
                    369:                  when file names are being output. If not supplied, "(standard
                    370:                  input)" is used. There is no short form for this option.
                    371: 
                    372:        --line-buffered
                    373:                  When this option is given, input is read and  processed  line
                    374:                  by  line,  and  the  output  is  flushed after each write. By
                    375:                  default, input is read in large chunks, unless  pcregrep  can
                    376:                  determine  that  it is reading from a terminal (which is cur-
                    377:                  rently possible only in Unix environments). Output to a  ter-
                    378:                  minal is normally flushed automatically by the operating sys-
                    379:                  tem. This option can be useful when the input  or  output  is
                    380:                  attached  to a pipe and you do not want pcregrep to buffer up
                    381:                  large amounts of data. However, its use will  affect  perfor-
                    382:                  mance, and the -M (multiline) option ceases to work.
                    383: 
                    384:        --line-offsets
                    385:                  Instead  of  showing lines or parts of lines that match, show
                    386:                  each match as a line number, the offset from the start of the
                    387:                  line,  and a length. The line number is terminated by a colon
                    388:                  (as usual; see the -n option), and the offset and length  are
                    389:                  separated  by  a  comma.  In  this mode, no context is shown.
                    390:                  That is, the -A, -B, and -C options are ignored. If there  is
                    391:                  more  than  one  match in a line, each of them is shown sepa-
                    392:                  rately. This option is mutually exclusive with --file-offsets
                    393:                  and --only-matching.
                    394: 
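The output format described for --line-offsets can be modelled with a short Python sketch. This is not pcregrep's code: the function name line_offsets is hypothetical, Python's re module stands in for PCRE, and the sketch assumes that line numbers start at 1 and offsets start at 0.

```python
import re

def line_offsets(pattern, text):
    """Return 'line:offset,length' strings, one per match, line by line."""
    regex = re.compile(pattern)
    results = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        # Every match in a line is reported separately.
        for m in regex.finditer(line):
            results.append(f"{lineno}:{m.start()},{m.end() - m.start()}")
    return results

if __name__ == "__main__":
    for entry in line_offsets("foo", "no hit here\nfoo bar foo"):
        print(entry)
```

Each result carries the line number, a colon, and then the offset and length separated by a comma, as in the description above.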
                    395:        --locale=locale-name
                    396:                  This  option specifies a locale to be used for pattern match-
                    397:                  ing. It overrides the value in the LC_ALL or  LC_CTYPE  envi-
                    398:                  ronment  variables.  If  no  locale  is  specified,  the PCRE
                    399:                  library's default (usually the "C" locale) is used. There  is
                    400:                  no short form for this option.
                    401: 
                    402:        --match-limit=number
                    403:                  Processing  some  regular  expression  patterns can require a
                    404:                  very large amount of memory, leading in some cases to a  pro-
                    405:                  gram  crash  if  not enough is available.  Other patterns may
                    406:                  take a very long time to search  for  all  possible  matching
                    407:                  strings.  The pcre_exec() function that is called by pcregrep
                    408:                  to do the matching has two  parameters  that  can  limit  the
                    409:                  resources that it uses.
                    410: 
                    411:                  The   --match-limit  option  provides  a  means  of  limiting
                    412:                  resource usage when processing patterns that are not going to
                    413:                  match, but which have a very large number of possibilities in
                    414:                  their search trees. The classic example  is  a  pattern  that
                    415:                  uses  nested unlimited repeats. Internally, PCRE uses a func-
                    416:                  tion called match()  which  it  calls  repeatedly  (sometimes
                    417:                  recursively).  The  limit  set by --match-limit is imposed on
                    418:                  the number of times this function is called during  a  match,
                    419:                  which  has  the effect of limiting the amount of backtracking
                    420:                  that can take place.
                    421: 
                    422:                  The --recursion-limit option is similar to --match-limit, but
                    423:                  instead of limiting the total number of times that match() is
                    424:                  called, it limits the depth of recursive calls, which in turn
                    425:                  limits  the  amount of memory that can be used. The recursion
                    426:                  depth is a smaller number than the  total  number  of  calls,
                    427:                  because not all calls to match() are recursive. This limit is
                    428:                  of use only if it is set smaller than --match-limit.
                    429: 
                    430:                  There are no short forms for these options. The default  set-
                    431:                  tings  are  specified when the PCRE library is compiled; if the
                    432:                  build does not override them, the limit is 10 million.
                    433: 
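The effect of a limit on the number of match() calls can be illustrated with a toy backtracking matcher in Python. This is only a model: it hard-codes the pattern (a+)*\d, anchored at the start of the subject, and the names count_calls and MatchLimitExceeded are hypothetical. PCRE's real match() function is far more general.

```python
class MatchLimitExceeded(Exception):
    """Raised when the toy matcher makes more calls than the limit allows."""

def count_calls(subject, limit):
    """Match (a+)*\\d at position 0; return (matched, number_of_calls)."""
    calls = 0

    def match(pos):
        nonlocal calls
        calls += 1
        if calls > limit:
            raise MatchLimitExceeded(calls)
        # Find the longest run of 'a' starting at pos.
        end = pos
        while end < len(subject) and subject[end] == "a":
            end += 1
        # Try one more iteration of (a+), longest first, then backtrack.
        for stop in range(end, pos, -1):
            if match(stop):
                return True
        # No further iteration worked: a digit must follow here.
        return pos < len(subject) and subject[pos].isdigit()

    return match(0), calls

if __name__ == "__main__":
    print(count_calls("aaa5", limit=100))
    try:
        count_calls("a" * 30, limit=10_000)
    except MatchLimitExceeded:
        print("gave up: call limit reached")
```

In this model the number of calls doubles with every extra "a" in a subject that has no final digit, so a call limit turns an effectively endless search into a prompt failure.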
                    434:        -M, --multiline
                    435:                  Allow patterns to match more than one line. When this  option
                    436:                  is given, patterns may usefully contain literal newline char-
                    437:                  acters and internal occurrences of ^ and  $  characters.  The
                    438:                  output  for  a  successful match may consist of more than one
                    439:                  line, the last of which is the one in which the match  ended.
                    440:                  If the matched string ends with a newline sequence the output
                    441:                  ends at the end of that line.
                    442: 
                    443:                  When this option is set, the PCRE library is called in  "mul-
                    444:                  tiline"  mode.   There is a limit to the number of lines that
                    445:                  can be matched, imposed by the way that pcregrep buffers  the
                    446:                  input  file as it scans it. However, pcregrep ensures that at
                    447:                  least 8K characters or the rest of the document (whichever is
                    448:                  the  shorter)  are  available for forward matching, and simi-
                    449:                  larly the previous 8K characters (or all the previous charac-
                    450:                  ters,  if  fewer  than 8K) are guaranteed to be available for
                    451:                  lookbehind assertions. This option does not work  when  input
                    452:                  is read line by line (see --line-buffered).
                    453: 
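The difference between matching one line at a time and matching in "multiline" mode can be sketched in Python. The re.MULTILINE flag plays a role similar to PCRE's multiline mode here; the function name find_matches is hypothetical, and pcregrep's buffering limits are not modelled.

```python
import re

def find_matches(pattern, text, multiline=False):
    """Return matched strings, either per line or across the whole text."""
    if multiline:
        # ^ and $ also match at internal line boundaries.
        return [m.group(0) for m in re.finditer(pattern, text, re.MULTILINE)]
    # Without multiline mode each line is searched on its own.
    return [m.group(0)
            for line in text.split("\n")
            for m in re.finditer(pattern, line)]

if __name__ == "__main__":
    text = "header\nbegin\nend\nfooter"
    print(find_matches(r"^begin\nend$", text))
    print(find_matches(r"^begin\nend$", text, multiline=True))
```

A pattern containing a literal newline can only succeed in the second call, because no single line contains a newline character.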
                    454:        -N newline-type, --newline=newline-type
                    455:                  The  PCRE  library  supports  five  different conventions for
                    456:                  indicating the ends of lines. They are  the  single-character
                    457:                  sequences  CR  (carriage  return) and LF (linefeed), the two-
                    458:                  character sequence CRLF, an "anycrlf" convention, which  rec-
                    459:                  ognizes  any  of the preceding three types, and an "any" con-
                    460:                  vention, in which any Unicode line ending sequence is assumed
                    461:                  to  end a line. The Unicode sequences are the three just men-
                    462:                  tioned, plus  VT  (vertical  tab,  U+000B),  FF  (form  feed,
                    463:                  U+000C),   NEL  (next  line,  U+0085),  LS  (line  separator,
                    464:                  U+2028), and PS (paragraph separator, U+2029).
                    465: 
                    466:                  When  the  PCRE  library  is  built,  a  default  line-ending
                    467:                  sequence   is  specified.   This  is  normally  the  standard
                    468:                  sequence for the operating system. Unless otherwise specified
                    469:                  by  this  option,  pcregrep  uses the library's default.  The
                    470:                  possible values for this option are CR, LF, CRLF, ANYCRLF, or
                    471:                  ANY.  This  makes  it  possible to use pcregrep on files that
                    472:                  have come from other environments without  having  to  modify
                    473:                  their  line  endings.  If the data that is being scanned does
                    474:                  not agree with the convention set by  this  option,  pcregrep
                    475:                  may behave in strange ways.
                    476: 
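The five newline conventions can be compared with a small Python sketch that splits text into lines under each convention. The table NEWLINE_PATTERNS and the function split_lines are hypothetical names; the sketch only models where lines end and says nothing about how pcregrep reads its input.

```python
import re

# One regular expression per convention named in the text above.
NEWLINE_PATTERNS = {
    "CR": "\r",
    "LF": "\n",
    "CRLF": "\r\n",
    "ANYCRLF": "\r\n|\r|\n",
    # CRLF first, then CR, LF, VT, FF, NEL, LS and PS.
    "ANY": "\r\n|[\n\x0b\x0c\r\x85\u2028\u2029]",
}

def split_lines(text, newline_type):
    """Split text into lines according to the chosen convention."""
    return re.split(NEWLINE_PATTERNS[newline_type], text)

if __name__ == "__main__":
    data = "a\r\nb\rc\nd"
    for name in ("LF", "CRLF", "ANYCRLF"):
        print(name, split_lines(data, name))
```

Splitting the same data under different conventions gives different line boundaries, which is why a mismatch between the data and the chosen convention leads to unexpected results.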
                    477:        -n, --line-number
                    478:                  Precede each output line by its line number in the file, fol-
                    479:                  lowed by a colon for matching lines or a hyphen  for  context
                    480:                  lines.  If the filename is also being output, it precedes the
                    481:                  line number. This option is forced if --line-offsets is used.
                    482: 
                    483:        --no-jit  If the PCRE library is built with  support  for  just-in-time
                    484:                  compiling  (which speeds up matching), pcregrep automatically
                    485:                  makes use of this, unless it was explicitly disabled at build
                    486:                  time.  This  option  can be used to disable the use of JIT at
                    487:                  run time. It is provided for testing and working round  prob-
                    488:                  lems.  It should never be needed in normal use.
                    489: 
                    490:        -o, --only-matching
                    491:                  Show only the part of the line that matched a pattern instead
                    492:                  of the whole line. In this mode, no context  is  shown.  That
                    493:                  is,  the -A, -B, and -C options are ignored. If there is more
                    494:                  than one match in a line, each of them is  shown  separately.
                    495:                  If  -o  is combined with -v (invert the sense of the match to
                    496:                  find non-matching lines), no output  is  generated,  but  the
                    497:                  return  code  is set appropriately. If the matched portion of
                    498:                  the line is empty, nothing is output unless the file name  or
                    499:                  line  number  are being printed, in which case they are shown
                    500:                  on an otherwise empty line. This option is mutually exclusive
                    501:                  with --file-offsets and --line-offsets.
                    502: 
                    503:        -onumber, --only-matching=number
                    504:                  Show  only  the  part  of the line that matched the capturing
                    505:                  parentheses of the given number. Up to 32 capturing parenthe-
                    506:                  ses are supported. Because these options can be given without
                    507:                  an argument (see above), if an argument is present,  it  must
                    508:                  be  given in the same shell item, for example, -o3 or --only-
                    509:                  matching=2. The comments  given  for  the  non-argument  case
                    510:                  above  also  apply  to  this case. If the specified capturing
                    511:                  parentheses do not exist in the pattern, or were not  set  in
                    512:                  the  match,  nothing  is  output unless the file name or line
                    513:                  number is being printed.
                    514: 
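The behaviour of -o with and without a parentheses number can be modelled in Python. The function only_matching is a hypothetical stand-in that uses Python's re module; it ignores file names and line numbers, so an empty or unset group simply produces no entry.

```python
import re

def only_matching(pattern, line, group=0):
    """Return the text matched by the given group for each match in line."""
    regex = re.compile(pattern)
    if group > regex.groups:
        return []  # the requested parentheses do not exist in the pattern
    # Skip groups that were not set (None) or matched an empty string.
    return [m.group(group) for m in regex.finditer(line) if m.group(group)]

if __name__ == "__main__":
    line = "a=1 b=22"
    print(only_matching(r"(\w+)=(\d+)", line))      # whole matches
    print(only_matching(r"(\w+)=(\d+)", line, 2))   # second parentheses only
```

Group 0 corresponds to plain -o, and a positive number corresponds to the -onumber form.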
                    515:        -q, --quiet
                    516:                  Work quietly, that is, display nothing except error messages.
                    517:                  The  exit  status  indicates  whether or not any matches were
                    518:                  found.
                    519: 
                    520:        -r, --recursive
                    521:                  If any given path is a directory, recursively scan the  files
                    522:                  it  contains, taking note of any --include and --exclude set-
                    523:                  tings. By default, a directory is read as a normal  file;  in
                    524:                  some  operating  systems this gives an immediate end-of-file.
                    525:                  This option is a shorthand  for  setting  the  -d  option  to
                    526:                  "recurse".
                    527: 
                    528:        --recursion-limit=number
                    529:                  See --match-limit above.
                    530: 
                    531:        -s, --no-messages
                    532:                  Suppress  error  messages  about  non-existent  or unreadable
                    533:                  files. Such files are quietly skipped.  However,  the  return
                    534:                  code is still 2, even if matches were found in other files.
                    535: 
                    536:        -u, --utf-8
                    537:                  Operate  in UTF-8 mode. This option is available only if PCRE
                    538:                  has been compiled with UTF-8 support. Both patterns and  sub-
                    539:                  ject lines must be valid strings of UTF-8 characters.
                    540: 
                    541:        -V, --version
                    542:                  Write  the  version  numbers of pcregrep and the PCRE library
                    543:                  that is being used to the standard error stream.
                    544: 
                    545:        -v, --invert-match
                    546:                  Invert the sense of the match, so that  lines  which  do  not
                    547:                  match any of the patterns are the ones that are found.
                    548: 
                    549:        -w, --word-regex, --word-regexp
                    550:                  Force the patterns to match only whole words. This is equiva-
                    551:                  lent to having \b at the start and end of the pattern.
                    552: 
                    553:        -x, --line-regex, --line-regexp
                    554:                  Force the patterns to be anchored (each must  start  matching
                    555:                  at  the beginning of a line) and in addition, require them to
                    556:                  match entire lines. This is equivalent  to  having  ^  and  $
                    557:                  characters at the start and end of each alternative branch in
                    558:                  every pattern.
                    559: 
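The equivalences stated for -w and -x can be written as pattern transformations. The sketch below uses Python's re module and hypothetical helper names; the non-capturing group is an assumption added so that the anchors apply to every alternative branch, and it is not taken from pcregrep's source.

```python
import re

def word_regex(pattern):
    """Wrap a pattern so that it matches only whole words (like -w)."""
    return r"\b(?:" + pattern + r")\b"

def line_regex(pattern):
    """Wrap a pattern so that it must match an entire line (like -x)."""
    return r"^(?:" + pattern + r")$"

if __name__ == "__main__":
    print(bool(re.search(word_regex("cat"), "the cat sat")))
    print(bool(re.search(word_regex("cat"), "concatenate")))
    print(bool(re.search(line_regex("foo|bar"), "bar")))
    print(bool(re.search(line_regex("foo|bar"), "foobar")))
```

Without the group, a pattern such as ^foo|bar$ would anchor each branch at one end only and would accept the line "foobar".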
                    560: 
                    561: ENVIRONMENT VARIABLES
                    562: 
                    563:        The environment variables LC_ALL and LC_CTYPE  are  examined,  in  that
                    564:        order,  for  a  locale.  The first one that is set is used. This can be
                    565:        overridden by the --locale option.  If  no  locale  is  set,  the  PCRE
                    566:        library's default (usually the "C" locale) is used.
                    567: 
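The order of precedence described above can be summarised in a few lines of Python. The function choose_locale is a hypothetical illustration; it takes the environment as a dictionary and returns None to stand for the PCRE library's default.

```python
def choose_locale(env, cli_locale=None):
    """Pick a locale: --locale first, then LC_ALL, then LC_CTYPE."""
    if cli_locale is not None:
        return cli_locale          # the --locale option overrides everything
    for name in ("LC_ALL", "LC_CTYPE"):
        if name in env:
            return env[name]       # the first variable that is set wins
    return None                    # fall back to the library's default

if __name__ == "__main__":
    print(choose_locale({"LC_CTYPE": "fr_FR", "LC_ALL": "en_GB"}))
    print(choose_locale({"LC_CTYPE": "fr_FR"}, cli_locale="de_DE"))
```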
                    568: 
                    569: NEWLINES
                    570: 
                    571:        The  -N (--newline) option allows pcregrep to scan files with different
                    572:        newline conventions from the default.  However,  the  setting  of  this
                    573:        option  does not affect the way in which pcregrep writes information to
                    574:        the standard error and output streams. It uses the  string  "\n"  in  C
                    575:        printf()  calls  to  indicate newlines, relying on the C I/O library to
                    576:        convert this to an appropriate sequence if the  output  is  sent  to  a
                    577:        file.
                    578: 
                    579: 
                    580: OPTIONS COMPATIBILITY
                    581: 
                    582:        Many  of the short and long forms of pcregrep's options are the same as
1.1.1.2 ! misho     583:        in the GNU grep program. Any long option of the form --xxx-regexp  (GNU
        !           584:        terminology)  is also available as --xxx-regex (PCRE terminology). How-
        !           585:        ever, the --file-list, --file-offsets,  --include-dir,  --line-offsets,
        !           586:        --locale,  --match-limit,  -M, --multiline, -N, --newline, --recursion-
        !           587:        limit, -u, and --utf-8 options are specific to pcregrep, as is the  use
        !           588:        of the --only-matching option with a capturing parentheses number.
1.1       misho     589: 
                    590:        Although  most  of the common options work the same way, a few are dif-
                    591:        ferent in pcregrep. For example, the --include option's argument  is  a
                    592:        glob  for  GNU grep, but a regular expression for pcregrep. If both the
                    593:        -c and -l options are given, GNU grep lists only  file  names,  without
                    594:        counts, but pcregrep gives the counts.
                    595: 
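The difference between a glob argument and a regular expression argument for --include can be shown with Python's standard library. Both helper names are hypothetical, and the sketch does not claim to reproduce exactly which part of a path either program tests.

```python
import fnmatch
import re

def glob_include(name, glob):
    """Glob-style test, in the manner described for GNU grep."""
    return fnmatch.fnmatchcase(name, glob)

def regex_include(name, pattern):
    """Regular-expression test, in the manner described for pcregrep."""
    return re.search(pattern, name) is not None

if __name__ == "__main__":
    print(glob_include("main.c", "*.c"))
    print(regex_include("main.c", r"\.c$"))
    print(regex_include("main.cpp", r"\.c$"))
```

A glob such as *.c therefore has to be rewritten as a regular expression such as \.c$ to select the same names.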
                    596: 
                    597: OPTIONS WITH DATA
                    598: 
                    599:        There are four different ways in which an option with data can be spec-
                    600:        ified.  If a short form option is used, the  data  may  follow  immedi-
                    601:        ately, or (with one exception) in the next command line item. For exam-
                    602:        ple:
                    603: 
                    604:          -f/some/file
                    605:          -f /some/file
                    606: 
                    607:        The exception is the -o option, which may appear with or without  data.
                    608:        Because  of this, if data is present, it must follow immediately in the
                    609:        same item, for example -o3.
                    610: 
                    611:        If a long form option is used, the data may appear in the same  command
                    612:        line  item,  separated by an equals character, or (with two exceptions)
                    613:        it may appear in the next command line item. For example:
                    614: 
                    615:          --file=/some/file
                    616:          --file /some/file
                    617: 
                    618:        Note, however, that if you want to supply a file name beginning with  ~
                    619:        as  data  in  a  shell  command,  and have the shell expand ~ to a home
                    620:        directory, you must separate the file name from the option, because the
                    621:        shell does not treat ~ specially unless it is at the start of an item.
                    622: 
                    623:        The  exceptions  to the above are the --colour (or --color) and --only-
                    624:        matching options, for which the data  is  optional.  If  one  of  these
                    625:        options  does  have  data, it must be given in the first form, using an
                    626:        equals character. Otherwise pcregrep will assume that it has no data.
                    627: 
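The four ways of supplying option data, together with the exceptions, can be expressed as a small parsing sketch in Python. The function parse_option is hypothetical and assumes that every option it sees takes data, apart from the three with optional data; pcregrep's real option table is not modelled.

```python
# Long options whose data is optional and must be attached with '='.
OPTIONAL_DATA_LONG = {"--colour", "--color", "--only-matching"}

def parse_option(items, index):
    """Return (name, data, next_index) for the option at items[index]."""
    item = items[index]
    if item.startswith("--"):
        name, sep, data = item.partition("=")
        if sep:                                    # --file=/some/file
            return name, data, index + 1
        if name in OPTIONAL_DATA_LONG:             # no '=' means no data
            return name, None, index + 1
        return name, items[index + 1], index + 2   # --file /some/file
    name, data = item[:2], item[2:]
    if data:                                       # -f/some/file or -o3
        return name, data, index + 1
    if name == "-o":                               # data must be attached
        return name, None, index + 1
    return name, items[index + 1], index + 2       # -f /some/file

if __name__ == "__main__":
    print(parse_option(["-f", "/some/file"], 0))
    print(parse_option(["-o", "3"], 0))
    print(parse_option(["--color", "auto"], 0))
```

In the last two calls the following item is left alone, so it would be treated as the next argument and not as option data.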
                    628: 
                    629: MATCHING ERRORS
                    630: 
                    631:        It is possible to supply a regular expression that takes  a  very  long
                    632:        time  to  fail  to  match certain lines. Such patterns normally involve
                    633:        nested indefinite repeats, for example: (a+)*\d when matched against  a
                    634:        line  of  a's  with  no  final  digit. The PCRE matching function has a
                    635:        resource limit that causes it to abort in these circumstances. If  this
                    636:        happens, pcregrep outputs an error message and the line that caused the
                    637:        problem to the standard error stream. If there are more  than  20  such
                    638:        errors, pcregrep gives up.
                    639: 
                    640:        The  --match-limit  option  of  pcregrep can be used to set the overall
                    641:        resource limit; there is a second option called --recursion-limit  that
                    642:        sets  a limit on the amount of memory (usually stack) that is used (see
                    643:        the discussion of these options above).
                    644: 
                    645: 
                    646: DIAGNOSTICS
                    647: 
                    648:        Exit status is 0 if any matches were found, 1 if no matches were found,
                    649:        and  2  for syntax errors, overlong lines, non-existent or inaccessible
                    650:        files (even if matches were found in other files) or too many  matching
                    651:        errors. Using the -s option to suppress error messages about inaccessi-
                    652:        ble files does not affect the return code.
                    653: 
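The exit status rules can be condensed into a short Python function. The name exit_status and its two boolean parameters are hypothetical; "errors" here covers any of the error conditions listed above.

```python
def exit_status(matches_found, errors_occurred):
    """Return the exit status according to the rules described above."""
    if errors_occurred:
        return 2                    # errors win, even if matches were found
    return 0 if matches_found else 1

if __name__ == "__main__":
    print(exit_status(matches_found=True, errors_occurred=False))
    print(exit_status(matches_found=True, errors_occurred=True))
```

A calling script that only tests for a zero status will therefore treat a run with an unreadable file as a failure, whether or not -s was given.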
                    654: 
                    655: SEE ALSO
                    656: 
                    657:        pcrepattern(3), pcretest(1).
                    658: 
                    659: 
                    660: AUTHOR
                    661: 
                    662:        Philip Hazel
                    663:        University Computing Service
                    664:        Cambridge CB2 3QH, England.
                    665: 
                    666: 
                    667: REVISION
                    668: 
1.1.1.2 ! misho     669:        Last updated: 04 March 2012
        !           670:        Copyright (c) 1997-2012 University of Cambridge.