source: trunk/grep/doc/grep.1@ 3003

Last change on this file since 3003 was 2557, checked in by bird, 19 years ago

grep 2.5.1a

.\" grep man page
.if !\n(.g \{\
. if !\w|\*(lq| \{\
. ds lq ``
. if \w'\(lq' .ds lq "\(lq
. \}
. if !\w|\*(rq| \{\
. ds rq ''
. if \w'\(rq' .ds rq "\(rq
. \}
.\}
.de Id
.ds Dt \\$4
..
.Id $Id: grep.1,v 1.23 2002/01/22 13:20:04 bero Exp $
.TH GREP 1 \*(Dt "GNU Project"
.SH NAME
grep, egrep, fgrep \- print lines matching a pattern
.SH SYNOPSIS
.B grep
.RI [ options ]
.I PATTERN
.RI [ FILE .\|.\|.]
.br
.B grep
.RI [ options ]
.RB [ \-e
.I PATTERN
|
.B \-f
.IR FILE ]
.RI [ FILE .\|.\|.]
.SH DESCRIPTION
.PP
.B Grep
searches the named input
.IR FILE s
(or standard input if no files are named, or
the file name
.B \-
is given)
for lines containing a match to the given
.IR PATTERN .
By default,
.B grep
prints the matching lines.
.PP
In addition, two variant programs
.B egrep
and
.B fgrep
are available.
.B Egrep
is the same as
.BR "grep\ \-E" .
.B Fgrep
is the same as
.BR "grep\ \-F" .
.SH OPTIONS
.TP
.BI \-A " NUM" "\fR,\fP \-\^\-after-context=" NUM
Print
.I NUM
lines of trailing context after matching lines.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
.BR \-a ", " \-\^\-text
Process a binary file as if it were text; this is equivalent to the
.B \-\^\-binary-files=text
option.
.TP
.BI \-B " NUM" "\fR,\fP \-\^\-before-context=" NUM
Print
.I NUM
lines of leading context before matching lines.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
.BI \-C " NUM" "\fR,\fP \-\^\-context=" NUM
Print
.I NUM
lines of output context.
Places a line containing
.B \-\^\-
between contiguous groups of matches.
.TP
.BR \-b ", " \-\^\-byte-offset
Print the byte offset within the input file before
each line of output.
.TP
.BI \-\^\-binary-files= TYPE
If the first few bytes of a file indicate that the file contains binary
data, assume that the file is of type
.IR TYPE .
By default,
.I TYPE
is
.BR binary ,
and
.B grep
normally outputs either
a one-line message saying that a binary file matches, or no message if
there is no match.
If
.I TYPE
is
.BR without-match ,
.B grep
assumes that a binary file does not match; this is equivalent to the
.B \-I
option.
If
.I TYPE
is
.BR text ,
.B grep
processes a binary file as if it were text; this is equivalent to the
.B \-a
option.
.I Warning:
.B "grep \-\^\-binary-files=text"
might output binary garbage,
which can have nasty side effects if the output is a terminal and if the
terminal driver interprets some of it as commands.
.TP
.BI \-\^\-colour[=\fIWHEN\fR] ", " \-\^\-color[=\fIWHEN\fR]
Surround the matching string with the marker found in the
.B GREP_COLOR
environment variable.
.I WHEN
may be `never', `always', or `auto'.
.TP
.BR \-c ", " \-\^\-count
Suppress normal output; instead print a count of
matching lines for each input file.
With the
.BR \-v ", " \-\^\-invert-match
option (see below), count non-matching lines.
.TP
.BI \-D " ACTION" "\fR,\fP \-\^\-devices=" ACTION
If an input file is a device, FIFO or socket, use
.I ACTION
to process it. By default,
.I ACTION
is
.BR read ,
which means that devices are read just as if they were ordinary files.
If
.I ACTION
is
.BR skip ,
devices are silently skipped.
.TP
.BI \-d " ACTION" "\fR,\fP \-\^\-directories=" ACTION
If an input file is a directory, use
.I ACTION
to process it. By default,
.I ACTION
is
.BR read ,
which means that directories are read just as if they were ordinary files.
If
.I ACTION
is
.BR skip ,
directories are silently skipped.
If
.I ACTION
is
.BR recurse ,
.B grep
reads all files under each directory, recursively;
this is equivalent to the
.B \-r
option.
.TP
.BR \-E ", " \-\^\-extended-regexp
Interpret
.I PATTERN
as an extended regular expression (see below).
.TP
.BI \-e " PATTERN" "\fR,\fP \-\^\-regexp=" PATTERN
Use
.I PATTERN
as the pattern; useful to protect patterns beginning with
.BR \- .
.TP
.BR \-F ", " \-\^\-fixed-strings
Interpret
.I PATTERN
as a list of fixed strings, separated by newlines,
any of which is to be matched.
.TP
.BR \-P ", " \-\^\-perl-regexp
Interpret
.I PATTERN
as a Perl regular expression.
.TP
.BI \-f " FILE" "\fR,\fP \-\^\-file=" FILE
Obtain patterns from
.IR FILE ,
one per line.
The empty file contains zero patterns, and therefore matches nothing.
.TP
.BR \-G ", " \-\^\-basic-regexp
Interpret
.I PATTERN
as a basic regular expression (see below). This is the default.
.TP
.BR \-H ", " \-\^\-with-filename
Print the filename for each match.
.TP
.BR \-h ", " \-\^\-no-filename
Suppress the prefixing of filenames on output
when multiple files are searched.
.TP
.B \-\^\-help
Output a brief help message.
.TP
.BR \-I
Process a binary file as if it did not contain matching data; this is
equivalent to the
.B \-\^\-binary-files=without-match
option.
.TP
.BR \-i ", " \-\^\-ignore-case
Ignore case distinctions in both the
.I PATTERN
and the input files.
.TP
.BR \-L ", " \-\^\-files-without-match
Suppress normal output; instead print the name
of each input file from which no output would
normally have been printed. The scanning will stop
on the first match.
.TP
.BR \-l ", " \-\^\-files-with-matches
Suppress normal output; instead print
the name of each input file from which output
would normally have been printed. The scanning will
stop on the first match.
.TP
.BI \-m " NUM" "\fR,\fP \-\^\-max-count=" NUM
Stop reading a file after
.I NUM
matching lines. If the input is standard input from a regular file,
and
.I NUM
matching lines are output,
.B grep
ensures that the standard input is positioned to just after the last
matching line before exiting, regardless of the presence of trailing
context lines. This enables a calling process to resume a search.
When
.B grep
stops after
.I NUM
matching lines, it outputs any trailing context lines. When the
.B \-c
or
.B \-\^\-count
option is also used,
.B grep
does not output a count greater than
.IR NUM .
When the
.B \-v
or
.B \-\^\-invert-match
option is also used,
.B grep
stops after outputting
.I NUM
non-matching lines.
.TP
.B \-\^\-mmap
If possible, use the
.BR mmap (2)
system call to read input, instead of
the default
.BR read (2)
system call. In some situations,
.B \-\^\-mmap
yields better performance. However,
.B \-\^\-mmap
can cause undefined behavior (including core dumps)
if an input file shrinks while
.B grep
is operating, or if an I/O error occurs.
.TP
.BR \-n ", " \-\^\-line-number
Prefix each line of output with the line number
within its input file.
.TP
.BR \-o ", " \-\^\-only-matching
Show only the part of a matching line that matches
.IR PATTERN .
.TP
.BI \-\^\-label= LABEL
Display input actually coming from standard input as input coming from file
.IR LABEL .
This is especially useful for tools like
.BR zgrep ,
e.g.
.BR "gzip \-cd foo.gz | grep \-\^\-label=foo something" .
.TP
.B \-\^\-line-buffered
Use line buffering on output.
This can incur a performance penalty.
.TP
.BR \-q ", " \-\^\-quiet ", " \-\^\-silent
Quiet; do not write anything to standard output.
Exit immediately with zero status if any match is found,
even if an error was detected.
Also see the
.B \-s
or
.B \-\^\-no-messages
option.
.TP
.BR \-R ", " \-r ", " \-\^\-recursive
Read all files under each directory, recursively;
this is equivalent to the
.B "\-d recurse"
option.
.TP
.BI \-\^\-include= PATTERN
Recurse in directories, searching only files matching
.IR PATTERN .
.TP
.BI \-\^\-exclude= PATTERN
Recurse in directories, skipping files matching
.IR PATTERN .
.TP
.BR \-s ", " \-\^\-no-messages
Suppress error messages about nonexistent or unreadable files.
Portability note: unlike \s-1GNU\s0
.BR grep ,
traditional
.B grep
did not conform to \s-1POSIX.2\s0, because traditional
.B grep
lacked a
.B \-q
option and its
.B \-s
option behaved like \s-1GNU\s0
.BR grep 's
.B \-q
option.
Shell scripts intended to be portable to traditional
.B grep
should avoid both
.B \-q
and
.B \-s
and should redirect output to /dev/null instead.
.TP
.BR \-U ", " \-\^\-binary
Treat the file(s) as binary. By default, under MS-DOS and MS-Windows,
.BR grep
guesses the file type by looking at the contents of the first 32KB
read from the file. If
.BR grep
decides the file is a text file, it strips the CR characters from the
original file contents (to make regular expressions with
.B ^
and
.B $
work correctly). Specifying
.B \-U
overrules this guesswork, causing all files to be read and passed to the
matching mechanism verbatim; if the file is a text file with CR/LF
pairs at the end of each line, this will cause some regular
expressions to fail.
This option has no effect on platforms other than MS-DOS and
MS-Windows.
.TP
.BR \-u ", " \-\^\-unix-byte-offsets
Report Unix-style byte offsets. This switch causes
.B grep
to report byte offsets as if the file were a Unix-style text file, i.e. with
CR characters stripped off. This will produce results identical to running
.B grep
on a Unix machine. This option has no effect unless the
.B \-b
option is also used;
it has no effect on platforms other than MS-DOS and MS-Windows.
.TP
.BR \-V ", " \-\^\-version
Print the version number of
.B grep
to standard error. This version number should
be included in all bug reports (see below).
.TP
.BR \-v ", " \-\^\-invert-match
Invert the sense of matching, to select non-matching lines.
.TP
.BR \-w ", " \-\^\-word-regexp
Select only those lines containing matches that form whole words.
The test is that the matching substring must either be at the
beginning of the line, or preceded by a non-word constituent
character. Similarly, it must be either at the end of the line
or followed by a non-word constituent character. Word-constituent
characters are letters, digits, and the underscore.
.TP
.BR \-x ", " \-\^\-line-regexp
Select only those matches that exactly match the whole line.
.TP
.B \-y
Obsolete synonym for
.BR \-i .
.TP
.BR \-Z ", " \-\^\-null
Output a zero byte (the \s-1ASCII\s0
.B NUL
character) instead of the character that normally follows a file name.
For example,
.B "grep \-lZ"
outputs a zero byte after each file name instead of the usual newline.
This option makes the output unambiguous, even in the presence of file
names containing unusual characters like newlines. This option can be
used with commands like
.BR "find \-print0" ,
.BR "perl \-0" ,
.BR "sort \-z" ,
and
.B "xargs \-0"
to process arbitrary file names,
even those that contain newline characters.
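.PP
As a brief illustration, the following sketch combines several of the
options described above.
The file and directory names are hypothetical.
.PP
.RS
.nf
# Print matches for "error" with line numbers and two lines of context.
grep \-n \-C 2 error notes.txt
# Search recursively under src, reading only files whose names match *.c.
grep \-r \-\^\-include='*.c' main src
# List matching files NUL-terminated so that xargs handles unusual names.
grep \-lZ TODO *.txt | xargs \-0 wc \-l
.fi
.RE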
.SH "REGULAR EXPRESSIONS"
.PP
A regular expression is a pattern that describes a set of strings.
Regular expressions are constructed analogously to arithmetic
expressions, by using various operators to combine smaller expressions.
.PP
.B Grep
understands two different versions of regular expression syntax:
\*(lqbasic\*(rq and \*(lqextended.\*(rq In
.RB "\s-1GNU\s0\ " grep ,
there is no difference in available functionality using either syntax.
In other implementations, basic regular expressions are less powerful.
The following description applies to extended regular expressions;
differences for basic regular expressions are summarized afterwards.
.PP
The fundamental building blocks are the regular expressions that match
a single character. Most characters, including all letters and digits,
are regular expressions that match themselves. Any metacharacter with
special meaning may be quoted by preceding it with a backslash.
.PP
A
.I "bracket expression"
is a list of characters enclosed by
.B [
and
.BR ] .
It matches any single
character in that list; if the first character of the list
is the caret
.B ^
then it matches any character
.I not
in the list.
For example, the regular expression
.B [0123456789]
matches any single digit.
.PP
Within a bracket expression, a
.I "range expression"
consists of two characters separated by a hyphen.
It matches any single character that sorts between the two characters,
inclusive, using the locale's collating sequence and character set.
For example, in the default C locale,
.B [a\-d]
is equivalent to
.BR [abcd] .
Many locales sort characters in dictionary order, and in these locales
.B [a\-d]
is typically not equivalent to
.BR [abcd] ;
it might be equivalent to
.BR [aBbCcDd] ,
for example.
To obtain the traditional interpretation of bracket expressions,
you can use the C locale by setting the
.B LC_ALL
environment variable to the value
.BR C .
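.PP
For instance, assuming a hypothetical file
.I words.txt
and a shell that accepts a variable assignment before a command,
the traditional interpretation of a range can be requested
for a single invocation:
.PP
.RS
.nf
# Force the C locale so that [a\-d] means exactly [abcd].
LC_ALL=C grep '[a\-d]' words.txt
.fi
.RE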
.PP
Finally, certain named classes of characters are predefined within
bracket expressions, as follows.
Their names are self-explanatory, and they are
.BR [:alnum:] ,
.BR [:alpha:] ,
.BR [:cntrl:] ,
.BR [:digit:] ,
.BR [:graph:] ,
.BR [:lower:] ,
.BR [:print:] ,
.BR [:punct:] ,
.BR [:space:] ,
.BR [:upper:] ,
and
.BR [:xdigit:] .
For example,
.B [[:alnum:]]
means
.BR [0\-9A\-Za\-z] ,
except the latter form depends upon the C locale and the
\s-1ASCII\s0 character encoding, whereas the former is independent
of locale and character set.
(Note that the brackets in these class names are part of the symbolic
names, and must be included in addition to the brackets delimiting
the bracket list.) Most metacharacters lose their special meaning
inside lists. To include a literal
.B ]
place it first in the list. Similarly, to include a literal
.B ^
place it anywhere but first. Finally, to include a literal
.B \-
place it last.
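.PP
A short sketch of these rules, using a hypothetical file
.IR input.txt :
.PP
.RS
.nf
# Lines containing at least one digit, in any locale.
grep '[[:digit:]]' input.txt
# Lines containing a literal ], ^ or \-: ] is first, ^ is not first, \- is last.
grep '[]^\-]' input.txt
.fi
.RE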
.PP
The period
.B .
matches any single character.
The symbol
.B \ew
is a synonym for
.B [[:alnum:]]
and
.B \eW
is a synonym for
.BR [^[:alnum:]] .
.PP
The caret
.B ^
and the dollar sign
.B $
are metacharacters that respectively match the empty string at the
beginning and end of a line.
The symbols
.B \e<
and
.B \e>
respectively match the empty string at the beginning and end of a word.
The symbol
.B \eb
matches the empty string at the edge of a word,
and
.B \eB
matches the empty string provided it's
.I not
at the edge of a word.
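.PP
As an example, with a hypothetical file
.IR input.txt :
.PP
.RS
.nf
# Lines that begin with a # character.
grep '^#' input.txt
# Lines that are entirely empty.
grep '^$' input.txt
# Lines containing "the" as a whole word, but not only "then" or "other".
grep '\e<the\e>' input.txt
.fi
.RE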
.PP
A regular expression may be followed by one of several repetition operators:
.PD 0
.TP
.B ?
The preceding item is optional and matched at most once.
.TP
.B *
The preceding item will be matched zero or more times.
.TP
.B +
The preceding item will be matched one or more times.
.TP
.BI { n }
The preceding item is matched exactly
.I n
times.
.TP
.BI { n ,}
The preceding item is matched
.I n
or more times.
.TP
.BI { n , m }
The preceding item is matched at least
.I n
times, but not more than
.I m
times.
.PD
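.PP
The following sketch shows these operators in extended syntax
.RB ( "grep \-E" );
the file name is hypothetical.
.PP
.RS
.nf
# Lines containing "color" or "colour".
grep \-E 'colou?r' input.txt
# Lines containing at least three consecutive digits.
grep \-E '[0\-9]{3}' input.txt
# Lines consisting solely of two to four digits.
grep \-E '^[0\-9]{2,4}$' input.txt
.fi
.RE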
.PP
Two regular expressions may be concatenated; the resulting
regular expression matches any string formed by concatenating
two substrings that respectively match the concatenated
subexpressions.
.PP
Two regular expressions may be joined by the infix operator
.BR | ;
the resulting regular expression matches any string matching
either subexpression.
.PP
Repetition takes precedence over concatenation, which in turn
takes precedence over alternation. A whole subexpression may be
enclosed in parentheses to override these precedence rules.
.PP
The backreference
.BI \e n\c
\&, where
.I n
is a single digit, matches the substring
previously matched by the
.IR n th
parenthesized subexpression of the regular expression.
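.PP
For example, assuming a hypothetical file
.I input.txt
and extended syntax, a backreference can select lines with repeated text:
.PP
.RS
.nf
# Lines in which some character appears twice in a row.
grep \-E '(.)\e1' input.txt
# Lines in which a run of letters recurs after a single space, as in "the the".
grep \-E '([[:alpha:]]+) \e1' input.txt
.fi
.RE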
.PP
In basic regular expressions the metacharacters
.BR ? ,
.BR + ,
.BR { ,
.BR | ,
.BR ( ,
and
.BR )
lose their special meaning; instead use the backslashed
versions
.BR \e? ,
.BR \e+ ,
.BR \e{ ,
.BR \e| ,
.BR \e( ,
and
.BR \e) .
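.PP
As a sketch, the following two commands are equivalent in \s-1GNU\s0
.BR grep ;
the file name is hypothetical.
.PP
.RS
.nf
# Basic syntax: grouping, alternation and repetition need backslashes.
grep '\e(ab\e|cd\e)\e+' input.txt
# Extended syntax: the same operators are written bare.
grep \-E '(ab|cd)+' input.txt
.fi
.RE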
.PP
Traditional
.B egrep
did not support the
.B {
metacharacter, and some
.B egrep
implementations support
.B \e{
instead, so portable scripts should avoid
.B {
in
.B egrep
patterns and should use
.B [{]
to match a literal
.BR { .
.PP
\s-1GNU\s0
.B egrep
attempts to support traditional usage by assuming that
.B {
is not special if it would be the start of an invalid interval
specification. For example, the shell command
.B "egrep '{1'"
searches for the two-character string
.B {1
instead of reporting a syntax error in the regular expression.
\s-1POSIX.2\s0 allows this behavior as an extension, but portable scripts
should avoid it.
.SH "ENVIRONMENT VARIABLES"
Grep's behavior is affected by the following environment variables.
.PP
A locale
.BI LC_ foo
is specified by examining the three environment variables
.BR LC_ALL ,
.BR LC_\fIfoo\fP ,
.BR LANG ,
in that order.
The first of these variables that is set specifies the locale.
For example, if
.B LC_ALL
is not set, but
.B LC_MESSAGES
is set to
.BR pt_BR ,
then Brazilian Portuguese is used for the
.B LC_MESSAGES
locale.
The C locale is used if none of these environment variables are set,
or if the locale catalog is not installed, or if
.B grep
was not compiled with national language support (\s-1NLS\s0).
.TP
.B GREP_OPTIONS
This variable specifies default options to be placed in front of any
explicit options. For example, if
.B GREP_OPTIONS
is
.BR "'\-\^\-binary-files=without-match \-\^\-directories=skip'" ,
.B grep
behaves as if the two options
.B \-\^\-binary-files=without-match
and
.B \-\^\-directories=skip
had been specified before any explicit options.
Option specifications are separated by whitespace.
A backslash escapes the next character,
so it can be used to specify an option containing whitespace or a backslash.
.TP
.B GREP_COLOR
Specifies the marker for highlighting.
.TP
\fBLC_ALL\fP, \fBLC_COLLATE\fP, \fBLANG\fP
These variables specify the
.B LC_COLLATE
locale, which determines the collating sequence used to interpret
range expressions like
.BR [a\-z] .
.TP
\fBLC_ALL\fP, \fBLC_CTYPE\fP, \fBLANG\fP
These variables specify the
.B LC_CTYPE
locale, which determines the type of characters, e.g., which
characters are whitespace.
.TP
\fBLC_ALL\fP, \fBLC_MESSAGES\fP, \fBLANG\fP
These variables specify the
.B LC_MESSAGES
locale, which determines the language that
.B grep
uses for messages.
The default C locale uses American English messages.
.TP
.B POSIXLY_CORRECT
If set,
.B grep
behaves as \s-1POSIX.2\s0 requires; otherwise,
.B grep
behaves more like other \s-1GNU\s0 programs.
\s-1POSIX.2\s0 requires that options that follow file names must be
treated as file names; by default, such options are permuted to the
front of the operand list and are treated as options.
Also, \s-1POSIX.2\s0 requires that unrecognized options be diagnosed as
\*(lqillegal\*(rq, but since they are not really against the law the default
is to diagnose them as \*(lqinvalid\*(rq.
.B POSIXLY_CORRECT
also disables \fB_\fP\fIN\fP\fB_GNU_nonoption_argv_flags_\fP,
described below.
.TP
\fB_\fP\fIN\fP\fB_GNU_nonoption_argv_flags_\fP
(Here
.I N
is
.BR grep 's
numeric process ID.) If the
.IR i th
character of this environment variable's value is
.BR 1 ,
do not consider the
.IR i th
operand of
.B grep
to be an option, even if it appears to be one.
A shell can put this variable in the environment for each command it runs,
specifying which operands are the results of file name wildcard
expansion and therefore should not be treated as options.
This behavior is available only with the \s-1GNU\s0 C library, and only
when
.B POSIXLY_CORRECT
is not set.
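.PP
For example, a Bourne-compatible shell could set defaults for later
invocations as sketched below.
The pattern and file names are hypothetical.
.PP
.RS
.nf
# Skip binary files and directories by default.
GREP_OPTIONS='\-\^\-binary-files=without-match \-\^\-directories=skip'
export GREP_OPTIONS
# Now behaves as if both options had been given before the pattern.
grep main *
# Use the C locale for one invocation only.
LC_ALL=C grep '[a\-z]' input.txt
.fi
.RE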
.SH DIAGNOSTICS
.PP
Normally, exit status is 0 if selected lines are found and 1 otherwise.
But the exit status is 2 if an error occurred, unless the
.B \-q
or
.B \-\^\-quiet
or
.B \-\^\-silent
option is used and a selected line is found.
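.PP
A shell script can branch on these values.
The following is a minimal sketch for a Bourne-compatible shell;
the pattern and file name are hypothetical.
.PP
.RS
.nf
# Discard the output and keep only the exit status.
grep pattern input.txt >/dev/null
case $? in
0) echo 'match found' ;;
1) echo 'no match' ;;
*) echo 'error' ;;
esac
.fi
.RE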
.SH BUGS
.PP
Email bug reports to
.BR bug-grep@gnu.org .
.PP
Large repetition counts in the
.BI { n , m }
construct may cause grep to use lots of memory.
In addition,
certain other obscure regular expressions require exponential time
and space, and may cause
.B grep
to run out of memory.
.PP
Backreferences are very slow, and may require exponential time.
.\" Work around problems with some troff -man implementations.
.br