regexprops.texi

Last change on this file was 3170, checked in by bird, 18 years ago
findutils 4.3.2
File size: 24.7 KB

1	@menu
2	* findutils-default regular expression syntax::
3	* awk regular expression syntax::
4	* egrep regular expression syntax::
5	* emacs regular expression syntax::
6	* gnu-awk regular expression syntax::
7	* grep regular expression syntax::
8	* posix-awk regular expression syntax::
9	* posix-basic regular expression syntax::
10	* posix-egrep regular expression syntax::
11	* posix-extended regular expression syntax::
12	@end menu
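The names in the menu above are the syntax names this file documents. As a rough illustration of how one of them might be selected, the following Python sketch builds an argument list for GNU @code{find}. It assumes the @samp{-regextype} and @samp{-regex} options of GNU @code{find} and assumes each menu name is accepted as the @samp{-regextype} argument; the helper name @code{build_find_command} is hypothetical.

```python
# Syntax names copied from the menu of this file.
REGEX_TYPES = (
    "findutils-default", "awk", "egrep", "emacs", "gnu-awk", "grep",
    "posix-awk", "posix-basic", "posix-egrep", "posix-extended",
)


def build_find_command(start_dir, pattern, regex_type="findutils-default"):
    """Return an argv list for find (hypothetical helper; nothing is executed)."""
    if regex_type not in REGEX_TYPES:
        raise ValueError(f"unknown regular expression syntax: {regex_type!r}")
    # -regextype only affects the -regex tests that follow it on the
    # command line, so it is placed first.
    return ["find", start_dir, "-regextype", regex_type, "-regex", pattern]


print(build_find_command(".", r".*\.texi?", "posix-extended"))
```

The list can be handed to a process launcher as is; building it separately keeps shell quoting out of the picture.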
13
14	@node findutils-default regular expression syntax
15	@subsection @samp{findutils-default} regular expression syntax
16
17
18	The character @samp{.} matches any single character.
19
20
21	@table @samp
22
23	@item +
24	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
25	@item ?
26	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
27	@item \+
28	matches a @samp{+}.
29	@item \?
30	matches a @samp{?}.
31	@end table
32
33
34	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
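Since character classes are unavailable in this syntax, a pattern written with them has to be rewritten with explicit ranges. The sketch below shows one way to do that mechanically. It assumes the C locale and ASCII ranges, covers only a handful of class names, and the names @code{POSIX_CLASS_RANGES} and @code{expand_posix_classes} are hypothetical.

```python
import re

# Explicit ranges for a few POSIX class names (assumption: C locale, ASCII).
POSIX_CLASS_RANGES = {
    "digit": "0-9",
    "lower": "a-z",
    "upper": "A-Z",
    "alpha": "A-Za-z",
    "alnum": "0-9A-Za-z",
}


def expand_posix_classes(pattern):
    """Replace [:name:] inside bracket expressions with explicit ranges."""
    for name, ranges in POSIX_CLASS_RANGES.items():
        pattern = pattern.replace("[:" + name + ":]", ranges)
    return pattern


print(expand_posix_classes("[[:digit:]]"))    # [0-9]
print(expand_posix_classes("[[:alpha:]_]"))   # [A-Za-z_]
# The expanded form is an ordinary bracket expression.
print(re.fullmatch(expand_posix_classes("[[:alpha:]_][[:alnum:]_]*"), "x1") is not None)
```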
35
36	GNU extensions are supported:
37	@enumerate
38
39	@item @samp{\w} matches a character within a word
40
41	@item @samp{\W} matches a character which is not within a word
42
43	@item @samp{\<} matches the beginning of a word
44
45	@item @samp{\>} matches the end of a word
46
47	@item @samp{\b} matches a word boundary
48
49	@item @samp{\B} matches a position which is not a word boundary
50
51	@item @samp{\`} matches the beginning of the whole input
52
53	@item @samp{\'} matches the end of the whole input
54
55	@end enumerate
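To get a feel for the word and input anchors listed above, they can be approximated in Python's @code{re} module, which has @samp{\w}, @samp{\W}, @samp{\b} and @samp{\B} but not the other four. The mapping below is an approximation chosen for this sketch, not the GNU implementation; the plain string replacement ignores escaped backslashes, and the names @code{GNU_TO_PYTHON} and @code{translate_gnu_anchors} are hypothetical.

```python
import re

# Approximate Python equivalents of the GNU anchors (assumption of this sketch).
GNU_TO_PYTHON = {
    r"\<": r"\b(?=\w)",    # boundary followed by a word character
    r"\>": r"\b(?<=\w)",   # boundary preceded by a word character
    r"\`": r"\A",          # beginning of the whole input
    r"\'": r"\Z",          # end of the whole input
}


def translate_gnu_anchors(pattern):
    """Rewrite GNU-only anchors so Python's re can compile the pattern."""
    for gnu, py in GNU_TO_PYTHON.items():
        pattern = pattern.replace(gnu, py)
    return pattern


word = translate_gnu_anchors(r"\<cat\>")
print(re.search(word, "a cat sat") is not None)    # True
print(re.search(word, "concatenate") is not None)  # False
```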
56
57
58	Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
59
60	The alternation operator is @samp{\|}.
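In this syntax the backslashed forms are the operators and the bare parentheses and vertical bar are ordinary characters, which is the reverse of Python's @code{re} module. The sketch below translates a small subset of such a pattern into Python syntax to make the difference concrete. It handles only grouping, alternation, back-references and the literal @samp{\+} and @samp{\?}; bracket expressions and the context rules for @samp{^}, @samp{$} and @samp{*} are not handled, and the function name is hypothetical.

```python
import re


def backslash_ops_to_python(pattern):
    """Translate \\( \\) \\| operators into Python's ( ) | (subset only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in "()|":
                out.append(nxt)          # backslashed form is the operator
            else:
                out.append(ch + nxt)     # \1, \+, \? and others pass through
            i += 2
        elif ch in "()|":
            out.append("\\" + ch)        # bare form is a literal character
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


translated = backslash_ops_to_python(r"\(ab\|cd\)\1")
print(translated)                                     # (ab|cd)\1
print(re.fullmatch(translated, "abab") is not None)   # True
print(re.fullmatch(translated, "abcd") is not None)   # False
```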
61
62	The character @samp{^} only represents the beginning of a string when it appears:
63	@enumerate
64
65	@item
66	At the beginning of a regular expression
67
68	@item After an open-group, signified by
69	@samp{\(}
70
71	@item After the alternation operator @samp{\|}
72
73	@end enumerate
74
75
76	The character @samp{$} only represents the end of a string when it appears:
77	@enumerate
78
79	@item At the end of a regular expression
80
81	@item Before a close-group, signified by
82	@samp{\)}
83	@item Before the alternation operator @samp{\|}
84
85	@end enumerate
86
87
88	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
89	@enumerate
90
91	@item At the beginning of a regular expression
92
93	@item After an open-group, signified by
94	@samp{\(}
95	@item After the alternation operator @samp{\|}
96
97	@end enumerate
98
99
100
101
102	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
103
104
105	@node awk regular expression syntax
106	@subsection @samp{awk} regular expression syntax
107
108
109	The character @samp{.} matches any single character except the null character.
110
111
112	@table @samp
113
114	@item +
115	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
116	@item ?
117	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
118	@item \+
119	matches a @samp{+}.
120	@item \?
121	matches a @samp{?}.
122	@end table
123
124
125	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
126
127	GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
128
129	Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.
130
131	The alternation operator is @samp{|}.
132
133	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
134
135	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
136	@enumerate
137
138	@item At the beginning of a regular expression
139
140	@item After an open-group, signified by
141	@samp{(}
142	@item After the alternation operator @samp{|}
143
144	@end enumerate
145
146
147
148
149	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
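The longest-match rule is worth contrasting with engines that take the first alternative that succeeds. Python's @code{re} module is such an engine, so the sketch below emulates the longest-match rule for a plain list of alternatives only. It is an illustration of the rule, not of how the GNU matcher is implemented, and @code{longest_alternative_match} is a hypothetical helper.

```python
import re


def longest_alternative_match(alternatives, text):
    """Return the longest prefix of text matched by any alternative, or None."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


# Python tries alternatives left to right and stops at the first success.
print(re.match("a|ab", "ab").group(0))               # a
# Under the longest-match rule the second alternative wins.
print(longest_alternative_match(["a", "ab"], "ab"))  # ab
```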
150
151
152	@node egrep regular expression syntax
153	@subsection @samp{egrep} regular expression syntax
154
155
156	The character @samp{.} matches any single character except newline.
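Whether @samp{.} matches a newline is one of the points on which the syntaxes in this file differ. Python's @code{re} module exposes the same choice through its @code{re.DOTALL} flag, which makes it a convenient stand-in for a quick experiment. The sketch below only illustrates the two behaviours; it says nothing about the null character, and the helper name is hypothetical.

```python
import re


def dot_pattern_matches(text, dot_matches_newline):
    """Match the pattern a.b against text, with or without newline support."""
    flags = re.DOTALL if dot_matches_newline else 0
    return re.fullmatch(r"a.b", text, flags) is not None


# "Any single character except newline", as in this syntax.
print(dot_pattern_matches("a\nb", dot_matches_newline=False))  # False
# "Any single character", as in the syntaxes without the exception.
print(dot_pattern_matches("a\nb", dot_matches_newline=True))   # True
```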
157
158
159	@table @samp
160
161	@item +
162	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
163	@item ?
164	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
165	@item \+
166	matches a @samp{+}.
167	@item \?
168	matches a @samp{?}.
169	@end table
170
171
172	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
173
174	GNU extensions are supported:
175	@enumerate
176
177	@item @samp{\w} matches a character within a word
178
179	@item @samp{\W} matches a character which is not within a word
180
181	@item @samp{\<} matches the beginning of a word
182
183	@item @samp{\>} matches the end of a word
184
185	@item @samp{\b} matches a word boundary
186
187	@item @samp{\B} matches a position which is not a word boundary
188
189	@item @samp{\`} matches the beginning of the whole input
190
191	@item @samp{\'} matches the end of the whole input
192
193	@end enumerate
194
195
196	Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
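Group numbering by opening parenthesis can be checked with any engine that uses unescaped parentheses for grouping and a backslash plus digit for back-references. Python's @code{re} module does, so it is used here as a stand-in; other details of its syntax differ from the one described in this section.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the outer group, group 2 is (a+), group 3 is (b+).
match = re.fullmatch(r"((a+)(b+))\2", "aabaa")

print(match.groups())   # ('aab', 'aa', 'b')
# \2 had to repeat what group 2 matched, so the input must end in "aa".
print(re.fullmatch(r"((a+)(b+))\2", "aabb") is not None)  # False
```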
197
198	The alternation operator is @samp{|}.
199
200	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
201
202	The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
203
204
205
206	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
207
208
209	@node emacs regular expression syntax
210	@subsection @samp{emacs} regular expression syntax
211
212
213	The character @samp{.} matches any single character except newline.
214
215
216	@table @samp
217
218	@item +
219	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
220	@item ?
221	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
222	@item \+
223	matches a @samp{+}.
224	@item \?
225	matches a @samp{?}.
226	@end table
227
228
229	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
230
231	GNU extensions are supported:
232	@enumerate
233
234	@item @samp{\w} matches a character within a word
235
236	@item @samp{\W} matches a character which is not within a word
237
238	@item @samp{\<} matches the beginning of a word
239
240	@item @samp{\>} matches the end of a word
241
242	@item @samp{\b} matches a word boundary
243
244	@item @samp{\B} matches a position which is not a word boundary
245
246	@item @samp{\`} matches the beginning of the whole input
247
248	@item @samp{\'} matches the end of the whole input
249
250	@end enumerate
251
252
253	Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
254
255	The alternation operator is @samp{\|}.
256
257	The character @samp{^} only represents the beginning of a string when it appears:
258	@enumerate
259
260	@item
261	At the beginning of a regular expression
262
263	@item After an open-group, signified by
264	@samp{\(}
265
266	@item After the alternation operator @samp{\|}
267
268	@end enumerate
269
270
271	The character @samp{$} only represents the end of a string when it appears:
272	@enumerate
273
274	@item At the end of a regular expression
275
276	@item Before a close-group, signified by
277	@samp{\)}
278	@item Before the alternation operator @samp{\|}
279
280	@end enumerate
281
282
283	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
284	@enumerate
285
286	@item At the beginning of a regular expression
287
288	@item After an open-group, signified by
289	@samp{\(}
290	@item After the alternation operator @samp{\|}
291
292	@end enumerate
293
294
295
296
297	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
298
299
300	@node gnu-awk regular expression syntax
301	@subsection @samp{gnu-awk} regular expression syntax
302
303
304	The character @samp{.} matches any single character.
305
306
307	@table @samp
308
309	@item +
310	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
311	@item ?
312	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
313	@item \+
314	matches a @samp{+}.
315	@item \?
316	matches a @samp{?}.
317	@end table
318
319
320	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
321
322	GNU extensions are supported:
323	@enumerate
324
325	@item @samp{\w} matches a character within a word
326
327	@item @samp{\W} matches a character which is not within a word
328
329	@item @samp{\<} matches the beginning of a word
330
331	@item @samp{\>} matches the end of a word
332
333	@item @samp{\b} matches a word boundary
334
335	@item @samp{\B} matches a position which is not a word boundary
336
337	@item @samp{\`} matches the beginning of the whole input
338
339	@item @samp{\'} matches the end of the whole input
340
341	@end enumerate
342
343
344	Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
345
346	The alternation operator is @samp{|}.
347
348	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
349
350	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
351	@enumerate
352
353	@item At the beginning of a regular expression
354
355	@item After an open-group, signified by
356	@samp{(}
357	@item After the alternation operator @samp{|}
358
359	@end enumerate
360
361
362
363
364	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
365
366
367	@node grep regular expression syntax
368	@subsection @samp{grep} regular expression syntax
369
370
371	The character @samp{.} matches any single character except newline.
372
373
374	@table @samp
375
376	@item \+
377	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
378	@item \?
379	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
380	@item + and ?
381	match themselves.
382	@end table
383
384
385	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
386
387	GNU extensions are supported:
388	@enumerate
389
390	@item @samp{\w} matches a character within a word
391
392	@item @samp{\W} matches a character which is not within a word
393
394	@item @samp{\<} matches the beginning of a word
395
396	@item @samp{\>} matches the end of a word
397
398	@item @samp{\b} matches a word boundary
399
400	@item @samp{\B} matches a position which is not a word boundary
401
402	@item @samp{\`} matches the beginning of the whole input
403
404	@item @samp{\'} matches the end of the whole input
405
406	@end enumerate
407
408
409	Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
410
411	The alternation operator is @samp{\|}.
412
413	The character @samp{^} only represents the beginning of a string when it appears:
414	@enumerate
415
416	@item
417	At the beginning of a regular expression
418
419	@item After an open-group, signified by
420	@samp{\(}
421
422	@item After a newline
423
424	@item After the alternation operator @samp{\|}
425
426	@end enumerate
427
428
429	The character @samp{$} only represents the end of a string when it appears:
430	@enumerate
431
432	@item At the end of a regular expression
433
434	@item Before a close-group, signified by
435	@samp{\)}
436	@item Before a newline
437
438	@item Before the alternation operator @samp{\|}
439
440	@end enumerate
441
442
443	@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
444	@enumerate
445
446	@item At the beginning of a regular expression
447
448	@item After an open-group, signified by
449	@samp{\(}
450	@item After a newline
451
452	@item After the alternation operator @samp{\|}
453
454	@end enumerate
455
456
457	Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
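In this syntax the repetition operators @samp{\+} and @samp{\?} and the interval delimiters are written with a backslash, while the bare characters stand for themselves. The sketch below converts just those tokens into the notation of Python's @code{re} module so the behaviour can be tried out. Grouping, alternation and bracket expressions are deliberately not handled, and the function name is hypothetical.

```python
import re


def backslash_repetition_to_python(pattern):
    """Swap the roles of \\+ \\? \\{ \\} and their bare forms (subset only)."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in "+?{}":
                out.append(nxt)        # backslashed form is the operator
            else:
                out.append(ch + nxt)   # other escapes pass through untouched
            i += 2
        elif ch in "+?{}":
            out.append("\\" + ch)      # bare form matches itself
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)


interval = backslash_repetition_to_python(r"a\{2,3\}")
print(interval)                                     # a{2,3}
print(re.fullmatch(interval, "aaa") is not None)    # True
print(re.fullmatch(interval, "a") is not None)      # False
```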
458
459	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
460
461
462	@node posix-awk regular expression syntax
463	@subsection @samp{posix-awk} regular expression syntax
464
465
466	The character @samp{.} matches any single character except the null character.
467
468
469	@table @samp
470
471	@item +
472	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
473	@item ?
474	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
475	@item \+
476	matches a @samp{+}.
477	@item \?
478	matches a @samp{?}.
479	@end table
480
481
482	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
483
484	GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
485
486	Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
487
488	The alternation operator is @samp{|}.
489
490	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
491
492	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
493	@enumerate
494
495	@item At the beginning of a regular expression
496
497	@item After an open-group, signified by
498	@samp{(}
499	@item After the alternation operator @samp{|}
500
501	@end enumerate
502
503
504	Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.
505
506	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
507
508
509	@node posix-basic regular expression syntax
510	@subsection @samp{posix-basic} regular expression syntax
511
512
513	The character @samp{.} matches any single character except the null character.
514
515
516	@table @samp
517
518	@item \+
519	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
520	@item \?
521	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
522	@item + and ?
523	match themselves.
524	@end table
525
526
527	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
528
529	GNU extensions are supported:
530	@enumerate
531
532	@item @samp{\w} matches a character within a word
533
534	@item @samp{\W} matches a character which is not within a word
535
536	@item @samp{\<} matches the beginning of a word
537
538	@item @samp{\>} matches the end of a word
539
540	@item @samp{\b} matches a word boundary
541
542	@item @samp{\B} matches a position which is not a word boundary
543
544	@item @samp{\`} matches the beginning of the whole input
545
546	@item @samp{\'} matches the end of the whole input
547
548	@end enumerate
549
550
551	Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
552
553	The alternation operator is @samp{\|}.
554
555	The character @samp{^} only represents the beginning of a string when it appears:
556	@enumerate
557
558	@item
559	At the beginning of a regular expression
560
561	@item After an open-group, signified by
562	@samp{\(}
563
564	@item After the alternation operator @samp{\|}
565
566	@end enumerate
567
568
569	The character @samp{$} only represents the end of a string when it appears:
570	@enumerate
571
572	@item At the end of a regular expression
573
574	@item Before a close-group, signified by
575	@samp{\)}
576	@item Before the alternation operator @samp{\|}
577
578	@end enumerate
579
580
581	@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:
582	@enumerate
583
584	@item At the beginning of a regular expression
585
586	@item After an open-group, signified by
587	@samp{\(}
588	@item After the alternation operator @samp{\|}
589
590	@end enumerate
591
592
593	Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
594
595	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
596
597
598	@node posix-egrep regular expression syntax
599	@subsection @samp{posix-egrep} regular expression syntax
600
601
602	The character @samp{.} matches any single character except newline.
603
604
605	@table @samp
606
607	@item +
608	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
609	@item ?
610	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
611	@item \+
612	matches a @samp{+}.
613	@item \?
614	matches a @samp{?}.
615	@end table
616
617
618	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
619
620	GNU extensions are supported:
621	@enumerate
622
623	@item @samp{\w} matches a character within a word
624
625	@item @samp{\W} matches a character which is not within a word
626
627	@item @samp{\<} matches the beginning of a word
628
629	@item @samp{\>} matches the end of a word
630
631	@item @samp{\b} matches a word boundary
632
633	@item @samp{\B} matches a position which is not a word boundary
634
635	@item @samp{\`} matches the beginning of the whole input
636
637	@item @samp{\'} matches the end of the whole input
638
639	@end enumerate
640
641
642	Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
643
644	The alternation operator is @samp{|}.
645
646	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
647
648	The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
649
650	Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals, for example @samp{a@{1} is treated as @samp{a\@{1}.
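Treating an unterminated interval as literal text is easy to observe in Python's @code{re} module, which happens to follow the same convention. The sketch below uses it purely as an analogy; it does not exercise the GNU matcher, and @code{same_result} is a hypothetical helper.

```python
import re


def same_result(pattern_a, pattern_b, text):
    """True if both patterns agree on whether they match the whole text."""
    a = re.fullmatch(pattern_a, text) is not None
    b = re.fullmatch(pattern_b, text) is not None
    return a == b


# The unterminated interval a{1 is read as the three literal characters a { 1.
print(re.fullmatch(r"a{1", "a{1") is not None)   # True
print(same_result(r"a{1", r"a\{1", "a{1"))       # True
# A well-formed interval is still an interval.
print(re.fullmatch(r"a{1}", "a") is not None)    # True
```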
651
652	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
653
654
655	@node posix-extended regular expression syntax
656	@subsection @samp{posix-extended} regular expression syntax
657
658
659	The character @samp{.} matches any single character except the null character.
660
661
662	@table @samp
663
664	@item +
665	indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
666	@item ?
667	indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
668	@item \+
669	matches a @samp{+}.
670	@item \?
671	matches a @samp{?}.
672	@end table
673
674
675	Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
676
677	GNU extensions are supported:
678	@enumerate
679
680	@item @samp{\w} matches a character within a word
681
682	@item @samp{\W} matches a character which is not within a word
683
684	@item @samp{\<} matches the beginning of a word
685
686	@item @samp{\>} matches the end of a word
687
688	@item @samp{\b} matches a word boundary
689
690	@item @samp{\B} matches a position which is not a word boundary
691
692	@item @samp{\`} matches the beginning of the whole input
693
694	@item @samp{\'} matches the end of the whole input
695
696	@end enumerate
697
698
699	Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
700
701	The alternation operator is @samp{|}.
702
703	The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
704
705	@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:
706	@enumerate
707
708	@item At the beginning of a regular expression
709
710	@item After an open-group, signified by
711	@samp{(}
712	@item After the alternation operator @samp{|}
713
714	@end enumerate
715
716
717	Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.
718
719	The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
720

source: trunk/essentials/sys-apps/findutils/doc/regexprops.texi