This is sed.info, produced by makeinfo version 6.8dev from sed.texi.

This file documents version 4.9 of GNU ‘sed’, a stream editor.

   Copyright © 1998-2022 Free Software Foundation, Inc.

     Permission is granted to copy, distribute and/or modify this
     document under the terms of the GNU Free Documentation License,
     Version 1.3 or any later version published by the Free Software
     Foundation; with no Invariant Sections, no Front-Cover Texts, and
     no Back-Cover Texts. A copy of the license is included in the
     section entitled “GNU Free Documentation License”.
INFO-DIR-SECTION Text creation and manipulation
START-INFO-DIR-ENTRY
* sed: (sed).                   Stream EDitor.

END-INFO-DIR-ENTRY


File: sed.info,  Node: Top,  Next: Introduction,  Up: (dir)

GNU ‘sed’
*********

* Menu:

* Introduction::               Introduction
* Invoking sed::               Invocation
* sed scripts::                ‘sed’ scripts
* sed addresses::              Addresses: selecting lines
* sed regular expressions::    Regular expressions: selecting text
* advanced sed::               Advanced ‘sed’: cycles and buffers
* Examples::                   Some sample scripts
* Limitations::                Limitations and (non-)limitations of GNU ‘sed’
* Other Resources::            Other resources for learning about ‘sed’
* Reporting Bugs::             Reporting bugs
* GNU Free Documentation License:: Copying and sharing this manual
* Concept Index::              A menu with all the topics in this manual.
* Command and Option Index::   A menu with all ‘sed’ commands and
                               command-line options.


File: sed.info,  Node: Introduction,  Next: Invoking sed,  Prev: Top,  Up: Top

1 Introduction
**************

‘sed’ is a stream editor. A stream editor is used to perform basic text
transformations on an input stream (a file or input from a pipeline).
While in some ways similar to an editor which permits scripted edits
(such as ‘ed’), ‘sed’ works by making only one pass over the input(s),
and is consequently more efficient. But it is ‘sed’’s ability to filter
text in a pipeline which particularly distinguishes it from other types
of editors.


File: sed.info,  Node: Invoking sed,  Next: sed scripts,  Prev: Introduction,  Up: Top

2 Running sed
*************

This chapter covers how to run ‘sed’. Details of ‘sed’ scripts and
individual ‘sed’ commands are discussed in the next chapter.

* Menu:

* Overview::
* Command-Line Options::
* Exit status::


File: sed.info,  Node: Overview,  Next: Command-Line Options,  Up: Invoking sed

2.1 Overview
============

Normally ‘sed’ is invoked like this:

     sed SCRIPT INPUTFILE...

   For example, to change every ‘hello’ to ‘world’ in the file
‘input.txt’:

     sed 's/hello/world/g' input.txt > output.txt

   Without the ‘g’ (global) modifier, ‘sed’ affects only the first
instance per line.

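   As a quick check of the ‘g’ modifier, the following sketch runs both
forms of the substitution on one sample line. The input text and the
variable names are made up for illustration.

```shell
# One input line containing 'hello' twice (sample text, not from the manual).
line='hello hello'

# Without 'g': only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/hello/world/')

# With 'g': every match on the line is replaced.
all=$(printf '%s\n' "$line" | sed 's/hello/world/g')

printf '%s\n%s\n' "$first_only" "$all"
```

   The first result keeps the second ‘hello’; the second result has none
left.
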
   If you do not specify INPUTFILE, or if INPUTFILE is ‘-’, ‘sed’
filters the contents of the standard input. The following commands are
equivalent:

     sed 's/hello/world/g' input.txt > output.txt
     sed 's/hello/world/g' < input.txt > output.txt
     cat input.txt | sed 's/hello/world/g' - > output.txt

   ‘sed’ writes output to standard output. Use ‘-i’ to edit files
in-place instead of printing to standard output. See also the ‘W’ and
‘s///w’ commands for writing output to other files. The following
command modifies ‘file.txt’ and does not produce any output:

     sed -i 's/hello/world/' file.txt

   By default ‘sed’ prints each input line after processing it (lines
deleted by commands such as ‘d’ are not printed). Use ‘-n’ to suppress
this automatic output, and the ‘p’ command to print specific lines. The
following command prints only line 45 of the input file:

     sed -n '45p' file.txt

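   The interaction between ‘-n’ and ‘p’ can be tried without an input
file. This is a minimal sketch that assumes the ‘seq’ utility is
available to generate numbered lines; the variable names are
illustrative.

```shell
# Generate 50 numbered lines and print only line 45.
only_45=$(seq 1 50 | sed -n '45p')

# Without -n, the addressed line appears twice:
# once from 'p' and once from the automatic print at the end of the cycle.
doubled=$(printf 'a\nb\n' | sed '2p')

printf '%s\n--\n%s\n' "$only_45" "$doubled"
```
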
   ‘sed’ treats multiple input files as one long stream. The following
example prints the first line of the first file (‘one.txt’) and the last
line of the last file (‘three.txt’). Use ‘-s’ to treat each file as a
separate stream instead.

     sed -n '1p ; $p' one.txt two.txt three.txt

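   The difference between the default behavior and ‘-s’ can be seen with
three small files. The sketch below creates them in a temporary
directory; the file contents are invented for the example.

```shell
# Create three two-line files in a throwaway directory.
tmp=$(mktemp -d)
printf 'one-first\none-last\n'     > "$tmp/one.txt"
printf 'two-first\ntwo-last\n'     > "$tmp/two.txt"
printf 'three-first\nthree-last\n' > "$tmp/three.txt"

# Default: one long stream, so '1' and '$' each match exactly once.
joined=$(sed -n '1p ; $p' "$tmp/one.txt" "$tmp/two.txt" "$tmp/three.txt")

# With -s: '1' and '$' are evaluated separately for every file.
separate=$(sed -s -n '1p ; $p' "$tmp/one.txt" "$tmp/two.txt" "$tmp/three.txt")

printf '%s\n--\n%s\n' "$joined" "$separate"
rm -r "$tmp"
```

   The first command prints two lines in total, the second prints the
first and last line of each file.
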
   Without ‘-e’ or ‘-f’ options, ‘sed’ uses the first non-option
parameter as the SCRIPT, and the following non-option parameters as
input files. If ‘-e’ or ‘-f’ options are used to specify a SCRIPT, all
non-option parameters are taken as input files. Options ‘-e’ and ‘-f’
can be combined, and can appear multiple times (in which case the final
effective SCRIPT will be the concatenation of all the individual
SCRIPTs).

   The following examples are equivalent:

     sed 's/hello/world/' input.txt > output.txt

     sed -e 's/hello/world/' input.txt > output.txt
     sed --expression='s/hello/world/' input.txt > output.txt

     echo 's/hello/world/' > myscript.sed
     sed -f myscript.sed input.txt > output.txt
     sed --file=myscript.sed input.txt > output.txt


File: sed.info,  Node: Command-Line Options,  Next: Exit status,  Prev: Overview,  Up: Invoking sed

2.2 Command-Line Options
========================

The full format for invoking ‘sed’ is:

     sed OPTIONS... [SCRIPT] [INPUTFILE...]

   ‘sed’ may be invoked with the following command-line options:

‘--version’
     Print out the version of ‘sed’ that is being run and a copyright
     notice, then exit.

‘--help’
     Print a usage message briefly summarizing these command-line
     options and the bug-reporting address, then exit.

‘-n’
‘--quiet’
‘--silent’
     By default, ‘sed’ prints out the pattern space at the end of each
     cycle through the script (*note How ‘sed’ works: Execution Cycle.).
     These options disable this automatic printing, and ‘sed’ only
     produces output when explicitly told to via the ‘p’ command.

‘--debug’
     Print the input sed program in canonical form, and annotate program
     execution.
          $ echo 1 | sed '\%1%s21232'
          3

          $ echo 1 | sed --debug '\%1%s21232'
          SED PROGRAM:
            /1/ s/1/3/
          INPUT:   'STDIN' line 1
          PATTERN: 1
          COMMAND: /1/ s/1/3/
          PATTERN: 3
          END-OF-CYCLE:
          3

‘-e SCRIPT’
‘--expression=SCRIPT’
     Add the commands in SCRIPT to the set of commands to be run while
     processing the input.

‘-f SCRIPT-FILE’
‘--file=SCRIPT-FILE’
     Add the commands contained in the file SCRIPT-FILE to the set of
     commands to be run while processing the input.

‘-i[SUFFIX]’
‘--in-place[=SUFFIX]’
     This option specifies that files are to be edited in-place. GNU
     ‘sed’ does this by creating a temporary file and sending output to
     this file rather than to the standard output.(1)

     This option implies ‘-s’.

     When the end of the file is reached, the temporary file is renamed
     to the output file’s original name. The extension, if supplied, is
     used to modify the name of the old file before renaming the
     temporary file, thereby making a backup copy.(2)

     This rule is followed: if the extension doesn’t contain a ‘*’, then
     it is appended to the end of the current filename as a suffix; if
     the extension does contain one or more ‘*’ characters, then _each_
     asterisk is replaced with the current filename. This allows you to
     add a prefix to the backup file, instead of (or in addition to) a
     suffix, or even to place backup copies of the original files into
     another directory (provided the directory already exists).

     If no extension is supplied, the original file is overwritten
     without making a backup.

     Because ‘-i’ takes an optional argument, it should not be followed
     by other short options:
     ‘sed -Ei '...' FILE’
          Same as ‘-E -i’ with no backup suffix: ‘FILE’ will be edited
          in-place without creating a backup.

     ‘sed -iE '...' FILE’
          This is equivalent to ‘--in-place=E’, creating ‘FILEE’ as
          backup of ‘FILE’.

     Be cautious of using ‘-n’ with ‘-i’: the former disables automatic
     printing of lines and the latter changes the file in-place without
     a backup. Used carelessly (and without an explicit ‘p’ command),
     the output file will be empty:
          # WRONG USAGE: 'FILE' will be truncated.
          sed -ni 's/foo/bar/' FILE

‘-l N’
‘--line-length=N’
     Specify the default line-wrap length for the ‘l’ command. A length
     of 0 (zero) means to never wrap long lines. If not specified, it
     is taken to be 70.

‘--posix’
     GNU ‘sed’ includes several extensions to POSIX sed. In order to
     simplify writing portable scripts, this option disables all the
     extensions that this manual documents, including additional
     commands. Most of the extensions accept ‘sed’ programs that are
     outside the syntax mandated by POSIX, but some of them (such as the
     behavior of the ‘N’ command described in *note Reporting Bugs::)
     actually violate the standard. If you want to disable only the
     latter kind of extension, you can set the ‘POSIXLY_CORRECT’
     variable to a non-empty value.

‘-b’
‘--binary’
     This option is available on every platform, but is only effective
     where the operating system makes a distinction between text files
     and binary files. When such a distinction is made (as is the case
     for MS-DOS, Windows, Cygwin), text files are composed of lines
     separated by a carriage return _and_ a line feed character, and
     ‘sed’ does not see the ending CR. When this option is specified,
     ‘sed’ will open input files in binary mode, thus not requesting
     this special processing and considering lines to end at a line
     feed.

‘--follow-symlinks’
     This option is available only on platforms that support symbolic
     links and has an effect only if option ‘-i’ is specified. In this
     case, if the file that is specified on the command line is a
     symbolic link, ‘sed’ will follow the link and edit the ultimate
     destination of the link. The default behavior is to break the
     symbolic link, so that the link destination will not be modified.

‘-E’
‘-r’
‘--regexp-extended’
     Use extended regular expressions rather than basic regular
     expressions. Extended regexps are those that ‘egrep’ accepts; they
     can be clearer because they usually have fewer backslashes.
     Historically this was a GNU extension, but the ‘-E’ extension has
     since been added to the POSIX standard
     (http://austingroupbugs.net/view.php?id=528), so use ‘-E’ for
     portability. GNU sed has accepted ‘-E’ as an undocumented option
     for years, and *BSD seds have accepted ‘-E’ for years as well, but
     scripts that use ‘-E’ might not port to other older systems. *Note
     Extended regular expressions: ERE syntax.

‘-s’
‘--separate’
     By default, ‘sed’ will consider the files specified on the command
     line as a single continuous long stream. This GNU ‘sed’ extension
     allows the user to consider them as separate files: range addresses
     (such as ‘/abc/,/def/’) are not allowed to span several files, line
     numbers are relative to the start of each file, ‘$’ refers to the
     last line of each file, and files invoked from the ‘R’ commands are
     rewound at the start of each file.

‘--sandbox’
     In sandbox mode, ‘e/w/r’ commands are rejected: programs
     containing them will be aborted without being run. Sandbox mode
     ensures ‘sed’ operates only on the input files designated on the
     command line, and cannot run external programs.

‘-u’
‘--unbuffered’
     Buffer both input and output as minimally as practical. (This is
     particularly useful if the input is coming from the likes of ‘tail
     -f’, and you wish to see the transformed output as soon as
     possible.)

‘-z’
‘--null-data’
‘--zero-terminated’
     Treat the input as a set of lines, each terminated by a zero byte
     (the ASCII ‘NUL’ character) instead of a newline. This option can
     be used with commands like ‘sort -z’ and ‘find -print0’ to process
     arbitrary file names.

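   The effect of ‘-z’ on record boundaries can be sketched as follows.
The input is generated with ‘printf’ and is purely illustrative; ‘tr’ is
used at the end only to make the NUL separators visible as newlines.

```shell
# Two NUL-terminated records; the first one contains an embedded newline.
# With -z the newline is ordinary data, so 's' can match and replace it.
out=$(printf 'one\ntwo\0three\0' | sed -z 's/\n/ /g' | tr '\0' '\n')

printf '%s\n' "$out"
```

   Because each record ends at a zero byte, the embedded newline is part
of the pattern space and is reachable by the regular expression.
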
   If no ‘-e’, ‘-f’, ‘--expression’, or ‘--file’ options are given on
the command line, then the first non-option argument on the command line
is taken to be the SCRIPT to be executed.

   If any command-line parameters remain after processing the above,
these parameters are interpreted as the names of input files to be
processed. A file name of ‘-’ refers to the standard input stream. The
standard input will be processed if no file names are specified.

   ---------- Footnotes ----------

   (1) This applies to commands such as ‘=’, ‘a’, ‘c’, ‘i’, ‘l’, ‘p’.
You can still write to the standard output by using the ‘w’ or ‘W’
commands together with the ‘/dev/stdout’ special file.

   (2) Note that GNU ‘sed’ creates the backup file whether or not any
output is actually changed.

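   The backup-naming rule for ‘-i’ and footnote (2) can be exercised
with a few throwaway files. This is a sketch only: the file names, the
‘bak’ directory and the ‘.orig’ suffix are arbitrary choices, and the
commands are run from inside the temporary directory so that every file
name is a plain base name.

```shell
# Work in a throwaway directory; all names below are illustrative.
old_dir=$PWD
tmp=$(mktemp -d)
cd "$tmp" || exit 1
mkdir bak
printf 'hello\n' > a.txt
printf 'hello\n' > b.txt
printf 'hello\n' > c.txt

# Suffix without '*': appended to the file name, giving a.txt.orig.
sed -i.orig 's/hello/world/' a.txt

# Suffix with '*': each '*' becomes the file name, giving bak/b.txt.
# The directory 'bak' must already exist.
sed -i'bak/*' 's/hello/world/' b.txt

# Nothing matches here, but a backup is still created (footnote 2).
sed -i.orig 's/nomatch/x/' c.txt

edited_a=$(cat a.txt)
backup_a=$(cat a.txt.orig)
backup_b=$(cat bak/b.txt)
if [ -e c.txt.orig ]; then backup_c=present; else backup_c=missing; fi

cd "$old_dir" && rm -r "$tmp"
printf '%s %s %s %s\n' "$edited_a" "$backup_a" "$backup_b" "$backup_c"
```
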

File: sed.info,  Node: Exit status,  Prev: Command-Line Options,  Up: Invoking sed

2.3 Exit status
===============

An exit status of zero indicates success, and a nonzero value indicates
failure. GNU ‘sed’ returns the following exit status values:

0
     Successful completion.

1
     Invalid command, invalid syntax, invalid regular expression or a
     GNU ‘sed’ extension command used with ‘--posix’.

2
     One or more of the input files specified on the command line could
     not be opened (e.g. if a file is not found, or read permission is
     denied). Processing continued with other files.

4
     An I/O error or a serious processing error occurred at run time;
     GNU ‘sed’ aborted immediately.

   Additionally, the commands ‘q’ and ‘Q’ can be used to terminate ‘sed’
with a custom exit code value (this is a GNU ‘sed’ extension):

     $ echo | sed 'Q42' ; echo $?
     42

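   The status values listed above can be observed from a shell script.
The following sketch triggers statuses 0, 1 and 2 and a custom status;
the missing path ‘/nonexistent/input.txt’ is a hypothetical name assumed
not to exist, and status 4 is not demonstrated.

```shell
# Status 0: the script ran to completion.
ok=0
echo x | sed 's/x/y/' > /dev/null || ok=$?

# Status 1: invalid syntax (the 's' command is not terminated).
bad_script=0
echo x | sed 's/x/y' > /dev/null 2>&1 || bad_script=$?

# Status 2: an input file could not be opened (hypothetical missing path).
missing_file=0
sed -n 'p' /nonexistent/input.txt > /dev/null 2>&1 || missing_file=$?

# Custom status from 'q' (GNU extension).
custom=0
echo x | sed 'q5' > /dev/null || custom=$?

echo "$ok $bad_script $missing_file $custom"
```

   Each status is captured with ‘|| var=$?’ so that the sketch also
works in scripts that run under ‘set -e’.
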

File: sed.info,  Node: sed scripts,  Next: sed addresses,  Prev: Invoking sed,  Up: Top

3 ‘sed’ scripts
***************

* Menu:

* sed script overview::      ‘sed’ script overview
* sed commands list::        ‘sed’ commands summary
* The "s" Command::          ‘sed’’s Swiss Army Knife
* Common Commands::          Often used commands
* Other Commands::           Less frequently used commands
* Programming Commands::     Commands for ‘sed’ gurus
* Extended Commands::        Commands specific to GNU ‘sed’
* Multiple commands syntax:: Extension for easier scripting


File: sed.info,  Node: sed script overview,  Next: sed commands list,  Up: sed scripts

3.1 ‘sed’ script overview
=========================

A ‘sed’ program consists of one or more ‘sed’ commands, passed in by one
or more of the ‘-e’, ‘-f’, ‘--expression’, and ‘--file’ options, or the
first non-option argument if none of these options are used. This
document will refer to “the” ‘sed’ script; this is understood to mean
the in-order concatenation of all of the SCRIPTs and SCRIPT-FILEs passed
in. *Note Overview::.

   ‘sed’ commands follow this syntax:

     [addr]X[options]

   X is a single-letter ‘sed’ command. ‘[addr]’ is an optional line
address. If ‘[addr]’ is specified, the command X will be executed only
on the matched lines. ‘[addr]’ can be a single line number, a regular
expression, or a range of lines (*note sed addresses::). Additional
‘[options]’ are used for some ‘sed’ commands.

   The following example deletes lines 30 to 35 in the input. ‘30,35’
is an address range. ‘d’ is the delete command:

     sed '30,35d' input.txt > output.txt

   The following example prints all input until a line starting with the
string ‘foo’ is found. If such a line is found, ‘sed’ will terminate
with exit status 42. If no such line is found (and no other error
occurs), ‘sed’ will exit with status 0. ‘/^foo/’ is a
regular-expression address. ‘q’ is the quit command. ‘42’ is the
command option.

     sed '/^foo/q42' input.txt > output.txt

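   Both outcomes of the ‘/^foo/q42’ example can be reproduced with
generated input. The sample lines and variable names below are invented
for the sketch.

```shell
# Input with a line starting with 'foo': sed prints up to and including
# that line, then quits with status 42.
found_status=0
found_out=$(printf 'alpha\nbeta\nfoo bar\ngamma\n' | sed '/^foo/q42') || found_status=$?

# Input without such a line: everything is printed and the status is 0.
none_status=0
none_out=$(printf 'alpha\nbeta\n' | sed '/^foo/q42') || none_status=$?

printf '%s\n[%s]\n%s\n[%s]\n' "$found_out" "$found_status" "$none_out" "$none_status"
```

   Note that ‘q’ prints the matching line before exiting, so ‘foo bar’
is part of the output while ‘gamma’ is not.
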
   Commands within a SCRIPT or SCRIPT-FILE can be separated by
semicolons (‘;’) or newlines (ASCII 10). Multiple scripts can be
specified with ‘-e’ or ‘-f’ options.

   The following examples are all equivalent. They perform two ‘sed’
operations: deleting any lines matching the regular expression ‘/^foo/’,
and replacing all occurrences of the string ‘hello’ with ‘world’:

     sed '/^foo/d ; s/hello/world/g' input.txt > output.txt

     sed -e '/^foo/d' -e 's/hello/world/g' input.txt > output.txt

     echo '/^foo/d' > script.sed
     echo 's/hello/world/g' >> script.sed
     sed -f script.sed input.txt > output.txt

     echo 's/hello/world/g' > script2.sed
     sed -e '/^foo/d' -f script2.sed input.txt > output.txt

   Commands ‘a’, ‘c’, ‘i’, due to their syntax, cannot be followed by
semicolons working as command separators and thus should be terminated
with newlines or be placed at the end of a SCRIPT or SCRIPT-FILE.
Commands can also be preceded with optional non-significant whitespace
characters. *Note Multiple commands syntax::.

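   The restriction on ‘a’, ‘c’ and ‘i’ is easy to trip over. The sketch
below uses the one-line ‘a TEXT’ form (a GNU extension) on two sample
input lines to show what happens when a semicolon follows the text, and
one way to keep the commands separate.

```shell
# The ';' does not end the 'a' command: the rest of the line,
# including '; 2d', becomes the text to append.
swallowed=$(printf 'x\ny\n' | sed '1a hello ; 2d')

# Placing 'a' at the end of its own -e script terminates it properly,
# so '2d' is parsed as a separate command.
separate=$(printf 'x\ny\n' | sed -e '1a hello' -e '2d')

printf '%s\n--\n%s\n' "$swallowed" "$separate"
```
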

File: sed.info,  Node: sed commands list,  Next: The "s" Command,  Prev: sed script overview,  Up: sed scripts

3.2 ‘sed’ commands summary
==========================

The following commands are supported in GNU ‘sed’. Some are standard
POSIX commands, while others are GNU extensions. Details and examples
for each command are in the following sections. (Mnemonics) are shown
in parentheses.

‘a\’
‘TEXT’
     Append TEXT after a line.

‘a TEXT’
     Append TEXT after a line (alternative syntax).

‘b LABEL’
     Branch unconditionally to LABEL. The LABEL may be omitted, in
     which case the next cycle is started.

‘c\’
‘TEXT’
     Replace (change) lines with TEXT.

‘c TEXT’
     Replace (change) lines with TEXT (alternative syntax).

‘d’
     Delete the pattern space; immediately start next cycle.

‘D’
     If pattern space contains newlines, delete text in the pattern
     space up to the first newline, and restart cycle with the resultant
     pattern space, without reading a new line of input.

     If pattern space contains no newline, start a normal new cycle as
     if the ‘d’ command was issued.

‘e’
     Executes the command that is found in pattern space and replaces
     the pattern space with the output; a trailing newline is
     suppressed.

‘e COMMAND’
     Executes COMMAND and sends its output to the output stream. The
     command can run across multiple lines, all but the last ending with
     a back-slash.

‘F’
     (filename) Print the file name of the current input file (with a
     trailing newline).

‘g’
     Replace the contents of the pattern space with the contents of the
     hold space.

‘G’
     Append a newline to the contents of the pattern space, and then
     append the contents of the hold space to that of the pattern space.

‘h’
     (hold) Replace the contents of the hold space with the contents of
     the pattern space.

‘H’
     Append a newline to the contents of the hold space, and then append
     the contents of the pattern space to that of the hold space.

‘i\’
‘TEXT’
     Insert TEXT before a line.

‘i TEXT’
     Insert TEXT before a line (alternative syntax).

‘l’
     Print the pattern space in an unambiguous form.

‘n’
     (next) If auto-print is not disabled, print the pattern space,
     then, regardless, replace the pattern space with the next line of
     input. If there is no more input then ‘sed’ exits without
     processing any more commands.

‘N’
     Add a newline to the pattern space, then append the next line of
     input to the pattern space. If there is no more input then ‘sed’
     exits without processing any more commands.

‘p’
     Print the pattern space.

‘P’
     Print the pattern space, up to the first newline.

‘q[EXIT-CODE]’
     (quit) Exit ‘sed’ without processing any more commands or input.

‘Q[EXIT-CODE]’
     (quit) This command is the same as ‘q’, but will not print the
     contents of pattern space. Like ‘q’, it provides the ability to
     return an exit code to the caller.

‘r FILENAME’
     Queue the contents of FILENAME to be read and inserted into the
     output stream at the end of the current cycle, or when the next
     input line is read.

‘R FILENAME’
     Queue a line of FILENAME to be read and inserted into the output
     stream at the end of the current cycle, or when the next input line
     is read.

‘s/REGEXP/REPLACEMENT/[FLAGS]’
     (substitute) Match the regular-expression against the content of
     the pattern space. If found, replace matched string with
     REPLACEMENT.

‘t LABEL’
     (test) Branch to LABEL only if there has been a successful
     ‘s’ubstitution since the last input line was read or conditional
     branch was taken. The LABEL may be omitted, in which case the next
     cycle is started.

‘T LABEL’
     (test) Branch to LABEL only if there have been no successful
     ‘s’ubstitutions since the last input line was read or conditional
     branch was taken. The LABEL may be omitted, in which case the next
     cycle is started.

‘v [VERSION]’
     (version) This command does nothing, but makes ‘sed’ fail if GNU
     ‘sed’ extensions are not supported, or if the requested version is
     not available.

‘w FILENAME’
     Write the pattern space to FILENAME.

‘W FILENAME’
     Write to FILENAME the portion of the pattern space up to the first
     newline.

‘x’
     Exchange the contents of the hold and pattern spaces.

‘y/SOURCE-CHARS/DEST-CHARS/’
     Transliterate any characters in the pattern space which match any
     of the SOURCE-CHARS with the corresponding character in DEST-CHARS.

‘z’
     (zap) This command empties the content of pattern space.

‘#’
     A comment, until the next newline.

‘{ CMD ; CMD ... }’
     Group several commands together.

‘=’
     Print the current input line number (with a trailing newline).

‘: LABEL’
     Specify the location of LABEL for branch commands (‘b’, ‘t’, ‘T’).

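   Several of the commands summarized above are easiest to understand in
combination. The following sketch shows two short scripts on generated
input: one uses the hold space (‘G’, ‘h’) to print lines in reverse
order, the other uses ‘N’ to join pairs of lines. The input data and
variable names are illustrative only.

```shell
# Reverse the order of lines.
#   1!G  on every line except the first, append the hold space
#   h    copy the accumulated pattern space into the hold space
#   $p   on the last line, print the result
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

# Join each pair of lines.
#   N          append the next input line, separated by a newline
#   s/\n/ /    replace that embedded newline with a space
paired=$(printf '1\n2\n3\n4\n' | sed 'N;s/\n/ /')

printf '%s\n--\n%s\n' "$reversed" "$paired"
```
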

File: sed.info,  Node: The "s" Command,  Next: Common Commands,  Prev: sed commands list,  Up: sed scripts

3.3 The ‘s’ Command
===================

The ‘s’ command (as in substitute) is probably the most important in
‘sed’ and has a lot of different options. The syntax of the ‘s’ command
is ‘s/REGEXP/REPLACEMENT/FLAGS’.

   Its basic concept is simple: the ‘s’ command attempts to match the
pattern space against the supplied regular expression REGEXP; if the
match is successful, then that portion of the pattern space which was
matched is replaced with REPLACEMENT.

   For details about REGEXP syntax *note Regular Expression Addresses:
Regexp Addresses.

   The REPLACEMENT can contain ‘\N’ (N being a number from 1 to 9,
inclusive) references, which refer to the portion of the match which is
contained between the Nth ‘\(’ and its matching ‘\)’. Also, the
REPLACEMENT can contain unescaped ‘&’ characters which reference the
whole matched portion of the pattern space.

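   Back-references and ‘&’ can be illustrated on a single sample line.
The sketch below also repeats the first substitution with ‘-E’, where
groups are written without backslashes; the input text is made up.

```shell
# \1 and \2 refer to the first and second \( ... \) groups.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

# The same substitution using extended regular expressions (-E).
swapped_ere=$(echo 'hello world' | sed -E 's/(hello) (world)/\2 \1/')

# '&' stands for the whole matched text (here the first run of letters).
bracketed=$(echo 'hello world' | sed 's/[a-z]*/[&]/')

printf '%s\n%s\n%s\n' "$swapped" "$swapped_ere" "$bracketed"
```
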
   The ‘/’ characters may be uniformly replaced by any other single
character within any given ‘s’ command. The ‘/’ character (or whatever
other character is used in its stead) can appear in the REGEXP or
REPLACEMENT only if it is preceded by a ‘\’ character.

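   Choosing another delimiter is mostly useful when the text contains
slashes, such as file paths. The sketch below compares the escaped form
with an alternative delimiter, and shows an escaped delimiter inside the
REGEXP; the sample strings are arbitrary.

```shell
path='/usr/local/bin'

# Default delimiter: every '/' inside the command must be escaped.
with_slash=$(echo "$path" | sed 's/\/usr\/local/\/opt/')

# '|' as delimiter: the slashes need no escaping.
with_pipe=$(echo "$path" | sed 's|/usr/local|/opt|')

# ',' as delimiter: a literal comma in the REGEXP is written '\,'.
with_comma=$(echo 'a,b' | sed 's,\,,;,')

printf '%s\n%s\n%s\n' "$with_slash" "$with_pipe" "$with_comma"
```
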
   Finally, as a GNU ‘sed’ extension, you can include a special sequence
made of a backslash and one of the letters ‘L’, ‘l’, ‘U’, ‘u’, or ‘E’.
The meaning is as follows:

‘\L’
     Turn the replacement to lowercase until a ‘\U’ or ‘\E’ is found.

‘\l’
     Turn the next character to lowercase.

‘\U’
     Turn the replacement to uppercase until a ‘\L’ or ‘\E’ is found.

‘\u’
     Turn the next character to uppercase.

‘\E’
     Stop case conversion started by ‘\L’ or ‘\U’.

   When the ‘g’ flag is being used, case conversion does not propagate
from one occurrence of the regular expression to another. For example,
when the following command is executed with ‘a-b-’ in pattern space:
     s/\(b\?\)-/x\u\1/g

the output is ‘axxB’. When replacing the first ‘-’, the ‘\u’ sequence
only affects the empty replacement of ‘\1’. It does not affect the ‘x’
character that is added to pattern space when replacing ‘b-’ with ‘xB’.

   On the other hand, ‘\l’ and ‘\u’ do affect the remainder of the
replacement text if they are followed by an empty substitution. With
‘a-b-’ in pattern space, the following command:
     s/\(b\?\)-/\u\1x/g

will replace ‘-’ with ‘X’ (uppercase) and ‘b-’ with ‘Bx’. If this
behavior is undesirable, you can prevent it by adding a ‘\E’
sequence (after ‘\1’ in this case).

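   The two ‘a-b-’ commands discussed above can be run directly, together
with a simple use of ‘\U’. This sketch assumes GNU ‘sed’, since the case
conversion sequences and ‘\?’ are GNU extensions; the third input line
is made up.

```shell
# '\u' placed before '\1': it only applies to '\1',
# which is empty for the first '-'.
first=$(echo 'a-b-' | sed 's/\(b\?\)-/x\u\1/g')

# '\u' followed by an empty '\1': it carries over to the following 'x'.
second=$(echo 'a-b-' | sed 's/\(b\?\)-/\u\1x/g')

# '\U' converts the rest of the replacement (here the whole match '&').
upper=$(echo 'hello world' | sed 's/world/\U&/')

printf '%s\n%s\n%s\n' "$first" "$second" "$upper"
```
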
681 | To include a literal â\â, â&â, or newline in the final replacement,
|
---|
682 | be sure to precede the desired â\â, â&â, or newline in the REPLACEMENT
|
---|
683 | with a â\â.
|
---|
684 |
|
---|
685 | The âsâ command can be followed by zero or more of the following
|
---|
686 | FLAGS:
|
---|
687 |
|
---|
688 | âgâ
|
---|
689 | Apply the replacement to _all_ matches to the REGEXP, not just the
|
---|
690 | first.
|
---|
691 |
|
---|
692 | âNUMBERâ
|
---|
693 | Only replace the NUMBERth match of the REGEXP.
|
---|
694 |
|
---|
695 | interaction in âsâ command Note: the POSIX standard does not
|
---|
696 | specify what should happen when you mix the âgâ and NUMBER
|
---|
697 | modifiers, and currently there is no widely agreed upon meaning
|
---|
698 | across âsedâ implementations. For GNU âsedâ, the interaction is
|
---|
699 | defined to be: ignore matches before the NUMBERth, and then match
|
---|
700 | and replace all matches from the NUMBERth on.
|
---|
701 |
|
---|
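A minimal sketch of the difference between a bare NUMBER flag and NUMBER combined with ‘g’, assuming GNU sed (other implementations may treat the combination differently); the variable names are illustrative.

```shell
# Assumes GNU sed for the combined NUMBER+g form.
only_second=$(echo aaaa | sed 's/a/b/2')     # replace just the 2nd match
from_second=$(echo aaaa | sed 's/a/b/2g')    # replace the 2nd match and all later ones
printf '%s\n' "$only_second" "$from_second"  # prints abaa, then abbb
```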
702 | âpâ
|
---|
703 | If the substitution was made, then print the new pattern space.
|
---|
704 |
|
---|
705 | Note: when both the ‘p’ and ‘e’ options are specified, the relative
|
---|
706 | ordering of the two produces very different results. In general,
|
---|
707 | ‘ep’ (evaluate then print) is what you want, but operating the
|
---|
708 | other way round can be useful for debugging. For this reason, the
|
---|
709 | current version of GNU ‘sed’ treats the presence of ‘p’ options
|
---|
710 | both before and after ‘e’ specially, printing the pattern space
|
---|
711 | before and after evaluation, while in general flags for the ‘s’
|
---|
712 | command show their effect just once. This behavior, although
|
---|
713 | documented, might change in future versions.
|
---|
714 |
|
---|
715 | âw FILENAMEâ
|
---|
716 | If the substitution was made, then write out the result to the
|
---|
717 | named file. As a GNU âsedâ extension, two special values of
|
---|
718 | FILENAME are supported: â/dev/stderrâ, which writes the result to
|
---|
719 | the standard error, and â/dev/stdoutâ, which writes to the standard
|
---|
720 | output.(1)
|
---|
721 |
|
---|
722 | âeâ
|
---|
723 | This flag allows one to pipe input from a shell command into
|
---|
724 | pattern space. If a substitution was made, the command that is
|
---|
725 | found in pattern space is executed and pattern space is replaced
|
---|
726 | with its output. A trailing newline is suppressed; results are
|
---|
727 | undefined if the command to be executed contains a NUL character.
|
---|
728 | This is a GNU âsedâ extension.
|
---|
729 |
|
---|
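A small sketch of the ‘e’ flag, assuming GNU sed and a shell that provides "expr"; the variable name is illustrative. Because the flag runs whatever ends up in pattern space as a shell command, it should only be used on trusted input.

```shell
# Assumes GNU sed and a POSIX shell providing expr.
# The substitution turns "3" into the command "expr 3 + 4"; the e flag
# then runs it and replaces pattern space with the command's output.
sum=$(echo 3 | sed 's/.*/expr & + 4/e')
printf '%s\n' "$sum"   # prints 7
```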
730 | âIâ
|
---|
731 | âiâ
|
---|
732 | The ‘I’ modifier to regular-expression matching is a GNU extension
|
---|
733 | which makes ‘sed’ match REGEXP in a case-insensitive manner.
|
---|
734 |
|
---|
735 | âMâ
|
---|
736 | âmâ
|
---|
737 | The ‘M’ modifier to regular-expression matching is a GNU ‘sed’
|
---|
738 | extension which directs GNU ‘sed’ to match the regular expression
|
---|
739 | in “multi-line” mode. The modifier causes ‘^’ and ‘$’ to match
|
---|
740 | respectively (in addition to the normal behavior) the empty string
|
---|
741 | after a newline, and the empty string before a newline. There are
|
---|
742 | special character sequences (‘\`’ and ‘\'’) which always match the
|
---|
743 | beginning or the end of the buffer. In addition, the period
|
---|
744 | character does not match a newline character in multi-line mode.
|
---|
745 |
|
---|
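The effect of the ‘M’ modifier only shows when pattern space holds more than one line. The sketch below assumes GNU sed and uses ‘N’ to join two input lines first; the variable names are illustrative.

```shell
# Assumes GNU sed. N joins the two input lines into one pattern space.
plain=$(printf 'a\nb\n' | sed 'N;s/^/>/g')    # ^ matches only at the start of the buffer
multi=$(printf 'a\nb\n' | sed 'N;s/^/>/Mg')   # ^ also matches after the embedded newline
printf '%s\n' "$plain" "$multi"
```

In the first result only the line ‘a’ gains a ‘>’ prefix; in the second, both lines do.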
746 | ---------- Footnotes ----------
|
---|
747 |
|
---|
748 | (1) This is equivalent to ‘p’ unless the ‘-i’ option is being used.
|
---|
749 |
|
---|
750 |
|
---|
751 | File: sed.info, Node: Common Commands, Next: Other Commands, Prev: The "s" Command, Up: sed scripts
|
---|
752 |
|
---|
753 | 3.4 Often-Used Commands
|
---|
754 | =======================
|
---|
755 |
|
---|
756 | If you use âsedâ at all, you will quite likely want to know these
|
---|
757 | commands.
|
---|
758 |
|
---|
759 | â#â
|
---|
760 | [No addresses allowed.]
|
---|
761 |
|
---|
762 | The â#â character begins a comment; the comment continues until the
|
---|
763 | next newline.
|
---|
764 |
|
---|
765 | If you are concerned about portability, be aware that some
|
---|
766 | implementations of âsedâ (which are not POSIX conforming) may only
|
---|
767 | support a single one-line comment, and then only when the very
|
---|
768 | first character of the script is a â#â.
|
---|
769 |
|
---|
770 | Warning: if the first two characters of the âsedâ script are â#nâ,
|
---|
771 | then the â-nâ (no-autoprint) option is forced. If you want to put
|
---|
772 | a comment in the first line of your script and that comment begins
|
---|
773 | with the letter ânâ and you do not want this behavior, then be sure
|
---|
774 | to either use a capital âNâ, or place at least one space before the
|
---|
775 | ânâ.
|
---|
776 |
|
---|
777 | âq [EXIT-CODE]â
|
---|
778 | Exit âsedâ without processing any more commands or input.
|
---|
779 |
|
---|
780 | Example: stop after printing the second line:
|
---|
781 | $ seq 3 | sed 2q
|
---|
782 | 1
|
---|
783 | 2
|
---|
784 |
|
---|
785 | This command accepts only one address. Note that the current
|
---|
786 | pattern space is printed if auto-print is not disabled with the
|
---|
787 | ‘-n’ option. The ability to return an exit code from the ‘sed’
|
---|
788 | script is a GNU âsedâ extension.
|
---|
789 |
|
---|
790 | See also the GNU âsedâ extension âQâ command which quits silently
|
---|
791 | without printing the current pattern space.
|
---|
792 |
|
---|
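The following sketch contrasts ‘q’ with an exit code against the silent ‘Q’ command. It assumes GNU sed, since both the exit code and ‘Q’ are GNU extensions; the variable names are illustrative.

```shell
# Assumes GNU sed (exit codes for q and the Q command are GNU extensions).
code=0
# Prints lines 1 and 2, then exits with status 5; "||" captures that status.
printed=$(seq 3 | sed '2q5') || code=$?
# Prints line 1 only; line 2 is not auto-printed before quitting.
silent=$(seq 3 | sed '2Q')
printf '%s\n' "$printed" "exit status: $code" "$silent"
```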
793 | âdâ
|
---|
794 | Delete the pattern space; immediately start next cycle.
|
---|
795 |
|
---|
796 | Example: delete the second input line:
|
---|
797 | $ seq 3 | sed 2d
|
---|
798 | 1
|
---|
799 | 3
|
---|
800 |
|
---|
801 | âpâ
|
---|
802 | Print out the pattern space (to the standard output). This command
|
---|
803 | is usually only used in conjunction with the â-nâ command-line
|
---|
804 | option.
|
---|
805 |
|
---|
806 | Example: print only the second input line:
|
---|
807 | $ seq 3 | sed -n 2p
|
---|
808 | 2
|
---|
809 |
|
---|
810 | ânâ
|
---|
811 | If auto-print is not disabled, print the pattern space, then,
|
---|
812 | regardless, replace the pattern space with the next line of input.
|
---|
813 | If there is no more input then âsedâ exits without processing any
|
---|
814 | more commands.
|
---|
815 |
|
---|
816 | This command is useful to skip lines (e.g. process every Nth
|
---|
817 | line).
|
---|
818 |
|
---|
819 | Example: perform substitution on every 3rd line (i.e. two ânâ
|
---|
820 | commands skip two lines):
|
---|
821 | $ seq 6 | sed 'n;n;s/./x/'
|
---|
822 | 1
|
---|
823 | 2
|
---|
824 | x
|
---|
825 | 4
|
---|
826 | 5
|
---|
827 | x
|
---|
828 |
|
---|
829 | GNU âsedâ provides an extension address syntax of FIRST~STEP to
|
---|
830 | achieve the same result:
|
---|
831 |
|
---|
832 | $ seq 6 | sed '0~3s/./x/'
|
---|
833 | 1
|
---|
834 | 2
|
---|
835 | x
|
---|
836 | 4
|
---|
837 | 5
|
---|
838 | x
|
---|
839 |
|
---|
840 | â{ COMMANDS }â
|
---|
841 | A group of commands may be enclosed between â{â and â}â characters.
|
---|
842 | This is particularly useful when you want a group of commands to be
|
---|
843 | triggered by a single address (or address-range) match.
|
---|
844 |
|
---|
845 | Example: perform substitution then print the second input line:
|
---|
846 | $ seq 3 | sed -n '2{s/2/X/ ; p}'
|
---|
847 | X
|
---|
848 |
|
---|
849 |
|
---|
850 | File: sed.info, Node: Other Commands, Next: Programming Commands, Prev: Common Commands, Up: sed scripts
|
---|
851 |
|
---|
852 | 3.5 Less Frequently-Used Commands
|
---|
853 | =================================
|
---|
854 |
|
---|
855 | Though perhaps less frequently used than those in the previous section,
|
---|
856 | some very small yet useful âsedâ scripts can be built with these
|
---|
857 | commands.
|
---|
858 |
|
---|
859 | ây/SOURCE-CHARS/DEST-CHARS/â
|
---|
860 | Transliterate any characters in the pattern space which match any
|
---|
861 | of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
|
---|
862 |
|
---|
863 | Example: transliterate âa-jâ into â0-9â:
|
---|
864 | $ echo hello world | sed 'y/abcdefghij/0123456789/'
|
---|
865 | 74llo worl3
|
---|
866 |
|
---|
867 | (The â/â characters may be uniformly replaced by any other single
|
---|
868 | character within any given âyâ command.)
|
---|
869 |
|
---|
870 | Instances of the â/â (or whatever other character is used in its
|
---|
871 | stead), â\â, or newlines can appear in the SOURCE-CHARS or
|
---|
872 | DEST-CHARS lists, provided that each instance is escaped by a ‘\’.
|
---|
873 | The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
|
---|
874 | number of characters (after de-escaping).
|
---|
875 |
|
---|
876 | See the âtrâ command from GNU coreutils for similar functionality.
|
---|
877 |
|
---|
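As an illustration of the escaping rule for ‘y’, the sketch below maps a slash to ‘|’ and a backslash to ‘-’. The input string and variable name are made up for the example.

```shell
# printf is used because some echo implementations mangle backslashes.
# SOURCE-CHARS is "\/\\" (a slash and a backslash, each escaped);
# DEST-CHARS is "|-", so both lists hold two characters after de-escaping.
result=$(printf '%s\n' 'a/b\c' | sed 'y/\/\\/|-/')
printf '%s\n' "$result"   # prints a|b-c
```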
878 | âa TEXTâ
|
---|
879 | Append TEXT after a line. This is a GNU extension to the
|
---|
880 | standard âaâ command - see below for details.
|
---|
881 |
|
---|
882 | Example: Add âhelloâ after the second line:
|
---|
883 | $ seq 3 | sed '2a hello'
|
---|
884 | 1
|
---|
885 | 2
|
---|
886 | hello
|
---|
887 | 3
|
---|
888 |
|
---|
889 | Leading whitespace after the âaâ command is ignored. The text to
|
---|
890 | add is read until the end of the line.
|
---|
891 |
|
---|
892 | âa\â
|
---|
893 | âTEXTâ
|
---|
894 | Append TEXT after a line.
|
---|
895 |
|
---|
896 | Example: Add ‘hello’ after the second line (⊣ indicates printed
|
---|
897 | output lines):
|
---|
898 | $ seq 3 | sed '2a\
|
---|
899 | hello'
|
---|
900 | ⊣1
|
---|
901 | ⊣2
|
---|
902 | ⊣hello
|
---|
903 | ⊣3
|
---|
904 |
|
---|
905 | The âaâ command queues the lines of text which follow this command
|
---|
906 | (each but the last ending with a â\â, which are removed from the
|
---|
907 | output) to be output at the end of the current cycle, or when the
|
---|
908 | next input line is read.
|
---|
909 |
|
---|
910 | As a GNU extension, this command accepts two addresses.
|
---|
911 |
|
---|
912 | Escape sequences in TEXT are processed, so you should use â\\â in
|
---|
913 | TEXT to print a single backslash.
|
---|
914 |
|
---|
915 | Commands resume after the last line of TEXT, the one without a trailing
|
---|
916 | backslash (‘\’), which is ‘world’ in the following example:
|
---|
917 | $ seq 3 | sed '2a\
|
---|
918 | hello\
|
---|
919 | world
|
---|
920 | 3s/./X/'
|
---|
921 | ⊣1
|
---|
922 | ⊣2
|
---|
923 | ⊣hello
|
---|
924 | ⊣world
|
---|
925 | ⊣X
|
---|
926 |
|
---|
927 | As a GNU extension, the âaâ command and TEXT can be separated into
|
---|
928 | two â-eâ parameters, enabling easier scripting:
|
---|
929 | $ seq 3 | sed -e '2a\' -e hello
|
---|
930 | 1
|
---|
931 | 2
|
---|
932 | hello
|
---|
933 | 3
|
---|
934 |
|
---|
935 | $ sed -e '2a\' -e "$VAR"
|
---|
936 |
|
---|
937 | âi TEXTâ
|
---|
938 | Insert TEXT before a line. This is a GNU extension to the standard
|
---|
939 | âiâ command - see below for details.
|
---|
940 |
|
---|
941 | Example: Insert âhelloâ before the second line:
|
---|
942 | $ seq 3 | sed '2i hello'
|
---|
943 | 1
|
---|
944 | hello
|
---|
945 | 2
|
---|
946 | 3
|
---|
947 |
|
---|
948 | Leading whitespace after the âiâ command is ignored. The text to
|
---|
949 | add is read until the end of the line.
|
---|
950 |
|
---|
951 | âi\â
|
---|
952 | âTEXTâ
|
---|
953 | Immediately output the lines of text which follow this command.
|
---|
954 |
|
---|
955 | Example: Insert ‘hello’ before the second line (⊣ indicates printed
|
---|
956 | output lines):
|
---|
957 | $ seq 3 | sed '2i\
|
---|
958 | hello'
|
---|
959 | ⊣1
|
---|
960 | ⊣hello
|
---|
961 | ⊣2
|
---|
962 | ⊣3
|
---|
963 |
|
---|
964 | As a GNU extension, this command accepts two addresses.
|
---|
965 |
|
---|
966 | Escape sequences in TEXT are processed, so you should use â\\â in
|
---|
967 | TEXT to print a single backslash.
|
---|
968 |
|
---|
969 | Commands resume after the last line of TEXT, the one without a trailing
|
---|
970 | backslash (‘\’), which is ‘world’ in the following example:
|
---|
971 | $ seq 3 | sed '2i\
|
---|
972 | hello\
|
---|
973 | world
|
---|
974 | s/./X/'
|
---|
975 | ⊣X
|
---|
976 | ⊣hello
|
---|
977 | ⊣world
|
---|
978 | ⊣X
|
---|
979 | ⊣X
|
---|
980 |
|
---|
981 | As a GNU extension, the âiâ command and TEXT can be separated into
|
---|
982 | two â-eâ parameters, enabling easier scripting:
|
---|
983 | $ seq 3 | sed -e '2i\' -e hello
|
---|
984 | 1
|
---|
985 | hello
|
---|
986 | 2
|
---|
987 | 3
|
---|
988 |
|
---|
989 | $ sed -e '2i\' -e "$VAR"
|
---|
990 |
|
---|
991 | âc TEXTâ
|
---|
992 | Replace the line(s) with TEXT. This is a GNU extension to the
|
---|
993 | standard âcâ command - see below for details.
|
---|
994 |
|
---|
995 | Example: Replace the 2nd to 9th lines with the word âhelloâ:
|
---|
996 | $ seq 10 | sed '2,9c hello'
|
---|
997 | 1
|
---|
998 | hello
|
---|
999 | 10
|
---|
1000 |
|
---|
1001 | Leading whitespace after the âcâ command is ignored. The text to
|
---|
1002 | add is read until the end of the line.
|
---|
1003 |
|
---|
1004 | âc\â
|
---|
1005 | âTEXTâ
|
---|
1006 | Delete the lines matching the address or address-range, and output
|
---|
1007 | the lines of text which follow this command.
|
---|
1008 |
|
---|
1009 | Example: Replace 2nd to 4th lines with the words ‘hello’ and
|
---|
1010 | ‘world’ (⊣ indicates printed output lines):
|
---|
1011 | $ seq 5 | sed '2,4c\
|
---|
1012 | hello\
|
---|
1013 | world'
|
---|
1014 | ⊣1
|
---|
1015 | ⊣hello
|
---|
1016 | ⊣world
|
---|
1017 | ⊣5
|
---|
1018 |
|
---|
1019 | If no addresses are given, each line is replaced.
|
---|
1020 |
|
---|
1021 | A new cycle is started after this command is done, since the
|
---|
1022 | pattern space will have been deleted. In the following example,
|
---|
1023 | the âcâ starts a new cycle and the substitution command is not
|
---|
1024 | performed on the replaced text:
|
---|
1025 |
|
---|
1026 | $ seq 3 | sed '2c\
|
---|
1027 | hello
|
---|
1028 | s/./X/'
|
---|
1029 | ⊣X
|
---|
1030 | ⊣hello
|
---|
1031 | ⊣X
|
---|
1032 |
|
---|
1033 | As a GNU extension, the âcâ command and TEXT can be separated into
|
---|
1034 | two â-eâ parameters, enabling easier scripting:
|
---|
1035 | $ seq 3 | sed -e '2c\' -e hello
|
---|
1036 | 1
|
---|
1037 | hello
|
---|
1038 | 3
|
---|
1039 |
|
---|
1040 | $ sed -e '2c\' -e "$VAR"
|
---|
1041 |
|
---|
1042 | â=â
|
---|
1043 | Print out the current input line number (with a trailing newline).
|
---|
1044 |
|
---|
1045 | $ printf '%s\n' aaa bbb ccc | sed =
|
---|
1046 | 1
|
---|
1047 | aaa
|
---|
1048 | 2
|
---|
1049 | bbb
|
---|
1050 | 3
|
---|
1051 | ccc
|
---|
1052 |
|
---|
1053 | As a GNU extension, this command accepts two addresses.
|
---|
1054 |
|
---|
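A common use of ‘=’ is counting input lines. The sketch below combines it with the ‘$’ (last line) address and ‘-n’; the variable name is illustrative.

```shell
# -n suppresses auto-print, and the $ address restricts = to the last line,
# so only the final line number (the line count) is printed.
count=$(seq 5 | sed -n '$=')
printf '%s\n' "$count"   # prints 5
```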
1055 | âl Nâ
|
---|
1056 | Print the pattern space in an unambiguous form: non-printable
|
---|
1057 | characters (and the â\â character) are printed in C-style escaped
|
---|
1058 | form; long lines are split, with a trailing â\â character to
|
---|
1059 | indicate the split; the end of each line is marked with a â$â.
|
---|
1060 |
|
---|
1061 | N specifies the desired line-wrap length; a length of 0 (zero)
|
---|
1062 | means to never wrap long lines. If omitted, the default as
|
---|
1063 | specified on the command line is used. The N parameter is a GNU
|
---|
1064 | âsedâ extension.
|
---|
1065 |
|
---|
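To make the unambiguous form concrete, here is a small sketch that feeds ‘l’ a line containing a tab and a backslash. The input and the variable name are made up for the example.

```shell
# Input is: a, TAB, b, backslash. -n suppresses the normal auto-print.
shown=$(printf 'a\tb\\\n' | sed -n l)
printf '%s\n' "$shown"   # prints a\tb\\$
```

The tab appears as ‘\t’, the backslash is doubled, and ‘$’ marks the end of the line.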
1066 | âr FILENAMEâ
|
---|
1067 |
|
---|
1068 | Reads file FILENAME. Example:
|
---|
1069 |
|
---|
1070 | $ seq 3 | sed '2r/etc/hostname'
|
---|
1071 | 1
|
---|
1072 | 2
|
---|
1073 | fencepost.gnu.org
|
---|
1074 | 3
|
---|
1075 |
|
---|
1076 | Queue the contents of FILENAME to be read and inserted into the
|
---|
1077 | output stream at the end of the current cycle, or when the next
|
---|
1078 | input line is read. Note that if FILENAME cannot be read, it is
|
---|
1079 | treated as if it were an empty file, without any error indication.
|
---|
1080 |
|
---|
1081 | As a GNU âsedâ extension, the special value â/dev/stdinâ is
|
---|
1082 | supported for the file name, which reads the contents of the
|
---|
1083 | standard input.
|
---|
1084 |
|
---|
1085 | As a GNU extension, this command accepts two addresses. The file
|
---|
1086 | will then be reread and inserted on each of the addressed lines.
|
---|
1087 |
|
---|
1088 | As a GNU âsedâ extension, the ârâ command accepts a zero address,
|
---|
1089 | inserting a file _before_ the first line of the input *note Adding
|
---|
1090 | a header to multiple files::.
|
---|
1091 |
|
---|
1092 | âw FILENAMEâ
|
---|
1093 | Write the pattern space to FILENAME. As a GNU âsedâ extension, two
|
---|
1094 | special values of FILENAME are supported: â/dev/stderrâ, which
|
---|
1095 | writes the result to the standard error, and â/dev/stdoutâ, which
|
---|
1096 | writes to the standard output.(1)
|
---|
1097 |
|
---|
1098 | The file will be created (or truncated) before the first input line
|
---|
1099 | is read; all âwâ commands (including instances of the âwâ flag on
|
---|
1100 | successful âsâ commands) which refer to the same FILENAME are
|
---|
1101 | output without closing and reopening the file.
|
---|
1102 |
|
---|
1103 | âDâ
|
---|
1104 | If pattern space contains no newline, start a normal new cycle as
|
---|
1105 | if the âdâ command was issued. Otherwise, delete text in the
|
---|
1106 | pattern space up to the first newline, and restart cycle with the
|
---|
1107 | resultant pattern space, without reading a new line of input.
|
---|
1108 |
|
---|
1109 | âNâ
|
---|
1110 | Add a newline to the pattern space, then append the next line of
|
---|
1111 | input to the pattern space. If there is no more input then âsedâ
|
---|
1112 | exits without processing any more commands.
|
---|
1113 |
|
---|
1114 | When ‘-z’ is used, a zero byte (the ASCII ‘NUL’ character) is added
|
---|
1115 | between the lines (instead of a newline).
|
---|
1116 |
|
---|
1117 | If there is no next input line, GNU ‘sed’ by default prints the pattern
|
---|
1118 | space before exiting instead of discarding it. This is a GNU extension
|
---|
1119 | which can be disabled with ‘--posix’. *Note N command on the last line: N_command_last_line.
|
---|
1120 |
|
---|
1121 | âPâ
|
---|
1122 | Print out the portion of the pattern space up to the first newline.
|
---|
1123 |
|
---|
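The ‘N’, ‘P’ and ‘D’ commands are typically used together to slide a two-line window over the input. As a sketch, the well-known one-liner below collapses runs of identical adjacent lines, much like "uniq"; the sample input and variable name are illustrative.

```shell
# Collapse runs of identical adjacent lines, similar to uniq.
#   $!N  append the next line (except on the last line)
#   P    print the first line of the pair unless both lines are identical
#   D    drop the first line and restart the cycle with the remainder
deduped=$(printf 'a\na\nb\n' | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$deduped"
```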
1124 | âhâ
|
---|
1125 | Replace the contents of the hold space with the contents of the
|
---|
1126 | pattern space.
|
---|
1127 |
|
---|
1128 | âHâ
|
---|
1129 | Append a newline to the contents of the hold space, and then append
|
---|
1130 | the contents of the pattern space to that of the hold space.
|
---|
1131 |
|
---|
1132 | âgâ
|
---|
1133 | Replace the contents of the pattern space with the contents of the
|
---|
1134 | hold space.
|
---|
1135 |
|
---|
1136 | âGâ
|
---|
1137 | Append a newline to the contents of the pattern space, and then
|
---|
1138 | append the contents of the hold space to that of the pattern space.
|
---|
1139 |
|
---|
1140 | âxâ
|
---|
1141 | Exchange the contents of the hold and pattern spaces.
|
---|
1142 |
|
---|
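As a sketch of how the hold space commands combine, the following one-liner reverses the order of its input lines by accumulating them in the hold space. The variable name is illustrative.

```shell
# Reverse the order of lines using the hold space.
#   1!G  on every line but the first, append the hold space to pattern space
#   h    copy the accumulated pattern space into the hold space
#   $p   on the last line, print the result (-n suppresses everything else)
reversed=$(seq 3 | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```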
1143 | ---------- Footnotes ----------
|
---|
1144 |
|
---|
1145 | (1) This is equivalent to ‘p’ unless the ‘-i’ option is being used.
|
---|
1146 |
|
---|
1147 |
|
---|
1148 | File: sed.info, Node: Programming Commands, Next: Extended Commands, Prev: Other Commands, Up: sed scripts
|
---|
1149 |
|
---|
1150 | 3.6 Commands for ‘sed’ gurus
|
---|
1151 | ============================
|
---|
1152 |
|
---|
1153 | In most cases, use of these commands indicates that you are probably
|
---|
1154 | better off programming in something like âawkâ or Perl. But
|
---|
1155 | occasionally one is committed to sticking with âsedâ, and these commands
|
---|
1156 | can enable one to write quite convoluted scripts.
|
---|
1157 |
|
---|
1158 | â: LABELâ
|
---|
1159 | [No addresses allowed.]
|
---|
1160 |
|
---|
1161 | Specify the location of LABEL for branch commands. In all other
|
---|
1162 | respects, a no-op.
|
---|
1163 |
|
---|
1164 | âb LABELâ
|
---|
1165 | Unconditionally branch to LABEL. The LABEL may be omitted, in
|
---|
1166 | which case the next cycle is started.
|
---|
1167 |
|
---|
1168 | ât LABELâ
|
---|
1169 | Branch to LABEL only if there has been a successful ‘s’ substitution
|
---|
1170 | since the last input line was read or conditional branch was taken.
|
---|
1171 | The LABEL may be omitted, in which case the next cycle is started.
|
---|
1172 |
|
---|
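A label plus a conditional branch is enough to build a loop. The sketch below repeats a substitution until it no longer matches, squeezing runs of spaces down to a single space; the label name ‘a’ and the variable name are arbitrary.

```shell
# Squeeze runs of spaces down to one space with a t loop.
# Separate -e chunks keep the label portable across sed implementations.
squeezed=$(echo 'a  b   c' | sed -e ':a' -e 's/  / /' -e 'ta')
printf '%s\n' "$squeezed"   # prints a b c
```

Each successful substitution makes ‘t’ jump back to the label; once no double space remains, the substitution fails and the cycle ends normally.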
1173 |
|
---|
1174 | File: sed.info, Node: Extended Commands, Next: Multiple commands syntax, Prev: Programming Commands, Up: sed scripts
|
---|
1175 |
|
---|
1176 | 3.7 Commands Specific to GNU ‘sed’
|
---|
1177 | ==================================
|
---|
1178 |
|
---|
1179 | These commands are specific to GNU âsedâ, so you must use them with care
|
---|
1180 | and only when you are sure that hindering portability is not evil. They
|
---|
1181 | allow you to check for GNU âsedâ extensions or to do tasks that are
|
---|
1182 | required quite often, yet are unsupported by standard âsedâs.
|
---|
1183 |
|
---|
1184 | âe [COMMAND]â
|
---|
1185 | This command allows one to pipe input from a shell command into
|
---|
1186 | pattern space. Without parameters, the âeâ command executes the
|
---|
1187 | command that is found in pattern space and replaces the pattern
|
---|
1188 | space with the output; a trailing newline is suppressed.
|
---|
1189 |
|
---|
1190 | If a parameter is specified, instead, the âeâ command interprets it
|
---|
1191 | as a command and sends its output to the output stream. The
|
---|
1192 | command can run across multiple lines, all but the last ending with
|
---|
1193 | a back-slash.
|
---|
1194 |
|
---|
1195 | In both cases, the results are undefined if the command to be
|
---|
1196 | executed contains a NUL character.
|
---|
1197 |
|
---|
1198 | Note that, unlike the ârâ command, the output of the command will
|
---|
1199 | be printed immediately; the ârâ command instead delays the output
|
---|
1200 | to the end of the current cycle.
|
---|
1201 |
|
---|
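A minimal sketch of the ‘e COMMAND’ form, assuming GNU sed; the variable name is illustrative. The command's output reaches the output stream before the current line is auto-printed.

```shell
# Assumes GNU sed. "echo hi" runs on line 1 and its output is emitted
# immediately, ahead of the auto-printed input line "x".
out=$(echo x | sed '1e echo hi')
printf '%s\n' "$out"
```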
1202 | âFâ
|
---|
1203 | Print out the file name of the current input file (with a trailing
|
---|
1204 | newline).
|
---|
1205 |
|
---|
1206 | âQ [EXIT-CODE]â
|
---|
1207 | This command accepts only one address.
|
---|
1208 |
|
---|
1209 | This command is the same as âqâ, but will not print the contents of
|
---|
1210 | pattern space. Like âqâ, it provides the ability to return an exit
|
---|
1211 | code to the caller.
|
---|
1212 |
|
---|
1213 | This command can be useful because the only alternative ways to
|
---|
1214 | accomplish this apparently trivial function are to use the â-nâ
|
---|
1215 | option (which can unnecessarily complicate your script) or
|
---|
1216 | resorting to the following snippet, which wastes time by reading
|
---|
1217 | the whole file without any visible effect:
|
---|
1218 |
|
---|
1219 | :eat
|
---|
1220 | $d # Quit silently on the last line
|
---|
1221 | N # Read another line, silently
|
---|
1222 | g # Overwrite pattern space each time to save memory
|
---|
1223 | b eat
|
---|
1224 |
|
---|
1225 | âR FILENAMEâ
|
---|
1226 | Queue a line of FILENAME to be read and inserted into the output
|
---|
1227 | stream at the end of the current cycle, or when the next input line
|
---|
1228 | is read. Note that if FILENAME cannot be read, or if its end is
|
---|
1229 | reached, no line is appended, without any error indication.
|
---|
1230 |
|
---|
1231 | As with the ârâ command, the special value â/dev/stdinâ is
|
---|
1232 | supported for the file name, which reads a line from the standard
|
---|
1233 | input.
|
---|
1234 |
|
---|
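Because ‘R’ consumes one line of FILENAME per invocation, it can interleave two inputs. The sketch below assumes GNU sed and "mktemp"; the temporary file stands in for any second input and the variable names are illustrative.

```shell
# Assumes GNU sed and mktemp.
tmp=$(mktemp)
printf 'a\nb\n' > "$tmp"
# Each cycle queues one further line of the file after the current input
# line; once the file is exhausted, nothing more is appended.
merged=$(seq 3 | sed "R $tmp")
rm -f "$tmp"
printf '%s\n' "$merged"
```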
1235 | âT LABELâ
|
---|
1236 | Branch to LABEL only if there have been no successful
|
---|
1237 | ‘s’ substitutions since the last input line was read or conditional
|
---|
1238 | branch was taken. The LABEL may be omitted, in which case the next
|
---|
1239 | cycle is started.
|
---|
1240 |
|
---|
1241 | âv VERSIONâ
|
---|
1242 | This command does nothing, but makes âsedâ fail if GNU âsedâ
|
---|
1243 | extensions are not supported, simply because other versions of
|
---|
1244 | âsedâ do not implement it. In addition, you can specify the
|
---|
1245 | version of âsedâ that your script requires, such as â4.0.5â. The
|
---|
1246 | default is â4.0â because that is the first version that implemented
|
---|
1247 | this command.
|
---|
1248 |
|
---|
1249 | This command enables all GNU extensions even if âPOSIXLY_CORRECTâ
|
---|
1250 | is set in the environment.
|
---|
1251 |
|
---|
1252 | âW FILENAMEâ
|
---|
1253 | Write to the given filename the portion of the pattern space up to
|
---|
1254 | the first newline. Everything said under the âwâ command about
|
---|
1255 | file handling holds here too.
|
---|
1256 |
|
---|
1257 | âzâ
|
---|
1258 | This command empties the content of pattern space. It is usually
|
---|
1259 | the same as âs/.*//â, but is more efficient and works in the
|
---|
1260 | presence of invalid multibyte sequences in the input stream. POSIX
|
---|
1261 | mandates that such sequences are _not_ matched by â.â, so that
|
---|
1262 | there is no portable way to clear the ‘sed’ buffers in the middle of
|
---|
1263 | the script in most multibyte locales (including UTF-8 locales).
|
---|
1264 |
|
---|
1265 |
|
---|
1266 | File: sed.info, Node: Multiple commands syntax, Prev: Extended Commands, Up: sed scripts
|
---|
1267 |
|
---|
1268 | 3.8 Multiple commands syntax
|
---|
1269 | ============================
|
---|
1270 |
|
---|
1271 | There are several methods to specify multiple commands in a âsedâ
|
---|
1272 | program.
|
---|
1273 |
|
---|
1274 | Using newlines is most natural when running a sed script from a file
|
---|
1275 | (using the â-fâ option).
|
---|
1276 |
|
---|
1277 | On the command line, all âsedâ commands may be separated by newlines.
|
---|
1278 | Alternatively, you may specify each command as an argument to an â-eâ
|
---|
1279 | option:
|
---|
1280 |
|
---|
1281 | $ seq 6 | sed '1d
|
---|
1282 | 3d
|
---|
1283 | 5d'
|
---|
1284 | 2
|
---|
1285 | 4
|
---|
1286 | 6
|
---|
1287 |
|
---|
1288 | $ seq 6 | sed -e 1d -e 3d -e 5d
|
---|
1289 | 2
|
---|
1290 | 4
|
---|
1291 | 6
|
---|
1292 |
|
---|
1293 | A semicolon (â;â) may be used to separate most simple commands:
|
---|
1294 |
|
---|
1295 | $ seq 6 | sed '1d;3d;5d'
|
---|
1296 | 2
|
---|
1297 | 4
|
---|
1298 | 6
|
---|
1299 |
|
---|
1300 | The â{â,â}â,âbâ,âtâ,âTâ,â:â commands can be separated with a
|
---|
1301 | semicolon (this is a non-portable GNU âsedâ extension).
|
---|
1302 |
|
---|
1303 | $ seq 4 | sed '{1d;3d}'
|
---|
1304 | 2
|
---|
1305 | 4
|
---|
1306 |
|
---|
1307 | $ seq 6 | sed '{1d;3d};5d'
|
---|
1308 | 2
|
---|
1309 | 4
|
---|
1310 | 6
|
---|
1311 |
|
---|
1312 | Labels used in âbâ,âtâ,âTâ,â:â commands are read until a semicolon.
|
---|
1313 | Leading and trailing whitespace is ignored. In the examples below the
|
---|
1314 | label is âxâ. The first example works with GNU âsedâ. The second is a
|
---|
1315 | portable equivalent. For more information about branching and labels
|
---|
1316 | *note Branching and flow control::.
|
---|
1317 |
|
---|
1318 | $ seq 3 | sed '/1/b x ; s/^/=/ ; :x ; 3d'
|
---|
1319 | 1
|
---|
1320 | =2
|
---|
1321 |
|
---|
1322 | $ seq 3 | sed -e '/1/bx' -e 's/^/=/' -e ':x' -e '3d'
|
---|
1323 | 1
|
---|
1324 | =2
|
---|
1325 |
|
---|
1326 | 3.8.1 Commands Requiring a newline
|
---|
1327 | ----------------------------------
|
---|
1328 |
|
---|
1329 | The following commands cannot be separated by a semicolon and require a
|
---|
1330 | newline:
|
---|
1331 |
|
---|
1332 | âaâ,âcâ,âiâ (append/change/insert)
|
---|
1333 |
|
---|
1334 | All characters following âaâ,âcâ,âiâ commands are taken as the text
|
---|
1335 | to append/change/insert. Using a semicolon leads to undesirable
|
---|
1336 | results:
|
---|
1337 |
|
---|
1338 | $ seq 2 | sed '1aHello ; 2d'
|
---|
1339 | 1
|
---|
1340 | Hello ; 2d
|
---|
1341 | 2
|
---|
1342 |
|
---|
1343 | Separate the commands using â-eâ or a newline:
|
---|
1344 |
|
---|
1345 | $ seq 2 | sed -e 1aHello -e 2d
|
---|
1346 | 1
|
---|
1347 | Hello
|
---|
1348 |
|
---|
1349 | $ seq 2 | sed '1aHello
|
---|
1350 | 2d'
|
---|
1351 | 1
|
---|
1352 | Hello
|
---|
1353 |
|
---|
1354 | Note that specifying the text to add (âHelloâ) immediately after
|
---|
1355 | âaâ,âcâ,âiâ is itself a GNU âsedâ extension. A portable,
|
---|
1356 | POSIX-compliant alternative is:
|
---|
1357 |
|
---|
1358 | $ seq 2 | sed '1a\
|
---|
1359 | Hello
|
---|
1360 | 2d'
|
---|
1361 | 1
|
---|
1362 | Hello
|
---|
1363 |
|
---|
1364 | â#â (comment)
|
---|
1365 |
|
---|
1366 | All characters following â#â until the next newline are ignored.
|
---|
1367 |
|
---|
1368 | $ seq 3 | sed '# this is a comment ; 2d'
|
---|
1369 | 1
|
---|
1370 | 2
|
---|
1371 | 3
|
---|
1372 |
|
---|
1373 |
|
---|
1374 | $ seq 3 | sed '# this is a comment
|
---|
1375 | 2d'
|
---|
1376 | 1
|
---|
1377 | 3
|
---|
1378 |
|
---|
1379 | ârâ,âRâ,âwâ,âWâ (reading and writing files)
|
---|
1380 |
|
---|
1381 | The ‘r’,‘R’,‘w’,‘W’ commands parse the filename until the end of the
|
---|
1382 | line. If whitespace, comments or semicolons are found, they will
|
---|
1383 | be included in the filename, leading to unexpected results:
|
---|
1384 |
|
---|
1385 | $ seq 2 | sed '1w hello.txt ; 2d'
|
---|
1386 | 1
|
---|
1387 | 2
|
---|
1388 |
|
---|
1389 | $ ls -log
|
---|
1390 | total 4
|
---|
1391 | -rw-rw-r-- 1 2 Jan 23 23:03 hello.txt ; 2d
|
---|
1392 |
|
---|
1393 | $ cat 'hello.txt ; 2d'
|
---|
1394 | 1
|
---|
1395 |
|
---|
1396 | Note that âsedâ silently ignores read/write errors in
|
---|
1397 | ârâ,âRâ,âwâ,âWâ commands (such as missing files). In the following
|
---|
1398 | example, ‘sed’ tries to read a file named ‘hello.txt ; N’. The
|
---|
1399 | file is missing, and the error is silently ignored:
|
---|
1400 |
|
---|
1401 | $ echo x | sed '1rhello.txt ; N'
|
---|
1402 | x
|
---|
1403 |
|
---|
1404 | âeâ (command execution)
|
---|
1405 |
|
---|
1406 | Any characters following the âeâ command until the end of the line
|
---|
1407 | will be sent to the shell. If whitespace, comments or semicolons
|
---|
1408 | are found, they will be included in the shell command, leading to
|
---|
1409 | unexpected results:
|
---|
1410 |
|
---|
1411 | $ echo a | sed '1e touch foo#bar'
|
---|
1412 | a
|
---|
1413 |
|
---|
1414 | $ ls -1
|
---|
1415 | foo#bar
|
---|
1416 |
|
---|
1417 | $ echo a | sed '1e touch foo ; s/a/b/'
|
---|
1418 | sh: 1: s/a/b/: not found
|
---|
1419 | a
|
---|
1420 |
|
---|
1421 | âs///[we]â (substitute with âeâ or âwâ flags)
|
---|
1422 |
|
---|
1423 | In a substitution command, the âwâ flag writes the substitution
|
---|
1424 | result to a file, and the âeâ flag executes the substitution result
|
---|
1425 | as a shell command. As with the âr/R/w/W/eâ commands, these must
|
---|
1426 | be terminated with a newline. If whitespace, comments or
|
---|
1427 | semicolons are found, they will be included in the shell command or
|
---|
1428 | filename, leading to unexpected results:
|
---|
1429 |
|
---|
1430 | $ echo a | sed 's/a/b/w1.txt#foo'
|
---|
1431 | b
|
---|
1432 |
|
---|
1433 | $ ls -1
|
---|
1434 | 1.txt#foo
|
---|
1435 |
|
---|
1436 |
|
---|
1437 | File: sed.info, Node: sed addresses, Next: sed regular expressions, Prev: sed scripts, Up: Top
|
---|
1438 |
|
---|
1439 | 4 Addresses: selecting lines
|
---|
1440 | ****************************
|
---|
1441 |
|
---|
1442 | * Menu:
|
---|
1443 |
|
---|
1444 | * Addresses overview:: Addresses overview
|
---|
1445 | * Numeric Addresses:: selecting lines by numbers
|
---|
1446 | * Regexp Addresses:: selecting lines by text matching
|
---|
1447 | * Range Addresses:: selecting a range of lines
|
---|
1448 | * Zero Address:: Using address ‘0’
|
---|
1449 |
|
---|
1450 |
|
---|
1451 | File: sed.info, Node: Addresses overview, Next: Numeric Addresses, Up: sed addresses
|
---|
1452 |
|
---|
1453 | 4.1 Addresses overview
|
---|
1454 | ======================
|
---|
1455 |
|
---|
1456 | Addresses determine on which line(s) the âsedâ command will be executed.
|
---|
1457 | The following command replaces any first occurrence of âhelloâ with
|
---|
1458 | âworldâ only on line 144:
|
---|
1459 |
|
---|
1460 | sed '144s/hello/world/' input.txt > output.txt
|
---|
1461 |
|
---|
1462 | If no address is specified, the command is performed on all lines.
|
---|
1463 | The following command replaces âhelloâ with âworldâ, targeting every
|
---|
1464 | line of the input file. However, note that it modifies only the first
|
---|
1465 | instance of âhelloâ on each line. Use the âgâ modifier to affect every
|
---|
1466 | instance on each affected line.
|
---|
1467 |
|
---|
1468 | sed 's/hello/world/' input.txt > output.txt
|
---|
1469 |
|
---|
1470 | Addresses can contain regular expressions to match lines based on
|
---|
1471 | content instead of line numbers. The following command replaces âhelloâ
|
---|
1472 | with âworldâ only on lines containing the string âappleâ:
|
---|
1473 |
|
---|
1474 | sed '/apple/s/hello/world/' input.txt > output.txt
|
---|
1475 |
|
---|
1476 | An address range is specified with two addresses separated by a comma
|
---|
1477 | (â,â). Addresses can be numeric, regular expressions, or a mix of both.
|
---|
1478 | The following command replaces âhelloâ with âworldâ only on lines 4 to
|
---|
1479 | 17 (inclusive):
|
---|
1480 |
|
---|
1481 | sed '4,17s/hello/world/' input.txt > output.txt
|
---|
1482 |
|
---|
1483 | Appending the ‘!’ character to the end of an address specification
|
---|
1484 | (before the command letter) negates the sense of the match. That is, if
|
---|
1485 | the ‘!’ character follows an address or an address range, then only
|
---|
1486 | lines which do _not_ match the addresses will be selected. The
|
---|
1487 | following command replaces ‘hello’ with ‘world’ only on lines _not_
|
---|
1488 | containing the string ‘apple’:
|
---|
1489 |
|
---|
1490 | sed '/apple/!s/hello/world/' input.txt > output.txt
|
---|
1491 |
|
---|
1492 | The following command replaces âhelloâ with âworldâ only on lines 1
|
---|
1493 | to 3 and from line 18 to the last line of the input file (i.e.
|
---|
1494 | excluding lines 4 to 17):
|
---|
1495 |
|
---|
1496 | sed '4,17!s/hello/world/' input.txt > output.txt
|
---|
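The address forms from this overview can be compared side by side on a small input. This is only a sketch: the five sample lines and the variable names are made up for illustration.

```shell
# Five input lines; the word "hello" appears on every line.
input=$(printf 'hello %s\n' one apple three four five)

by_number=$(printf '%s\n' "$input" | sed '2s/hello/world/')       # line 2 only
by_regex=$(printf '%s\n' "$input" | sed '/apple/s/hello/world/')  # lines containing "apple"
by_range=$(printf '%s\n' "$input" | sed '2,4s/hello/world/')      # lines 2 to 4
negated=$(printf '%s\n' "$input" | sed '2,4!s/hello/world/')      # every line except 2 to 4

# Show the negated case: only the first and last lines change.
printf '%s\n' "$negated"
```

In this input ‘apple’ happens to sit on line 2, so the numeric and the regexp address select the same line.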
1497 |
|
---|
1498 |
|
---|
1499 | File: sed.info, Node: Numeric Addresses, Next: Regexp Addresses, Prev: Addresses overview, Up: sed addresses
|
---|
1500 |
|
---|
1501 | 4.2 Selecting lines by numbers
|
---|
1502 | ==============================
|
---|
1503 |
|
---|
1504 | Addresses in a âsedâ script can be in any of the following forms:
|
---|
1505 | âNUMBERâ
|
---|
1506 | Specifying a line number will match only that line in the input.
|
---|
1507 | (Note that âsedâ counts lines continuously across all input files
|
---|
1508 | unless â-iâ or â-sâ options are specified.)
|
---|
1509 |
|
---|
1510 | â$â
|
---|
1511 | This address matches the last line of the last file of input, or
|
---|
1512 | the last line of each file when the â-iâ or â-sâ options are
|
---|
1513 | specified.
|
---|
1514 |
|
---|
1515 | âFIRST~STEPâ
|
---|
1516 | This GNU extension matches every STEPth line starting with line
|
---|
1517 | FIRST. In particular, lines will be selected when there exists a
|
---|
1518 | non-negative N such that the current line-number equals FIRST + (N
|
---|
1519 | * STEP). Thus, one would use â1~2â to select the odd-numbered
|
---|
1520 | lines and â0~2â for even-numbered lines; to pick every third line
|
---|
1521 | starting with the second, â2~3â would be used; to pick every fifth
|
---|
1522 | line starting with the tenth, use â10~5â; and â50~0â is just an
|
---|
1523 | obscure way of saying â50â.
|
---|
1524 |
|
---|
1525 | The following commands demonstrate the step address usage:
|
---|
1526 |
|
---|
1527 | $ seq 10 | sed -n '0~4p'
|
---|
1528 | 4
|
---|
1529 | 8
|
---|
1530 |
|
---|
1531 | $ seq 10 | sed -n '1~3p'
|
---|
1532 | 1
|
---|
1533 | 4
|
---|
1534 | 7
|
---|
1535 | 10
|
---|
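The effect of ‘-s’ on line numbers and on ‘$’ can be seen with two small files. This is a sketch assuming GNU ‘sed’; the file names ‘a.txt’ and ‘b.txt’ are illustrative.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' a1 a2 > "$tmpdir/a.txt"
printf '%s\n' b1 b2 > "$tmpdir/b.txt"

# Without -s the two files form one stream of four lines.
joined_last=$(sed -n '$p' "$tmpdir/a.txt" "$tmpdir/b.txt")
joined_third=$(sed -n '3p' "$tmpdir/a.txt" "$tmpdir/b.txt")

# With -s, addresses are evaluated separately for each file.
separate_last=$(sed -s -n '$p' "$tmpdir/a.txt" "$tmpdir/b.txt")

printf '%s\n' "$joined_last" "$joined_third" "$separate_last"
rm -r "$tmpdir"
```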
1536 |
|
---|
1537 |
|
---|
1538 | File: sed.info, Node: Regexp Addresses, Next: Range Addresses, Prev: Numeric Addresses, Up: sed addresses
|
---|
1539 |
|
---|
1540 | 4.3 Selecting lines by text matching
|
---|
1541 | ====================================
|
---|
1542 |
|
---|
1543 | GNU ‘sed’ supports the following regular expression addresses. The
|
---|
1544 | default regular expression is *note Basic Regular Expression (BRE): BRE
|
---|
1545 | syntax. If the ‘-E’ or ‘-r’ option is used, the regular expression should
|
---|
1546 | be in *note Extended Regular Expression (ERE): ERE syntax.
|
---|
1547 | *Note BRE vs ERE::.
|
---|
1548 |
|
---|
1549 | â/REGEXP/â
|
---|
1550 | This will select any line which matches the regular expression
|
---|
1551 | REGEXP. If REGEXP itself includes any â/â characters, each must be
|
---|
1552 | escaped by a backslash (â\â).
|
---|
1553 |
|
---|
1554 | The following command prints lines in â/etc/passwdâ which end with
|
---|
1555 | âbashâ(1):
|
---|
1556 |
|
---|
1557 | sed -n '/bash$/p' /etc/passwd
|
---|
1558 |
|
---|
1559 | The empty regular expression â//â repeats the last regular
|
---|
1560 | expression match (the same holds if the empty regular expression is
|
---|
1561 | passed to the âsâ command). Note that modifiers to regular
|
---|
1562 | expressions are evaluated when the regular expression is compiled,
|
---|
1563 | thus it is invalid to specify them together with the empty regular
|
---|
1564 | expression.
|
---|
1565 |
|
---|
1566 | â\%REGEXP%â
|
---|
1567 | (The â%â may be replaced by any other single character.)
|
---|
1568 |
|
---|
1569 | This also matches the regular expression REGEXP, but allows one to
|
---|
1570 | use a different delimiter than â/â. This is particularly useful if
|
---|
1571 | the REGEXP itself contains a lot of slashes, since it avoids the
|
---|
1572 | tedious escaping of every â/â. If REGEXP itself includes any
|
---|
1573 | delimiter characters, each must be escaped by a backslash (â\â).
|
---|
1574 |
|
---|
1575 | The following commands are equivalent. They print lines which
|
---|
1576 | start with â/home/alice/documents/â:
|
---|
1577 |
|
---|
1578 | sed -n '/^\/home\/alice\/documents\//p'
|
---|
1579 | sed -n '\%^/home/alice/documents/%p'
|
---|
1580 | sed -n '\;^/home/alice/documents/;p'
|
---|
1581 |
|
---|
1582 | â/REGEXP/Iâ
|
---|
1583 | â\%REGEXP%Iâ
|
---|
1584 | The âIâ modifier to regular-expression matching is a GNU extension
|
---|
1585 | which causes the REGEXP to be matched in a case-insensitive manner.
|
---|
1586 |
|
---|
1587 | In many other programming languages, a lower case âiâ is used for
|
---|
1588 | case-insensitive regular expression matching. However, in âsedâ
|
---|
1589 | the âiâ is used for the insert command (*note insert command::).
|
---|
1590 |
|
---|
1591 | Observe the difference between the following examples.
|
---|
1592 |
|
---|
1593 | In this example, â/b/Iâ is the address: regular expression with âIâ
|
---|
1594 | modifier. âdâ is the delete command:
|
---|
1595 |
|
---|
1596 | $ printf "%s\n" a b c | sed '/b/Id'
|
---|
1597 | a
|
---|
1598 | c
|
---|
1599 |
|
---|
1600 | Here, â/b/â is the address: a regular expression. âiâ is the
|
---|
1601 | insert command. âdâ is the value to insert. A line with âdâ is
|
---|
1602 | then inserted above the matched line:
|
---|
1603 |
|
---|
1604 | $ printf "%s\n" a b c | sed '/b/id'
|
---|
1605 | a
|
---|
1606 | d
|
---|
1607 | b
|
---|
1608 | c
|
---|
1609 |
|
---|
1610 | â/REGEXP/Mâ
|
---|
1611 | â\%REGEXP%Mâ
|
---|
1612 | The âMâ modifier to regular-expression matching is a GNU âsedâ
|
---|
1613 | extension which directs GNU âsedâ to match the regular expression
|
---|
1614 | in âmulti-lineâ mode. The modifier causes â^â and â$â to match
|
---|
1615 | respectively (in addition to the normal behavior) the empty string
|
---|
1616 | after a newline, and the empty string before a newline. There are
|
---|
1617 | special character sequences (â\`â and â\'â) which always match the
|
---|
1618 | beginning or the end of the buffer. In addition, the period
|
---|
1619 | character does not match a new-line character in multi-line mode.
|
---|
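Two of the behaviors listed above, the empty regular expression and the ‘M’ modifier, can be sketched as follows. This assumes GNU ‘sed’; the ‘N’ command is used only to get a pattern space that contains an embedded newline.

```shell
# "//" reuses the last regular expression (/an/), so this acts as s/an/AN/.
reused=$(printf '%s\n' apple banana | sed -n '/an/s//AN/p')

# N joins both input lines into one pattern space: "one<newline>two".
# Without M, ^ matches only at the start of the pattern space.
plain=$(printf '%s\n' one two | sed -n 'N; /^two/p')
# With M, ^ also matches just after the embedded newline.
multi=$(printf '%s\n' one two | sed -n 'N; /^two/Mp')

printf 'reused=%s\n' "$reused"
printf 'plain=%s\n' "$plain"
printf 'multi=%s\n' "$multi"
```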
1620 |
|
---|
1621 | Regex addresses operate on the content of the current pattern space.
|
---|
1622 | If the pattern space is changed (for example with âs///â command) the
|
---|
1623 | regular expression matching will operate on the changed text.
|
---|
1624 |
|
---|
1625 | In the following example, automatic printing is disabled with ‘-n’.
|
---|
1626 | The ‘s/2/X/’ command replaces the first ‘2’ on each line with ‘X’. The command
|
---|
1627 | ‘/[0-9]/p’ matches lines with digits and prints them. Because the
|
---|
1628 | second line is changed before the ‘/[0-9]/’ address is tested, it will not match and
|
---|
1629 | will not be printed:
|
---|
1630 |
|
---|
1631 | $ seq 3 | sed -n 's/2/X/ ; /[0-9]/p'
|
---|
1632 | 1
|
---|
1633 | 3
|
---|
1634 |
|
---|
1635 | ---------- Footnotes ----------
|
---|
1636 |
|
---|
1637 | (1) There are of course many other ways to do the same, e.g.
|
---|
1638 | grep 'bash$' /etc/passwd
|
---|
1639 | awk -F: '$7 == "/bin/bash"' /etc/passwd
|
---|
1640 |
|
---|
1641 |
|
---|
1642 | File: sed.info, Node: Range Addresses, Next: Zero Address, Prev: Regexp Addresses, Up: sed addresses
|
---|
1643 |
|
---|
1644 | 4.4 Range Addresses
|
---|
1645 | ===================
|
---|
1646 |
|
---|
1647 | An address range is specified by two addresses separated
|
---|
1648 | by a comma (‘,’). An address range matches lines starting from where
|
---|
1649 | the first address matches, and continues until the second address
|
---|
1650 | matches (inclusively):
|
---|
1651 |
|
---|
1652 | $ seq 10 | sed -n '4,6p'
|
---|
1653 | 4
|
---|
1654 | 5
|
---|
1655 | 6
|
---|
1656 |
|
---|
1657 | If the second address is a REGEXP, then checking for the ending match
|
---|
1658 | will start with the line _following_ the line which matched the first
|
---|
1659 | address: a range will always span at least two lines (except of course
|
---|
1660 | if the input stream ends).
|
---|
1661 |
|
---|
1662 | $ seq 10 | sed -n '4,/[0-9]/p'
|
---|
1663 | 4
|
---|
1664 | 5
|
---|
1665 |
|
---|
1666 | If the second address is a NUMBER less than (or equal to) the line
|
---|
1667 | matching the first address, then only the one line is matched:
|
---|
1668 |
|
---|
1669 | $ seq 10 | sed -n '4,1p'
|
---|
1670 | 4
|
---|
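Numeric and regexp addresses can also be mixed within one range. The two commands below are a small sketch of that; ‘paste’ is used only to join the output lines for display.

```shell
# Number to regexp: from line 2 through the next line matching /4/.
num_to_re=$(seq 10 | sed -n '2,/4/p' | paste -sd ' ' -)
# Regexp to last line: from the first line matching /8/ to the end of input.
re_to_end=$(seq 10 | sed -n '/8/,$p' | paste -sd ' ' -)

echo "$num_to_re"
echo "$re_to_end"
```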
1671 |
|
---|
1672 | GNU âsedâ also supports some special two-address forms; all these are
|
---|
1673 | GNU extensions:
|
---|
1674 | â0,/REGEXP/â
|
---|
1675 | A line number of â0â can be used in an address specification like
|
---|
1676 | â0,/REGEXP/â so that âsedâ will try to match REGEXP in the first
|
---|
1677 | input line too. In other words, â0,/REGEXP/â is similar to
|
---|
1678 | â1,/REGEXP/â, except that if ADDR2 matches the very first line of
|
---|
1679 | input the â0,/REGEXP/â form will consider it to end the range,
|
---|
1680 | whereas the â1,/REGEXP/â form will match the beginning of its range
|
---|
1681 | and hence make the range span up to the _second_ occurrence of the
|
---|
1682 | regular expression.
|
---|
1683 |
|
---|
1684 | The following examples demonstrate the difference between starting
|
---|
1685 | with address 1 and 0:
|
---|
1686 |
|
---|
1687 | $ seq 10 | sed -n '1,/[0-9]/p'
|
---|
1688 | 1
|
---|
1689 | 2
|
---|
1690 |
|
---|
1691 | $ seq 10 | sed -n '0,/[0-9]/p'
|
---|
1692 | 1
|
---|
1693 |
|
---|
1694 | âADDR1,+Nâ
|
---|
1695 | Matches ADDR1 and the N lines following ADDR1.
|
---|
1696 |
|
---|
1697 | $ seq 10 | sed -n '6,+2p'
|
---|
1698 | 6
|
---|
1699 | 7
|
---|
1700 | 8
|
---|
1701 |
|
---|
1702 | ADDR1 can be a line number or a regular expression.
|
---|
1703 |
|
---|
1704 | âADDR1,~Nâ
|
---|
1705 | Matches ADDR1 and the lines following ADDR1 until the next line
|
---|
1706 | whose input line number is a multiple of N. The following command
|
---|
1707 | prints starting at line 6, until the next line which is a multiple
|
---|
1708 | of 4 (i.e. line 8):
|
---|
1709 |
|
---|
1710 | $ seq 10 | sed -n '6,~4p'
|
---|
1711 | 6
|
---|
1712 | 7
|
---|
1713 | 8
|
---|
1714 |
|
---|
1715 | ADDR1 can be a line number or a regular expression.
|
---|
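The special two-address forms above are sketched here with a regular expression as ADDR1, together with a common use of ‘0,/REGEXP/’: changing only the first matching line of the input. GNU ‘sed’ is assumed, and the sample data is illustrative.

```shell
# 0,/foo/ may end on line 1, so only the first "foo" line is changed.
first_only=$(printf '%s\n' foo foo foo | sed '0,/foo/s/foo/bar/' | paste -sd ' ' -)
# 1,/foo/ looks for the end from line 2 on, so two lines are changed.
from_one=$(printf '%s\n' foo foo foo | sed '1,/foo/s/foo/bar/' | paste -sd ' ' -)

# ADDR1 as a regexp: the matching line plus the one line after it.
plus_n=$(seq 10 | sed -n '/3/,+1p' | paste -sd ' ' -)
# ADDR1 as a regexp: from the match up to the next multiple of 5.
tilde_n=$(seq 10 | sed -n '/2/,~5p' | paste -sd ' ' -)

printf '%s\n' "$first_only" "$from_one" "$plus_n" "$tilde_n"
```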
1716 |
|
---|
1717 |
|
---|
1718 | File: sed.info, Node: Zero Address, Prev: Range Addresses, Up: sed addresses
|
---|
1719 |
|
---|
1720 | 4.5 Zero Address
|
---|
1721 | ================
|
---|
1722 |
|
---|
1723 | As a GNU ‘sed’ extension, the ‘0’ address can be used in two cases:
|
---|
1724 | 1. In a regex range address as ‘0,/REGEXP/’ (*note Zero Address
|
---|
1725 | Regex Range::).
|
---|
1726 | 2. With the ‘r’ command, inserting a file before the first line (*note
|
---|
1727 | Adding a header to multiple files::).
|
---|
1728 |
|
---|
1729 | Note that these are the only places where the ‘0’ address makes
|
---|
1730 | sense; commands which are given the ‘0’ address in any other way will
|
---|
1731 | give an error.
|
---|
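A quick way to observe this is to check the exit status of ‘sed’. This is a sketch: the exact wording of the error message is not shown because it may differ between versions.

```shell
# "0p" is not one of the two supported uses of address 0, so sed is
# expected to reject the script and exit with a non-zero status.
if seq 3 | sed -n '0p' >/dev/null 2>&1; then
  zero_alone=accepted
else
  zero_alone=rejected
fi

# The supported range form is accepted and ends on the first line.
zero_range=$(seq 3 | sed -n '0,/1/p')

echo "0p:     $zero_alone"
echo "0,/1/p: $zero_range"
```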
1732 |
|
---|
1733 |
|
---|
1734 | File: sed.info, Node: sed regular expressions, Next: advanced sed, Prev: sed addresses, Up: Top
|
---|
1735 |
|
---|
1736 | 5 Regular Expressions: selecting text
|
---|
1737 | *************************************
|
---|
1738 |
|
---|
1739 | * Menu:
|
---|
1740 |
|
---|
1741 | * Regular Expressions Overview:: Overview of Regular expression in âsedâ
|
---|
1742 | * BRE vs ERE:: Basic (BRE) and extended (ERE) regular expression
|
---|
1743 | syntax
|
---|
1744 | * BRE syntax:: Overview of basic regular expression syntax
|
---|
1745 | * ERE syntax:: Overview of extended regular expression syntax
|
---|
1746 | * Character Classes and Bracket Expressions::
|
---|
1747 | * regexp extensions:: Additional regular expression commands
|
---|
1748 | * Back-references and Subexpressions:: Back-references and Subexpressions
|
---|
1749 | * Escapes:: Specifying special characters
|
---|
1750 | * Locale Considerations:: Multibyte characters and locale considerations
|
---|
1751 |
|
---|
1752 |
|
---|
1753 | File: sed.info, Node: Regular Expressions Overview, Next: BRE vs ERE, Up: sed regular expressions
|
---|
1754 |
|
---|
1755 | 5.1 Overview of regular expression in âsedâ
|
---|
1756 | ===========================================
|
---|
1757 |
|
---|
1758 | To use ‘sed’ effectively, you should understand regular expressions
|
---|
1759 | (“regexp” for short). A regular expression is a pattern that is matched
|
---|
1760 | against a subject string from left to right. Most characters are
|
---|
1761 | “ordinary”: they stand for themselves in a pattern, and match the
|
---|
1762 | corresponding characters. Regular expressions in ‘sed’ are specified
|
---|
1763 | between two slashes.
|
---|
1764 |
|
---|
1765 | The following command prints lines containing the string âhelloâ:
|
---|
1766 |
|
---|
1767 | sed -n '/hello/p'
|
---|
1768 |
|
---|
1769 | The above example is equivalent to this âgrepâ command:
|
---|
1770 |
|
---|
1771 | grep 'hello'
|
---|
1772 |
|
---|
1773 | The power of regular expressions comes from the ability to include
|
---|
1774 | alternatives and repetitions in the pattern. These are encoded in the
|
---|
1775 | pattern by the use of âspecial charactersâ, which do not stand for
|
---|
1776 | themselves but instead are interpreted in some special way.
|
---|
1777 |
|
---|
1778 | The character â^â (caret) in a regular expression matches the
|
---|
1779 | beginning of the line. The character â.â (dot) matches any single
|
---|
1780 | character. The following âsedâ command matches and prints lines which
|
---|
1781 | start with the letter âbâ, followed by any single character, followed by
|
---|
1782 | the letter âdâ:
|
---|
1783 |
|
---|
1784 | $ printf "%s\n" abode bad bed bit bid byte body | sed -n '/^b.d/p'
|
---|
1785 | bad
|
---|
1786 | bed
|
---|
1787 | bid
|
---|
1788 | body
|
---|
1789 |
|
---|
1790 | The following sections explain the meaning and usage of special
|
---|
1791 | characters in regular expressions.
|
---|
1792 |
|
---|
1793 |
|
---|
1794 | File: sed.info, Node: BRE vs ERE, Next: BRE syntax, Prev: Regular Expressions Overview, Up: sed regular expressions
|
---|
1795 |
|
---|
1796 | 5.2 Basic (BRE) and extended (ERE) regular expression
|
---|
1797 | =====================================================
|
---|
1798 |
|
---|
1799 | Basic and extended regular expressions are two variations on the syntax
|
---|
1800 | of the specified pattern. Basic Regular Expression (BRE) syntax is the
|
---|
1801 | default in âsedâ (and similarly in âgrepâ). Use the POSIX-specified
|
---|
1802 | â-Eâ option (â-râ, â--regexp-extendedâ) to enable Extended Regular
|
---|
1803 | Expression (ERE) syntax.
|
---|
1804 |
|
---|
1805 | In GNU âsedâ, the only difference between basic and extended regular
|
---|
1806 | expressions is in the behavior of a few special characters: â?â, â+â,
|
---|
1807 | parentheses, braces (â{}â), and â|â.
|
---|
1808 |
|
---|
1809 | With basic (BRE) syntax, these characters do not have special meaning
|
---|
1810 | unless prefixed with a backslash (‘\’), while with extended (ERE) syntax
|
---|
1811 | it is reversed: these characters are special unless they are prefixed
|
---|
1812 | with a backslash (‘\’).
|
---|
1813 |
|
---|
1814 | Desired pattern      Basic (BRE) Syntax         Extended (ERE) Syntax
|
---|
1815 |
|
---|
1816 | --------------------------------------------------------------------------
|
---|
1817 | literal ‘+’ (plus    $ echo 'a+b=c' > foo       $ echo 'a+b=c' > foo
|
---|
1818 | sign)                $ sed -n '/a+b/p' foo      $ sed -E -n '/a\+b/p' foo
|
---|
1819 |                      a+b=c                      a+b=c
|
---|
1820 |
|
---|
1821 | One or more ‘a’      $ echo aab > foo           $ echo aab > foo
|
---|
1822 | characters           $ sed -n '/a\+b/p' foo     $ sed -E -n '/a+b/p' foo
|
---|
1823 | followed by ‘b’      aab                        aab
|
---|
1824 | (plus sign as
|
---|
1825 | special
|
---|
1826 | meta-character)
|
---|
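The same reversal applies to the other characters named above. The pairs below are a sketch, assuming GNU ‘sed’ (since ‘\|’ and ‘\?’ in basic syntax are GNU extensions); each pair is expected to give the same result.

```shell
# Alternation and grouping: backslashed in BRE, bare in ERE.
bre_alt=$(echo 'cat dog bird' | sed 's/\(cat\|dog\)/pet/g')
ere_alt=$(echo 'cat dog bird' | sed -E 's/(cat|dog)/pet/g')

# Optional character.
bre_opt=$(echo 'color colour' | sed 's/colou\?r/X/g')
ere_opt=$(echo 'color colour' | sed -E 's/colou?r/X/g')

# Interval (braces).
bre_int=$(echo 'aaa a' | sed 's/a\{3\}/X/')
ere_int=$(echo 'aaa a' | sed -E 's/a{3}/X/')

printf '%s\n' "$bre_alt" "$bre_opt" "$bre_int"
```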
1827 |
|
---|
1828 |
|
---|
1829 | File: sed.info, Node: BRE syntax, Next: ERE syntax, Prev: BRE vs ERE, Up: sed regular expressions
|
---|
1830 |
|
---|
1831 | 5.3 Overview of basic regular expression syntax
|
---|
1832 | ===============================================
|
---|
1833 |
|
---|
1834 | Here is a brief description of regular expression syntax as used in
|
---|
1835 | âsedâ.
|
---|
1836 |
|
---|
1837 | âCHARâ
|
---|
1838 | A single ordinary character matches itself.
|
---|
1839 |
|
---|
1840 | â*â
|
---|
1841 | Matches a sequence of zero or more instances of matches for the
|
---|
1842 | preceding regular expression, which must be an ordinary character,
|
---|
1843 | a special character preceded by â\â, a â.â, a grouped regexp (see
|
---|
1844 | below), or a bracket expression. As a GNU extension, a postfixed
|
---|
1845 | regular expression can also be followed by â*â; for example, âa**â
|
---|
1846 | is equivalent to âa*â. POSIX 1003.1-2001 says that â*â stands for
|
---|
1847 | itself when it appears at the start of a regular expression or
|
---|
1848 | subexpression, but many non-GNU implementations do not support this
|
---|
1849 | and portable scripts should instead use â\*â in these contexts.
|
---|
1850 | â.â
|
---|
1851 | Matches any character, including newline.
|
---|
1852 |
|
---|
1853 | â^â
|
---|
1854 | Matches the null string at beginning of the pattern space, i.e.
|
---|
1855 | what appears after the circumflex must appear at the beginning of
|
---|
1856 | the pattern space.
|
---|
1857 |
|
---|
1858 | In most scripts, pattern space is initialized to the content of
|
---|
1859 | each line (*note How ‘sed’ works: Execution Cycle.). So, it is a
|
---|
1860 | useful simplification to think of ‘^#include’ as matching only
|
---|
1861 | lines where ‘#include’ is the first thing on the line; if there is
|
---|
1862 | any preceding space, for example, the match fails. This
|
---|
1863 | simplification is valid as long as the original content of pattern
|
---|
1864 | space is not modified, for example with an âsâ command.
|
---|
1865 |
|
---|
1866 | â^â acts as a special character only at the beginning of the
|
---|
1867 | regular expression or subexpression (that is, after â\(â or â\|â).
|
---|
1868 | Portable scripts should avoid â^â at the beginning of a
|
---|
1869 | subexpression, though, as POSIX allows implementations that treat
|
---|
1870 | â^â as an ordinary character in that context.
|
---|
1871 |
|
---|
1872 | â$â
|
---|
1873 | It is the same as â^â, but refers to end of pattern space. â$â
|
---|
1874 | also acts as a special character only at the end of the regular
|
---|
1875 | expression or subexpression (that is, before â\)â or â\|â), and its
|
---|
1876 | use at the end of a subexpression is not portable.
|
---|
1877 |
|
---|
1878 | â[LIST]â
|
---|
1879 | â[^LIST]â
|
---|
1880 | Matches any single character in LIST: for example, â[aeiou]â
|
---|
1881 | matches all vowels. A list may include sequences like
|
---|
1882 | âCHAR1-CHAR2â, which matches any character between (inclusive)
|
---|
1883 | CHAR1 and CHAR2. *Note Character Classes and Bracket
|
---|
1884 | Expressions::.
|
---|
1885 |
|
---|
1886 | â\+â
|
---|
1887 | As â*â, but matches one or more. It is a GNU extension.
|
---|
1888 |
|
---|
1889 | â\?â
|
---|
1890 | As â*â, but only matches zero or one. It is a GNU extension.
|
---|
1891 |
|
---|
1892 | â\{I\}â
|
---|
1893 | As â*â, but matches exactly I sequences (I is a decimal integer;
|
---|
1894 | for portability, keep it between 0 and 255 inclusive).
|
---|
1895 |
|
---|
1896 | ‘\{I,J\}’
|
---|
1897 | Matches between I and J sequences, inclusive.
|
---|
1898 |
|
---|
1899 | ‘\{I,\}’
|
---|
1900 | Matches I or more sequences.
|
---|
1901 |
|
---|
1902 | ‘\(REGEXP\)’
|
---|
1903 | Groups the inner REGEXP as a whole. This is used to:
|
---|
1904 |
|
---|
1905 | • Apply postfix operators, like ‘\(abcd\)*’: this will search
|
---|
1906 | for zero or more whole sequences of ‘abcd’, while ‘abcd*’
|
---|
1907 | would search for ‘abc’ followed by zero or more occurrences of
|
---|
1908 | ‘d’. Note that support for ‘\(abcd\)*’ is required by POSIX
|
---|
1909 | 1003.1-2001, but many non-GNU implementations do not support
|
---|
1910 | it and hence it is not universally portable.
|
---|
1911 |
|
---|
1912 | • Use back references (see below).
|
---|
1913 |
|
---|
1914 | âREGEXP1\|REGEXP2â
|
---|
1915 | Matches either REGEXP1 or REGEXP2. Use parentheses to use complex
|
---|
1916 | alternative regular expressions. The matching process tries each
|
---|
1917 | alternative in turn, from left to right, and the first one that
|
---|
1918 | succeeds is used. It is a GNU extension.
|
---|
1919 |
|
---|
1920 | âREGEXP1REGEXP2â
|
---|
1921 | Matches the concatenation of REGEXP1 and REGEXP2. Concatenation
|
---|
1922 | binds more tightly than â\|â, â^â, and â$â, but less tightly than
|
---|
1923 | the other regular expression operators.
|
---|
1924 |
|
---|
1925 | â\DIGITâ
|
---|
1926 | Matches the DIGIT-th â\(...\)â parenthesized subexpression in the
|
---|
1927 | regular expression. This is called a âback referenceâ.
|
---|
1928 | Subexpressions are implicitly numbered by counting occurrences of
|
---|
1929 | â\(â left-to-right.
|
---|
1930 |
|
---|
1931 | â\nâ
|
---|
1932 | Matches the newline character.
|
---|
1933 |
|
---|
1934 | â\CHARâ
|
---|
1935 | Matches CHAR, where CHAR is one of â$â, â*â, â.â, â[â, â\â, or â^â.
|
---|
1936 | Note that the only C-like backslash sequences that you can portably
|
---|
1937 | assume to be interpreted are â\nâ and â\\â; in particular â\tâ is
|
---|
1938 | not portable, and matches a âtâ under most implementations of
|
---|
1939 | âsedâ, rather than a tab character.
|
---|
1940 |
|
---|
1941 | Note that the regular expression matcher is greedy, i.e., matches are
|
---|
1942 | attempted from left to right and, if two or more matches are possible
|
---|
1943 | starting at the same character, it selects the longest.
|
---|
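The longest-match rule is easiest to see with ‘.*’. In this sketch (the sample text is illustrative), the first command swallows everything from the first ‘<’ to the last ‘>’, while a negated bracket expression keeps the match inside one pair of angle brackets:

```shell
# ".*" is greedy: the match runs from the first "<" to the last ">".
greedy=$(echo 'a<b>c<d>e' | sed 's/<.*>/X/')
# "[^>]*" cannot cross a ">", so only the first "<b>" is matched.
narrow=$(echo 'a<b>c<d>e' | sed 's/<[^>]*>/X/')

echo "$greedy"
echo "$narrow"
```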
1944 |
|
---|
1945 | Examples:
|
---|
1946 | âabcdefâ
|
---|
1947 | Matches âabcdefâ.
|
---|
1948 |
|
---|
1949 | âa*bâ
|
---|
1950 | Matches zero or more âaâs followed by a single âbâ. For example,
|
---|
1951 | âbâ or âaaaaabâ.
|
---|
1952 |
|
---|
1953 | âa\?bâ
|
---|
1954 | Matches âbâ or âabâ.
|
---|
1955 |
|
---|
1956 | âa\+b\+â
|
---|
1957 | Matches one or more âaâs followed by one or more âbâs: âabâ is the
|
---|
1958 | shortest possible match, but other examples are âaaaabâ or âabbbbbâ
|
---|
1959 | or âaaaaaabbbbbbbâ.
|
---|
1960 |
|
---|
1961 | â.*â
|
---|
1962 | â.\+â
|
---|
1963 | These two both match all the characters in a string; however, the
|
---|
1964 | first matches every string (including the empty string), while the
|
---|
1965 | second matches only strings containing at least one character.
|
---|
1966 |
|
---|
1967 | â^main.*(.*)â
|
---|
1968 | This matches a string starting with âmainâ, followed by an opening
|
---|
1969 | and closing parenthesis. The ânâ, â(â and â)â need not be
|
---|
1970 | adjacent.
|
---|
1971 |
|
---|
1972 | â^#â
|
---|
1973 | This matches a string beginning with â#â.
|
---|
1974 |
|
---|
1975 | â\\$â
|
---|
1976 | This matches a string ending with a single backslash. The regexp
|
---|
1977 | contains two backslashes for escaping.
|
---|
1978 |
|
---|
1979 | â\$â
|
---|
1980 | Instead, this matches a string consisting of a single dollar sign,
|
---|
1981 | because it is escaped.
|
---|
1982 |
|
---|
1983 | â[a-zA-Z0-9]â
|
---|
1984 | In the C locale, this matches any ASCII letters or digits.
|
---|
1985 |
|
---|
1986 | â[^ â<TAB>â]\+â
|
---|
1987 | (Here â<TAB>â stands for a single tab character.) This matches a
|
---|
1988 | string of one or more characters, none of which is a space or a
|
---|
1989 | tab. Usually this means a word.
|
---|
1990 |
|
---|
1991 | â^\(.*\)\n\1$â
|
---|
1992 | This matches a string consisting of two equal substrings separated
|
---|
1993 | by a newline.
|
---|
1994 |
|
---|
1995 | â.\{9\}A$â
|
---|
1996 | This matches nine characters followed by an âAâ at the end of a
|
---|
1997 | line.
|
---|
1998 |
|
---|
1999 | â^.\{15\}Aâ
|
---|
2000 | This matches the start of a string that contains 16 characters, the
|
---|
2001 | last of which is an âAâ.
|
---|
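Two of the patterns from the list above can be tried directly. This is a sketch assuming GNU ‘sed’; the ‘N’ command is used only to bring two input lines into the pattern space so that ‘\n’ has something to match.

```shell
# Two equal lines joined by N match ^\(.*\)\n\1$ ...
same=$(printf '%s\n' ab ab | sed -n 'N; /^\(.*\)\n\1$/p' | paste -sd ' ' -)
# ... while two different lines do not.
different=$(printf '%s\n' ab cd | sed -n 'N; /^\(.*\)\n\1$/p')

# .\{9\}A$ needs nine characters immediately before a final "A":
# the first line has nine, the second only eight.
nine=$(printf '%s\n' 123456789A 12345678A | sed -n '/.\{9\}A$/p')

printf 'same=%s\ndifferent=%s\nnine=%s\n' "$same" "$different" "$nine"
```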
2002 |
|
---|
2003 |
|
---|
2004 | File: sed.info, Node: ERE syntax, Next: Character Classes and Bracket Expressions, Prev: BRE syntax, Up: sed regular expressions
|
---|
2005 |
|
---|
2006 | 5.4 Overview of extended regular expression syntax
|
---|
2007 | ==================================================
|
---|
2008 |
|
---|
2009 | The only difference between basic and extended regular expressions is in
|
---|
2010 | the behavior of a few characters: ‘?’, ‘+’, parentheses, braces (‘{}’),
|
---|
2011 | and ‘|’. While basic regular expressions require these to be escaped if
|
---|
2012 | you want them to behave as special characters, when using extended
|
---|
2013 | regular expressions you must escape them if you want them _to match a
|
---|
2014 | literal character_. ‘|’ is special here because ‘\|’ is a GNU extension;
|
---|
2015 | standard basic regular expressions do not provide its functionality.
|
---|
2016 |
|
---|
2017 | Examples:
|
---|
2018 | âabc?â
|
---|
2019 | becomes âabc\?â when using extended regular expressions. It
|
---|
2020 | matches the literal string âabc?â.
|
---|
2021 |
|
---|
2022 | âc\+â
|
---|
2023 | becomes âc+â when using extended regular expressions. It matches
|
---|
2024 | one or more âcâs.
|
---|
2025 |
|
---|
2026 | âa\{3,\}â
|
---|
2027 | becomes âa{3,}â when using extended regular expressions. It
|
---|
2028 | matches three or more âaâs.
|
---|
2029 |
|
---|
2030 | â\(abc\)\{2,3\}â
|
---|
2031 | becomes â(abc){2,3}â when using extended regular expressions. It
|
---|
2032 | matches either âabcabcâ or âabcabcabcâ.
|
---|
2033 |
|
---|
2034 | â\(abc*\)\1â
|
---|
2035 | becomes â(abc*)\1â when using extended regular expressions.
|
---|
2036 | Backreferences must still be escaped when using extended regular
|
---|
2037 | expressions.
|
---|
2038 |
|
---|
2039 | âa\|bâ
|
---|
2040 | becomes âa|bâ when using extended regular expressions. It matches
|
---|
2041 | âaâ or âbâ.
|
---|
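The extended forms from the examples above can be checked with ‘-E’. The inputs in this sketch are illustrative.

```shell
literal_q=$(echo 'abc?' | sed -E 's/abc\?/X/')           # \? is a literal "?"
one_plus=$(echo 'accc' | sed -E 's/c+/X/')               # one or more "c"
interval=$(echo 'abcabcabc' | sed -E 's/(abc){2,3}/X/')  # longest match: all three
backref=$(echo 'abab' | sed -E 's/(ab)\1/X/')            # \1 keeps its backslash

printf '%s\n' "$literal_q" "$one_plus" "$interval" "$backref"
```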
2042 |
|
---|
2043 |
|
---|
2044 | File: sed.info, Node: Character Classes and Bracket Expressions, Next: regexp extensions, Prev: ERE syntax, Up: sed regular expressions
|
---|
2045 |
|
---|
2046 | 5.5 Character Classes and Bracket Expressions
|
---|
2047 | =============================================
|
---|
2048 |
|
---|
2049 | A “bracket expression” is a list of characters enclosed by ‘[’ and ‘]’.
|
---|
2050 | It matches any single character in that list; if the first character of
|
---|
2051 | the list is the caret ‘^’, then it matches any character *not* in the
|
---|
2052 | list. For example, the following command replaces the strings ‘gray’ or
|
---|
2053 | ‘grey’ with ‘blue’:
|
---|
2054 |
|
---|
2055 | sed 's/gr[ae]y/blue/'
|
---|
2056 |
|
---|
2057 | Bracket expressions can be used in both *note basic: BRE syntax. and
|
---|
2058 | *note extended: ERE syntax. regular expressions (that is, with or
|
---|
2059 | without the â-Eâ/â-râ options).
|
---|
2060 |
|
---|
2061 | Within a bracket expression, a ârange expressionâ consists of two
|
---|
2062 | characters separated by a hyphen. It matches any single character that
|
---|
2063 | sorts between the two characters, inclusive. In the default C locale,
|
---|
2064 | the sorting sequence is the native character order; for example, â[a-d]â
|
---|
2065 | is equivalent to â[abcd]â.
|
---|
2066 |
|
---|
2067 | Finally, certain named classes of characters are predefined within
|
---|
2068 | bracket expressions, as follows.
|
---|
2069 |
|
---|
2070 | These named classes must be used _inside_ brackets themselves.
|
---|
2071 | Correct usage:
|
---|
2072 | $ echo 1 | sed 's/[[:digit:]]/X/'
|
---|
2073 | X
|
---|
2074 |
|
---|
2075 | Incorrect usage is rejected by newer âsedâ versions. Older versions
|
---|
2076 | accepted it but treated it as a single bracket expression (which is
|
---|
2077 | equivalent to â[dgit:]â, that is, only the characters D/G/I/T/:):
|
---|
2078 | # current GNU sed versions - incorrect usage rejected
|
---|
2079 | $ echo 1 | sed 's/[:digit:]/X/'
|
---|
2080 | sed: character class syntax is [[:space:]], not [:space:]
|
---|
2081 |
|
---|
2082 | # older GNU sed versions
|
---|
2083 | $ echo 1 | sed 's/[:digit:]/X/'
|
---|
2084 | 1
|
---|
2085 |
|
---|
2086 | â[:alnum:]â
|
---|
2087 | Alphanumeric characters: â[:alpha:]â and â[:digit:]â; in the âCâ
|
---|
2088 | locale and ASCII character encoding, this is the same as
|
---|
2089 | â[0-9A-Za-z]â.
|
---|
2090 |
|
---|
2091 | â[:alpha:]â
|
---|
2092 | Alphabetic characters: â[:lower:]â and â[:upper:]â; in the âCâ
|
---|
2093 | locale and ASCII character encoding, this is the same as
|
---|
2094 | â[A-Za-z]â.
|
---|
2095 |
|
---|
2096 | â[:blank:]â
|
---|
2097 | Blank characters: space and tab.
|
---|
2098 |
|
---|
2099 | â[:cntrl:]â
|
---|
2100 | Control characters. In ASCII, these characters have octal codes
|
---|
2101 | 000 through 037, and 177 (DEL). In other character sets, these are
|
---|
2102 | the equivalent characters, if any.
|
---|
2103 |
|
---|
2104 | â[:digit:]â
|
---|
2105 | Digits: â0 1 2 3 4 5 6 7 8 9â.
|
---|
2106 |
|
---|
2107 | â[:graph:]â
|
---|
2108 | Graphical characters: â[:alnum:]â and â[:punct:]â.
|
---|
2109 |
|
---|
2110 | â[:lower:]â
|
---|
2111 | Lower-case letters; in the âCâ locale and ASCII character encoding,
|
---|
2112 | this is âa b c d e f g h i j k l m n o p q r s t u v w x y zâ.
|
---|
2113 |
|
---|
2114 | â[:print:]â
|
---|
2115 | Printable characters: â[:alnum:]â, â[:punct:]â, and space.
|
---|
2116 |
|
---|
2117 | â[:punct:]â
|
---|
2118 | Punctuation characters; in the âCâ locale and ASCII character
|
---|
2119 | encoding, this is â! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \
|
---|
2120 | ] ^ _ ` { | } ~â.
|
---|
2121 |
|
---|
2122 | â[:space:]â
|
---|
2123 | Space characters: in the âCâ locale, this is tab, newline, vertical
|
---|
2124 | tab, form feed, carriage return, and space.
|
---|
2125 |
|
---|
2126 | â[:upper:]â
|
---|
2127 | Upper-case letters: in the âCâ locale and ASCII character encoding,
|
---|
2128 | this is âA B C D E F G H I J K L M N O P Q R S T U V W X Y Zâ.
|
---|
2129 |
|
---|
2130 | â[:xdigit:]â
|
---|
2131 | Hexadecimal digits: â0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e fâ.
|
---|
2132 |
|
---|
2133 | Note that the brackets in these class names are part of the symbolic
|
---|
2134 | names, and must be included in addition to the brackets delimiting the
|
---|
2135 | bracket expression.
|
---|
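Several named classes can share one bracket expression, and they combine with the usual repetition operators. This is a small sketch with made-up input:

```shell
# Upper-case letters and digits in a single bracket expression.
masked=$(echo 'abc 123 DEF' | sed 's/[[:upper:][:digit:]]/#/g')
# Collapse each run of spaces and tabs into one space.
squeezed=$(printf 'a \t b\n' | sed 's/[[:blank:]]\{1,\}/ /g')

echo "$masked"
echo "$squeezed"
```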
2136 |
|
---|
2137 | Most meta-characters lose their special meaning inside bracket
|
---|
2138 | expressions:
|
---|
2139 |
|
---|
2140 | â]â
|
---|
2141 | ends the bracket expression if it’s not the first list item. So,
|
---|
2142 | if you want to make the ‘]’ character a list item, you must put it
|
---|
2143 | first.
|
---|
2144 |
|
---|
2145 | â-â
|
---|
2146 | represents the range if it’s not first or last in a list or the
|
---|
2147 | ending point of a range.
|
---|
2148 |
|
---|
2149 | â^â
|
---|
2150 | represents the characters not in the list when it appears first.
|
---|
2151 | To make the ‘^’ character a list item, place it anywhere but first.
|
---|
2152 |
|
---|
2153 | TODO: incorporate this paragraph (copied verbatim from BRE section).
|
---|
2154 |
|
---|
2155 | The characters ‘$’, ‘*’, ‘.’, ‘[’, and ‘\’ are normally not special
|
---|
2156 | within LIST. For example, ‘[\*]’ matches either ‘\’ or ‘*’, because the
|
---|
2157 | ‘\’ is not special here. However, strings like ‘[.ch.]’, ‘[=a=]’, and
|
---|
2158 | ‘[:space:]’ are special within LIST and represent collating symbols,
|
---|
2159 | equivalence classes, and character classes, respectively, and ‘[’ is
|
---|
2160 | therefore special within LIST when it is followed by ‘.’, ‘=’, or ‘:’.
|
---|
2161 | Also, when not in ‘POSIXLY_CORRECT’ mode, special escapes like ‘\n’ and
|
---|
2162 | ‘\t’ are recognized within LIST. *Note Escapes::.
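The placement rules for ‘]’, ‘^’, and ‘-’, and the literal reading of ‘\’ and ‘*’ inside a list, can be tried with a short sketch. It assumes GNU sed in the C locale; the input strings and variable names are invented for illustration.

```shell
# ']' is literal when listed first, '^' when not first, '-' when last.
special=$(printf '%s\n' 'a]b-c^d' | LC_ALL=C sed 's/[]^-]/X/g')
echo "$special"    # prints: aXbXcXd

# Inside a bracket expression, '\' and '*' are ordinary characters.
literal=$(printf '%s\n' 'a\b*c' | LC_ALL=C sed 's/[\*]/X/g')
echo "$literal"    # prints: aXbXc
```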
|
---|
2163 |
|
---|
2164 | â[.â
|
---|
2165 | represents the open collating symbol.
|
---|
2166 |
|
---|
2167 | â.]â
|
---|
2168 | represents the close collating symbol.
|
---|
2169 |
|
---|
2170 | â[=â
|
---|
2171 | represents the open equivalence class.
|
---|
2172 |
|
---|
2173 | â=]â
|
---|
2174 | represents the close equivalence class.
|
---|
2175 |
|
---|
2176 | â[:â
|
---|
2177 | represents the open character class symbol, and should be followed
|
---|
2178 | by a valid character class name.
|
---|
2179 |
|
---|
2180 | â:]â
|
---|
2181 | represents the close character class symbol.
|
---|
2182 |
|
---|
2183 |
|
---|
2184 | File: sed.info, Node: regexp extensions, Next: Back-references and Subexpressions, Prev: Character Classes and Bracket Expressions, Up: sed regular expressions
|
---|
2185 |
|
---|
2186 | 5.6 regular expression extensions
|
---|
2187 | =================================
|
---|
2188 |
|
---|
2189 | The following sequences have special meaning inside regular expressions
|
---|
2190 | (used in *note addresses: Regexp Addresses. and the âsâ command).
|
---|
2191 |
|
---|
2192 | These can be used in both *note basic: BRE syntax. and *note
|
---|
2193 | extended: ERE syntax. regular expressions (that is, with or without the
|
---|
2194 | â-Eâ/â-râ options).
|
---|
2195 |
|
---|
2196 | â\wâ
|
---|
2197 | Matches any âwordâ character. A âwordâ character is any letter or
|
---|
2198 | digit or the underscore character.
|
---|
2199 |
|
---|
2200 | $ echo "abc %-= def." | sed 's/\w/X/g'
|
---|
2201 | XXX %-= XXX.
|
---|
2202 |
|
---|
2203 | â\Wâ
|
---|
2204 | Matches any ânon-wordâ character.
|
---|
2205 |
|
---|
2206 | $ echo "abc %-= def." | sed 's/\W/X/g'
|
---|
2207 | abcXXXXXdefX
|
---|
2208 |
|
---|
2209 | â\bâ
|
---|
2210 | Matches a word boundary; that is, it matches if the character to the
|
---|
2211 | left is a “word” character and the character to the right is a
|
---|
2212 | “non-word” character, or vice-versa.
|
---|
2213 |
|
---|
2214 | $ echo "abc %-= def." | sed 's/\b/X/g'
|
---|
2215 | XabcX %-= XdefX.
|
---|
2216 |
|
---|
2217 | â\Bâ
|
---|
2218 | Matches everywhere but on a word boundary; that is, it matches if
|
---|
2219 | the character to the left and the character to the right are either
|
---|
2220 | both “word” characters or both “non-word” characters.
|
---|
2221 |
|
---|
2222 | $ echo "abc %-= def." | sed 's/\B/X/g'
|
---|
2223 | aXbXc X%X-X=X dXeXf.X
|
---|
2224 |
|
---|
2225 | â\sâ
|
---|
2226 | Matches whitespace characters (spaces and tabs). Newlines embedded
|
---|
2227 | in the pattern/hold spaces will also match:
|
---|
2228 |
|
---|
2229 | $ echo "abc %-= def." | sed 's/\s/X/g'
|
---|
2230 | abcX%-=Xdef.
|
---|
2231 |
|
---|
2232 | â\Sâ
|
---|
2233 | Matches non-whitespace characters.
|
---|
2234 |
|
---|
2235 | $ echo "abc %-= def." | sed 's/\S/X/g'
|
---|
2236 | XXX XXX XXXX
|
---|
2237 |
|
---|
2238 | â\<â
|
---|
2239 | Matches the beginning of a word.
|
---|
2240 |
|
---|
2241 | $ echo "abc %-= def." | sed 's/\</X/g'
|
---|
2242 | Xabc %-= Xdef.
|
---|
2243 |
|
---|
2244 | â\>â
|
---|
2245 | Matches the end of a word.
|
---|
2246 |
|
---|
2247 | $ echo "abc %-= def." | sed 's/\>/X/g'
|
---|
2248 | abcX %-= defX.
|
---|
2249 |
|
---|
2250 | â\`â
|
---|
2251 | Matches only at the start of pattern space. This is different from
|
---|
2252 | â^â in multi-line mode.
|
---|
2253 |
|
---|
2254 | Compare the following two examples:
|
---|
2255 |
|
---|
2256 | $ printf "a\nb\nc\n" | sed 'N;N;s/^/X/gm'
|
---|
2257 | Xa
|
---|
2258 | Xb
|
---|
2259 | Xc
|
---|
2260 |
|
---|
2261 | $ printf "a\nb\nc\n" | sed 'N;N;s/\`/X/gm'
|
---|
2262 | Xa
|
---|
2263 | b
|
---|
2264 | c
|
---|
2265 |
|
---|
2266 | â\'â
|
---|
2267 | Matches only at the end of pattern space. This is different from
|
---|
2268 | â$â in multi-line mode.
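Two of the extensions above can be sketched briefly: whole-word replacement with ‘\<’ and ‘\>’, and the difference between ‘$’ and ‘\'’ in multi-line mode (the counterpart of the ‘\`’ comparison). The sketch assumes GNU sed; the sample text and variable names are made up. Double quotes are used for the last script only so that the single quote can appear inside it.

```shell
# Replace "cat" only where it stands as a whole word.
whole=$(echo 'cat concat cats cat' | sed 's/\<cat\>/dog/g')
echo "$whole"    # prints: dog concat cats dog

# '$' with the M flag matches at the end of every line in the buffer...
each=$(printf 'a\nb\nc\n' | sed 'N;N;s/$/X/gm')
printf '%s\n' "$each"
# prints:
# aX
# bX
# cX

# ...whereas \' matches only at the end of the whole pattern space.
last=$(printf 'a\nb\nc\n' | sed "N;N;s/\\'/X/gm")
printf '%s\n' "$last"
# prints:
# a
# b
# cX
```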
|
---|
2269 |
|
---|
2270 |
|
---|
2271 | File: sed.info, Node: Back-references and Subexpressions, Next: Escapes, Prev: regexp extensions, Up: sed regular expressions
|
---|
2272 |
|
---|
2273 | 5.7 Back-references and Subexpressions
|
---|
2274 | ======================================
|
---|
2275 |
|
---|
2276 | “Back-references” are regular expression commands which refer to a
|
---|
2277 | previous part of the matched regular expression. Back-references are
|
---|
2278 | specified with backslash and a single digit (e.g. â\1â). The part of
|
---|
2279 | the regular expression they refer to is called a âsubexpressionâ, and is
|
---|
2280 | designated with parentheses.
|
---|
2281 |
|
---|
2282 | Back-references and subexpressions are used in two cases: in the
|
---|
2283 | regular expression search pattern, and in the REPLACEMENT part of the
|
---|
2284 | âsâ command (*note Regular Expression Addresses: Regexp Addresses. and
|
---|
2285 | *note The "s" Command::).
|
---|
2286 |
|
---|
2287 | In a regular expression pattern, back-references are used to match
|
---|
2288 | the same content as a previously matched subexpression. In the
|
---|
2289 | following example, the subexpression is â.â - any single character
|
---|
2290 | (being surrounded by parentheses makes it a subexpression). The
|
---|
2291 | back-reference â\1â asks to match the same content (same character) as
|
---|
2292 | the sub-expression.
|
---|
2293 |
|
---|
2294 | The command below matches words starting with any character, followed
|
---|
2295 | by the letter âoâ, followed by the same character as the first.
|
---|
2296 |
|
---|
2297 | $ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
|
---|
2298 | bob
|
---|
2299 | mom
|
---|
2300 | non
|
---|
2301 | pop
|
---|
2302 | sos
|
---|
2303 | tot
|
---|
2304 | wow
|
---|
2305 |
|
---|
2306 | Multiple subexpressions are automatically numbered from
|
---|
2307 | left-to-right. This command searches for 6-letter palindromes (the
|
---|
2308 | first three letters are 3 subexpressions, followed by 3 back-references
|
---|
2309 | in reverse order):
|
---|
2310 |
|
---|
2311 | $ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
|
---|
2312 | redder
|
---|
2313 |
|
---|
2314 | In the âsâ command, back-references can be used in the REPLACEMENT
|
---|
2315 | part to refer back to subexpressions in the REGEXP part.
|
---|
2316 |
|
---|
2317 | The following example uses two subexpressions in the regular
|
---|
2318 | expression to match two space-separated words. The back-references in
|
---|
2319 | the REPLACEMENT part print the words in a different order:
|
---|
2320 |
|
---|
2321 | $ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
|
---|
2322 | The name is Bond, James Bond.
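The examples in this section use ‘-E’. In basic regular expression syntax the subexpressions are delimited by ‘\(’ and ‘\)’ instead, while back-references are written the same way. A minimal sketch, assuming GNU sed and made-up input:

```shell
# BRE: subexpressions use \( \); the replacement swaps the two words.
swapped=$(echo 'James Bond' | sed 's/\(.*\) \(.*\)/\2, \1/')
echo "$swapped"    # prints: Bond, James

# ERE: print only the lines that contain a doubled character.
doubled=$(printf '%s\n' hello help | sed -E -n '/(.)\1/p')
echo "$doubled"    # prints: hello
```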
|
---|
2323 |
|
---|
2324 | When used with alternation, if the group does not participate in the
|
---|
2325 | match then the back-reference makes the whole match fail. For example,
|
---|
2326 | âa(.)|b\1â will not match âbaâ. When multiple regular expressions are
|
---|
2327 | given with â-eâ or from a file (â-f FILEâ), back-references are local to
|
---|
2328 | each expression.
|
---|
2329 |
|
---|
2330 |
|
---|
2331 | File: sed.info, Node: Escapes, Next: Locale Considerations, Prev: Back-references and Subexpressions, Up: sed regular expressions
|
---|
2332 |
|
---|
2333 | 5.8 Escape Sequences - specifying special characters
|
---|
2334 | ====================================================
|
---|
2335 |
|
---|
2336 | Until this chapter, we have only encountered escapes of the form â\^â,
|
---|
2337 | which tell âsedâ not to interpret the circumflex as a special character,
|
---|
2338 | but rather to take it literally. For example, â\*â matches a single
|
---|
2339 | asterisk rather than zero or more backslashes.
|
---|
2340 |
|
---|
2341 | This chapter introduces another kind of escape(1), that is, escapes
|
---|
2342 | that are applied to a character or sequence of characters that
|
---|
2343 | ordinarily are taken literally, and that âsedâ replaces with a special
|
---|
2344 | character. This provides a way of encoding non-printable characters in
|
---|
2345 | patterns in a visible manner. There is no restriction on the appearance
|
---|
2346 | of non-printing characters in a âsedâ script but when a script is being
|
---|
2347 | prepared in the shell or by text editing, it is usually easier to use
|
---|
2348 | one of the following escape sequences than the binary character it
|
---|
2349 | represents.
|
---|
2350 |
|
---|
2351 | The list of these escapes is:
|
---|
2352 |
|
---|
2353 | â\aâ
|
---|
2354 | Produces or matches a BEL character, that is an âalertâ (ASCII 7).
|
---|
2355 |
|
---|
2356 | â\fâ
|
---|
2357 | Produces or matches a form feed (ASCII 12).
|
---|
2358 |
|
---|
2359 | â\nâ
|
---|
2360 | Produces or matches a newline (ASCII 10).
|
---|
2361 |
|
---|
2362 | â\râ
|
---|
2363 | Produces or matches a carriage return (ASCII 13).
|
---|
2364 |
|
---|
2365 | â\tâ
|
---|
2366 | Produces or matches a horizontal tab (ASCII 9).
|
---|
2367 |
|
---|
2368 | â\vâ
|
---|
2369 | Produces or matches a so-called “vertical tab” (ASCII 11).
|
---|
2370 |
|
---|
2371 | â\cXâ
|
---|
2372 | Produces or matches âCONTROL-Xâ, where X is any character. The
|
---|
2373 | precise effect of â\cXâ is as follows: if X is a lower case letter,
|
---|
2374 | it is converted to upper case. Then bit 6 of the character (hex
|
---|
2375 | 40) is inverted. Thus â\czâ becomes hex 1A, but â\c{â becomes hex
|
---|
2376 | 3B, while â\c;â becomes hex 7B.
|
---|
2377 |
|
---|
2378 | â\dXXXâ
|
---|
2379 | Produces or matches a character whose decimal ASCII value is XXX.
|
---|
2380 |
|
---|
2381 | â\oXXXâ
|
---|
2382 | Produces or matches a character whose octal ASCII value is XXX.
|
---|
2383 |
|
---|
2384 | â\xXXâ
|
---|
2385 | Produces or matches a character whose hexadecimal ASCII value is
|
---|
2386 | XX.
|
---|
2387 |
|
---|
2388 | ‘\b’ (backspace) was omitted because of the conflict with the
|
---|
2389 | existing “word boundary” meaning.
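A few of these escapes in use, as a hedged sketch assuming GNU sed (they are GNU extensions apart from ‘\n’); the inputs and variable names are illustrative only. The letter ‘B’ is decimal 66, octal 102, and hexadecimal 42, so the last three commands name the same character in three ways.

```shell
# \t in the regexp matches a tab character.
tab=$(printf 'a\tb\n' | sed 's/\t/<TAB>/')
echo "$tab"    # prints: a<TAB>b

# 'B' written as a decimal, octal, and hexadecimal escape.
dec=$(echo ABC | sed 's/\d066/-/')
oct=$(echo ABC | sed 's/\o102/-/')
hex=$(echo ABC | sed 's/\x42/-/')
echo "$dec $oct $hex"    # prints: A-C A-C A-C
```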
|
---|
2390 |
|
---|
2391 | 5.8.1 Escaping Precedence
|
---|
2392 | -------------------------
|
---|
2393 |
|
---|
2394 | GNU ‘sed’ processes escape sequences _before_ passing the text on to the
|
---|
2395 | regular-expression matching of the ‘s///’ command and address matching.
|
---|
2396 | Thus the following two commands are equivalent (â0x5eâ is the
|
---|
2397 | hexadecimal ASCII value of the character â^â):
|
---|
2398 |
|
---|
2399 | $ echo 'a^c' | sed 's/^/b/'
|
---|
2400 | ba^c
|
---|
2401 |
|
---|
2402 | $ echo 'a^c' | sed 's/\x5e/b/'
|
---|
2403 | ba^c
|
---|
2404 |
|
---|
2405 | As are the following (â0x5bâ,â0x5dâ are the hexadecimal ASCII values
|
---|
2406 | of â[â,â]â, respectively):
|
---|
2407 |
|
---|
2408 | $ echo abc | sed 's/[a]/x/'
|
---|
2409 | xbc
|
---|
2410 | $ echo abc | sed 's/\x5ba\x5d/x/'
|
---|
2411 | xbc
|
---|
2412 |
|
---|
2413 | However, it is recommended to avoid such special characters due to
|
---|
2414 | unexpected edge-cases. For example, the following are not equivalent:
|
---|
2415 |
|
---|
2416 | $ echo 'a^c' | sed 's/\^/b/'
|
---|
2417 | abc
|
---|
2418 |
|
---|
2419 | $ echo 'a^c' | sed 's/\\\x5e/b/'
|
---|
2420 | a^c
|
---|
2421 |
|
---|
2422 | ---------- Footnotes ----------
|
---|
2423 |
|
---|
2424 | (1) All the escapes introduced here are GNU extensions, with the
|
---|
2425 | exception of â\nâ. In basic regular expression mode, setting
|
---|
2426 | âPOSIXLY_CORRECTâ disables them inside bracket expressions.
|
---|
2427 |
|
---|
2428 |
|
---|
2429 | File: sed.info, Node: Locale Considerations, Prev: Escapes, Up: sed regular expressions
|
---|
2430 |
|
---|
2431 | 5.9 Multibyte characters and Locale Considerations
|
---|
2432 | ==================================================
|
---|
2433 |
|
---|
2434 | GNU âsedâ processes valid multibyte characters in multibyte locales
|
---|
2435 | (e.g. âUTF-8â). (1)
|
---|
2436 |
|
---|
2437 | The following example uses the Greek letter Capital Sigma (Σ, Unicode
|
---|
2438 | code point â0x03A3â). In a âUTF-8â locale, âsedâ correctly processes
|
---|
2439 | the Sigma as one character despite it being 2 octets (bytes):
|
---|
2440 |
|
---|
2441 | $ locale | grep LANG
|
---|
2442 | LANG=en_US.UTF-8
|
---|
2443 |
|
---|
2444 | $ printf 'a\u03A3b'
|
---|
2445 | aΣb
|
---|
2446 |
|
---|
2447 | $ printf 'a\u03A3b' | sed 's/./X/g'
|
---|
2448 | XXX
|
---|
2449 |
|
---|
2450 | $ printf 'a\u03A3b' | od -tx1 -An
|
---|
2451 | 61 ce a3 62
|
---|
2452 |
|
---|
2453 | To force âsedâ to process octets separately, use the âCâ locale (also
|
---|
2454 | known as the âPOSIXâ locale):
|
---|
2455 |
|
---|
2456 | $ printf 'a\u03A3b' | LC_ALL=C sed 's/./X/g'
|
---|
2457 | XXXX
|
---|
2458 |
|
---|
2459 | 5.9.1 Invalid multibyte characters
|
---|
2460 | ----------------------------------
|
---|
2461 |
|
---|
2462 | âsedââs regular expressions _do not_ match invalid multibyte sequences
|
---|
2463 | in a multibyte locale.
|
---|
2464 |
|
---|
2465 | In the following examples, the ascii value â0xCEâ is an incomplete
|
---|
2466 | multibyte character (shown here as ᅵ). The regular expression â.â does
|
---|
2467 | not match it:
|
---|
2468 |
|
---|
2469 | $ printf 'a\xCEb\n'
|
---|
2470 | aᅵe
|
---|
2471 |
|
---|
2472 | $ printf 'a\xCEb\n' | sed 's/./X/g'
|
---|
2473 | XᅵX
|
---|
2474 |
|
---|
2475 | $ printf 'a\xCEc\n' | sed 's/./X/g' | od -tx1c -An
|
---|
2476 | 58 ce 58 0a
|
---|
2477 | X X \n
|
---|
2478 |
|
---|
2479 | Similarly, the âcatch-allâ regular expression â.*â does not match the
|
---|
2480 | entire line:
|
---|
2481 |
|
---|
2482 | $ printf 'a\xCEc\n' | sed 's/.*//' | od -tx1c -An
|
---|
2483 | ce 63 0a
|
---|
2484 | c \n
|
---|
2485 |
|
---|
2486 | GNU âsedâ offers the special âzâ command to clear the current pattern
|
---|
2487 | space regardless of invalid multibyte characters (i.e. it works like
|
---|
2488 | âs/.*//â but also removes invalid multibyte characters):
|
---|
2489 |
|
---|
2490 | $ printf 'a\xCEc\n' | sed 'z' | od -tx1c -An
|
---|
2491 | 0a
|
---|
2492 | \n
|
---|
2493 |
|
---|
2494 | Alternatively, force the âCâ locale to process each octet separately
|
---|
2495 | (every octet is a valid character in the âCâ locale):
|
---|
2496 |
|
---|
2497 | $ printf 'a\xCEc\n' | LC_ALL=C sed 's/.*//' | od -tx1c -An
|
---|
2498 | 0a
|
---|
2499 | \n
|
---|
2500 |
|
---|
2501 | âsedââs inability to process invalid multibyte characters can be used
|
---|
2502 | to detect such invalid sequences in a file. In the following examples,
|
---|
2503 | the ‘\xCE\xCE’ is an invalid multibyte sequence, while ‘\xCE\xA3’ is a
|
---|
2504 | valid multibyte sequence (of the Greek Sigma character).
|
---|
2505 |
|
---|
2506 | The following âsedâ program removes all valid characters using âs/.//gâ.
|
---|
2507 | Any content left in the pattern space (the invalid characters) is added
|
---|
2508 | to the hold space using the âHâ command. On the last line (â$â), the
|
---|
2509 | hold space is retrieved (âxâ), newlines are removed (âs/\n//gâ), and any
|
---|
2510 | remaining octets are printed unambiguously (âlâ). Thus, any invalid
|
---|
2511 | multibyte sequences are printed as octal values:
|
---|
2512 |
|
---|
2513 | $ printf 'ab\nc\n\xCE\xCEde\n\xCE\xA3f\n' > invalid.txt
|
---|
2514 |
|
---|
2515 | $ cat invalid.txt
|
---|
2516 | ab
|
---|
2517 | c
|
---|
2518 | ᅵᅵde
|
---|
2519 | Σf
|
---|
2520 |
|
---|
2521 | $ sed -n 's/.//g ; H ; ${x;s/\n//g;l}' invalid.txt
|
---|
2522 | \316\316$
|
---|
2523 |
|
---|
2524 | With a few more commands, âsedâ can print the exact line number
|
---|
2525 | corresponding to each invalid character (line 3). These characters can
|
---|
2526 | then be removed by forcing the âCâ locale and using octal escape
|
---|
2527 | sequences:
|
---|
2528 |
|
---|
2529 | $ sed -n 's/.//g;=;l' invalid.txt | paste - - | awk '$2!="$"'
|
---|
2530 | 3 \316\316$
|
---|
2531 |
|
---|
2532 | $ LC_ALL=C sed '3s/\o316\o316//' invalid.txt > fixed.txt
|
---|
2533 |
|
---|
2534 | 5.9.2 Upper/Lower case conversion
|
---|
2535 | ---------------------------------
|
---|
2536 |
|
---|
2537 | GNU âsedââs substitute command (âsâ) supports upper/lower case
|
---|
2538 | conversions using â\Uâ,â\Lâ codes. These conversions support multibyte
|
---|
2539 | characters:
|
---|
2540 |
|
---|
2541 | $ printf 'ABC\u03a3\n'
|
---|
2542 | ABCΣ
|
---|
2543 |
|
---|
2544 | $ printf 'ABC\u03a3\n' | sed 's/.*/\L&/'
|
---|
2545 | abcσ
|
---|
2546 |
|
---|
2547 | *Note The "s" Command::.
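For plain ASCII input the same conversions can be sketched without depending on a multibyte locale. The example assumes GNU sed; ‘\u’, which upper-cases only the next character, is described with the ‘s’ command, and the input and variable names are made up.

```shell
# \U upper-cases the rest of the replacement.
upper=$(echo 'hello world' | sed 's/.*/\U&/')
echo "$upper"    # prints: HELLO WORLD

# \u upper-cases only the next character; here, the first letter of each word.
title=$(echo 'hello world' | sed 's/\w\+/\u&/g')
echo "$title"    # prints: Hello World
```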
|
---|
2548 |
|
---|
2549 | 5.9.3 Multibyte regexp character classes
|
---|
2550 | ----------------------------------------
|
---|
2551 |
|
---|
2552 | In locales other than ‘C’, the sorting sequence is not specified, and ‘[a-d]’
|
---|
2553 | might be equivalent to â[abcd]â or to â[aBbCcDd]â, or it might fail to
|
---|
2554 | match any character, or the set of characters that it matches might even
|
---|
2555 | be erratic. To obtain the traditional interpretation of bracket
|
---|
2556 | expressions, you can use the âCâ locale by setting the âLC_ALLâ
|
---|
2557 | environment variable to the value âCâ.
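A minimal sketch of the traditional interpretation, assuming GNU sed; the input string and variable name are illustrative. With ‘LC_ALL=C’ the range covers exactly the lower-case letters ‘a’ through ‘d’:

```shell
# In the C locale, [a-d] is exactly the lower-case letters a, b, c, d.
ranged=$(printf '%s\n' 'aBcD' | LC_ALL=C sed 's/[a-d]/-/g')
echo "$ranged"    # prints: -B-D
```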
|
---|
2558 |
|
---|
2559 | # TODO: is there any real-world system/locale where 'A'
|
---|
2560 | # is replaced by '-' ?
|
---|
2561 | $ echo A | sed 's/[a-z]/-/'
|
---|
2562 | A
|
---|
2563 |
|
---|
2564 | The interpretation of character classes depends on the ‘LC_CTYPE’ locale; for example,
|
---|
2565 | â[[:alnum:]]â means the character class of numbers and letters in the
|
---|
2566 | current locale.
|
---|
2567 |
|
---|
2568 | TODO: show example of collation
|
---|
2569 |
|
---|
2570 | # TODO: this works on glibc systems, not on musl-libc/freebsd/macosx.
|
---|
2571 | $ printf 'cliché\n' | LC_ALL=fr_FR.utf8 sed 's/[[=e=]]/X/g'
|
---|
2572 | clichX
|
---|
2573 |
|
---|
2574 | ---------- Footnotes ----------
|
---|
2575 |
|
---|
2576 | (1) Some regexp edge-cases depend on the operating system and libc
|
---|
2577 | implementation. The examples shown are known to work as expected on
|
---|
2578 | GNU/Linux systems using glibc.
|
---|
2579 |
|
---|
2580 |
|
---|
2581 | File: sed.info, Node: advanced sed, Next: Examples, Prev: sed regular expressions, Up: Top
|
---|
2582 |
|
---|
2583 | 6 Advanced âsedâ: cycles and buffers
|
---|
2584 | ************************************
|
---|
2585 |
|
---|
2586 | * Menu:
|
---|
2587 |
|
---|
2588 | * Execution Cycle:: How âsedâ works
|
---|
2589 | * Hold and Pattern Buffers::
|
---|
2590 | * Multiline techniques:: Using D,G,H,N,P to process multiple lines
|
---|
2591 | * Branching and flow control::
|
---|
2592 |
|
---|
2593 |
|
---|
2594 | File: sed.info, Node: Execution Cycle, Next: Hold and Pattern Buffers, Up: advanced sed
|
---|
2595 |
|
---|
2596 | 6.1 How âsedâ Works
|
---|
2597 | ===================
|
---|
2598 |
|
---|
2599 | âsedâ maintains two data buffers: the active _pattern_ space, and the
|
---|
2600 | auxiliary _hold_ space. Both are initially empty.
|
---|
2601 |
|
---|
2602 | âsedâ operates by performing the following cycle on each line of
|
---|
2603 | input: first, âsedâ reads one line from the input stream, removes any
|
---|
2604 | trailing newline, and places it in the pattern space. Then commands are
|
---|
2605 | executed; each command can have an address associated with it: addresses
|
---|
2606 | are a kind of condition code, and a command is only executed if the
|
---|
2607 | condition is satisfied before the command is to be executed.
|
---|
2608 |
|
---|
2609 | When the end of the script is reached, unless the â-nâ option is in
|
---|
2610 | use, the contents of pattern space are printed out to the output stream,
|
---|
2611 | adding back the trailing newline if it was removed.(1) Then the next
|
---|
2612 | cycle starts for the next input line.
|
---|
2613 |
|
---|
2614 | Unless special commands (like âDâ) are used, the pattern space is
|
---|
2615 | deleted between two cycles. The hold space, on the other hand, keeps
|
---|
2616 | its data between cycles (see commands âhâ, âHâ, âxâ, âgâ, âGâ to move
|
---|
2617 | data between both buffers).
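The cycle described above can be observed with two short commands. This is a sketch assuming GNU sed and the ‘seq’ utility; the variable names are invented. The first shows ‘-n’ suppressing the automatic print at the end of each cycle, the second shows the hold space keeping its data between cycles.

```shell
# With -n nothing is printed at the end of a cycle; only 'p' prints.
second=$(seq 3 | sed -n '2p')
echo "$second"    # prints: 2

# The hold space survives across cycles: collect every line, then on the
# last line swap it into the pattern space and join the lines with commas.
joined=$(seq 3 | sed -n '1h;1!H;${x;s/\n/,/g;p}')
echo "$joined"    # prints: 1,2,3
```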
|
---|
2618 |
|
---|
2619 | ---------- Footnotes ----------
|
---|
2620 |
|
---|
2621 | (1) Actually, if âsedâ prints a line without the terminating newline,
|
---|
2622 | it will nevertheless print the missing newline as soon as more text is
|
---|
2623 | sent to the same output stream, which gives the âleast expected
|
---|
2624 | surpriseâ even though it does not make commands like âsed -n pâ exactly
|
---|
2625 | identical to âcatâ.
|
---|
2626 |
|
---|
2627 |
|
---|
2628 | File: sed.info, Node: Hold and Pattern Buffers, Next: Multiline techniques, Prev: Execution Cycle, Up: advanced sed
|
---|
2629 |
|
---|
2630 | 6.2 Hold and Pattern Buffers
|
---|
2631 | ============================
|
---|
2632 |
|
---|
2633 | TODO
|
---|
2634 |
|
---|
2635 |
|
---|
2636 | File: sed.info, Node: Multiline techniques, Next: Branching and flow control, Prev: Hold and Pattern Buffers, Up: advanced sed
|
---|
2637 |
|
---|
2638 | 6.3 Multiline techniques - using D,G,H,N,P to process multiple lines
|
---|
2639 | ====================================================================
|
---|
2640 |
|
---|
2641 | Multiple lines can be processed as one buffer using the ‘D’, ‘G’,
|
---|
2642 | ‘H’, ‘N’, and ‘P’ commands. They are similar to their lowercase
|
---|
2643 | counterparts (‘d’, ‘g’, ‘h’, ‘n’, ‘p’), except that these commands append
|
---|
2644 | or subtract data while respecting embedded newlines, which allows adding
|
---|
2645 | and removing lines from the pattern and hold spaces.
|
---|
2646 |
|
---|
2647 | They operate as follows:
|
---|
2648 | âDâ
|
---|
2649 | _deletes_ line from the pattern space until the first newline, and
|
---|
2650 | restarts the cycle.
|
---|
2651 |
|
---|
2652 | âGâ
|
---|
2653 | _appends_ line from the hold space to the pattern space, with a
|
---|
2654 | newline before it.
|
---|
2655 |
|
---|
2656 | âHâ
|
---|
2657 | _appends_ line from the pattern space to the hold space, with a
|
---|
2658 | newline before it.
|
---|
2659 |
|
---|
2660 | âNâ
|
---|
2661 | _appends_ the next line of input to the pattern space, preceded by a newline.
|
---|
2662 |
|
---|
2663 | âPâ
|
---|
2664 | _prints_ line from the pattern space until the first newline.
|
---|
2665 |
|
---|
2666 | The following example illustrates the operation of âNâ and âDâ
|
---|
2667 | commands:
|
---|
2668 |
|
---|
2669 | $ seq 6 | sed -n 'N;l;D'
|
---|
2670 | 1\n2$
|
---|
2671 | 2\n3$
|
---|
2672 | 3\n4$
|
---|
2673 | 4\n5$
|
---|
2674 | 5\n6$
|
---|
2675 |
|
---|
2676 | 1. âsedâ starts by reading the first line into the pattern space (i.e.
|
---|
2677 | â1â).
|
---|
2678 | 2. At the beginning of every cycle, the âNâ command appends a newline
|
---|
2679 | and the next line to the pattern space (i.e. â1â, â\nâ, â2â in the
|
---|
2680 | first cycle).
|
---|
2681 | 3. The âlâ command prints the content of the pattern space
|
---|
2682 | unambiguously.
|
---|
2683 | 4. The âDâ command then removes the content of pattern space up to the
|
---|
2684 | first newline (leaving â2â at the end of the first cycle).
|
---|
2685 | 5. At the next cycle the âNâ command appends a newline and the next
|
---|
2686 | input line to the pattern space (e.g. â2â, â\nâ, â3â).
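The ‘G’ and ‘P’ commands can be explored the same way. The following sketch assumes GNU sed and ‘seq’; variable names are made up. Because the hold space starts empty, ‘G’ simply appends a newline, and ‘P’ prints only the first line of whatever ‘N’ has accumulated.

```shell
# G appends a newline plus the (empty) hold space: double-spaced output.
spaced=$(seq 3 | sed G)
printf '%s\n' "$spaced"

# N joins pairs of lines; P prints only the first line of each pair.
firsts=$(seq 4 | sed -n 'N;P')
printf '%s\n' "$firsts"
# prints:
# 1
# 3

# N followed by s/// merges each pair onto one line.
pairs=$(seq 4 | sed 'N;s/\n/+/')
printf '%s\n' "$pairs"
# prints:
# 1+2
# 3+4
```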
|
---|
2687 |
|
---|
2688 | A common technique to process blocks of text such as paragraphs
|
---|
2689 | (instead of line-by-line) is using the following construct:
|
---|
2690 |
|
---|
2691 | sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
|
---|
2692 |
|
---|
2693 | 1. The first expression, â/./{H;$!d}â operates on all non-empty lines,
|
---|
2694 | and adds the current line (in the pattern space) to the hold space.
|
---|
2695 | On all lines except the last, the pattern space is deleted and the
|
---|
2696 | cycle is restarted.
|
---|
2697 |
|
---|
2698 | 2. The other expressions âxâ and âsâ are executed only on empty lines
|
---|
2699 | (i.e. paragraph separators). The âxâ command fetches the
|
---|
2700 | accumulated lines from the hold space back to the pattern space.
|
---|
2701 | The âs///â command then operates on all the text in the paragraph
|
---|
2702 | (including the embedded newlines).
|
---|
2703 |
|
---|
2704 | The following example demonstrates this technique:
|
---|
2705 | $ cat input.txt
|
---|
2706 | a a a aa aaa
|
---|
2707 | aaaa aaaa aa
|
---|
2708 | aaaa aaa aaa
|
---|
2709 |
|
---|
2710 | bbbb bbb bbb
|
---|
2711 | bb bb bbb bb
|
---|
2712 | bbbbbbbb bbb
|
---|
2713 |
|
---|
2714 | ccc ccc cccc
|
---|
2715 | cccc ccccc c
|
---|
2716 | cc cc cc cc
|
---|
2717 |
|
---|
2718 | $ sed '/./{H;$!d} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt
|
---|
2719 |
|
---|
2720 | START-->
|
---|
2721 | a a a aa aaa
|
---|
2722 | aaaa aaaa aa
|
---|
2723 | aaaa aaa aaa
|
---|
2724 | <--END
|
---|
2725 |
|
---|
2726 | START-->
|
---|
2727 | bbbb bbb bbb
|
---|
2728 | bb bb bbb bb
|
---|
2729 | bbbbbbbb bbb
|
---|
2730 | <--END
|
---|
2731 |
|
---|
2732 | START-->
|
---|
2733 | ccc ccc cccc
|
---|
2734 | cccc ccccc c
|
---|
2735 | cc cc cc cc
|
---|
2736 | <--END
|
---|
2737 |
|
---|
2738 | For more annotated examples, *note Text search across multiple
|
---|
2739 | lines:: and *note Line length adjustment::.
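As one more hedged sketch of the paragraph construct (assuming GNU sed; the input is made up), each paragraph can be joined into a single line. The extra ‘s/^\n//’ removes the leading newline that ‘H’ puts in front of the first accumulated line.

```shell
# Collect each paragraph in the hold space, then flatten it to one line.
joined_paras=$(printf '%s\n' 'a a' 'aa' '' 'b b' 'bb' |
  sed '/./{H;$!d} ; x ; s/^\n// ; s/\n/ /g')
printf '%s\n' "$joined_paras"
# prints:
# a a aa
# b b bb
```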
|
---|
2740 |
|
---|
2741 |
|
---|
2742 | File: sed.info, Node: Branching and flow control, Prev: Multiline techniques, Up: advanced sed
|
---|
2743 |
|
---|
2744 | 6.4 Branching and Flow Control
|
---|
2745 | ==============================
|
---|
2746 |
|
---|
2747 | The branching commands âbâ, âtâ, and âTâ enable changing the flow of
|
---|
2748 | âsedâ programs.
|
---|
2749 |
|
---|
2750 | By default, âsedâ reads an input line into the pattern buffer, then
|
---|
2751 | continues to process all commands in order. Commands without
|
---|
2752 | addresses affect all lines. Commands with addresses affect only
|
---|
2753 | matching lines. *Note Execution Cycle:: and *note Addresses overview::.
|
---|
2754 |
|
---|
2755 | âsedâ does not support a typical âif/thenâ construct. Instead, some
|
---|
2756 | commands can be used as conditionals or to change the default flow
|
---|
2757 | control:
|
---|
2758 |
|
---|
2759 | âdâ
|
---|
2760 | delete (clear) the current pattern space, and restart the program
|
---|
2761 | cycle without processing the rest of the commands and without
|
---|
2762 | printing the pattern space.
|
---|
2763 |
|
---|
2764 | âDâ
|
---|
2765 | delete the contents of the pattern space _up to the first newline_,
|
---|
2766 | and restart the program cycle without processing the rest of the
|
---|
2767 | commands and without printing the pattern space.
|
---|
2768 |
|
---|
2769 | â[addr]Xâ
|
---|
2770 | â[addr]{ X ; X ; X }â
|
---|
2771 | â/regexp/Xâ
|
---|
2772 | â/regexp/{ X ; X ; X }â
|
---|
2773 | Addresses and regular expressions can be used as an âif/thenâ
|
---|
2774 | conditional: If [ADDR] matches the current pattern space, execute
|
---|
2775 | the command(s). For example: The command â/^#/dâ means: _if_ the
|
---|
2776 | current pattern matches the regular expression â^#â (a line
|
---|
2777 | starting with a hash), _then_ execute the âdâ command: delete the
|
---|
2778 | line without printing it, and restart the program cycle
|
---|
2779 | immediately.
|
---|
2780 |
|
---|
2781 | âbâ
|
---|
2782 | branch unconditionally (that is: always jump to a label, skipping
|
---|
2783 | or repeating other commands, without restarting a new cycle).
|
---|
2784 | Combined with an address, the branch can be conditionally executed
|
---|
2785 | on matched lines.
|
---|
2786 |
|
---|
2787 | âtâ
|
---|
2788 | branch conditionally (that is: jump to a label) _only if_ a âs///â
|
---|
2789 | command has succeeded since the last input line was read or another
|
---|
2790 | conditional branch was taken.
|
---|
2791 |
|
---|
2792 | âTâ
|
---|
2793 | similar but opposite to the ‘t’ command: branch only if there have
|
---|
2794 | been _no_ successful substitutions since the last input line was
|
---|
2795 | read.
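The conditionals listed above can be sketched as follows, assuming GNU sed (‘T’ is a GNU extension); the inputs and variable names are illustrative. Labels are given in separate ‘-e’ fragments so that nothing follows a label on the same line.

```shell
# Address as "if": delete comment lines, keep everything else.
code=$(printf '%s\n' '# note' 'x=1' | sed '/^#/d')
echo "$code"       # prints: x=1

# 't' as a loop: repeat the substitution until it no longer succeeds.
grouped=$(echo 1234567 |
  sed -E -e ':a' -e 's/([0-9])([0-9]{3})($|,)/\1,\2\3/' -e 'ta')
echo "$grouped"    # prints: 1,234,567

# 'T' as "unless": skip the rest of the script when s/// did not match.
marked=$(printf '%s\n' 'key=1' 'plain' |
  sed -e 's/=/: /' -e 'T' -e 's/$/ (ok)/')
printf '%s\n' "$marked"
# prints:
# key: 1 (ok)
# plain
```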
|
---|
2796 |
|
---|
2797 | The following two âsedâ programs are equivalent. The first
|
---|
2798 | (contrived) example uses the âbâ command to skip the âs///â command on
|
---|
2799 | lines containing â1â. The second example uses an address with negation
|
---|
2800 | (â!â) to perform substitution only on desired lines. The ây///â command
|
---|
2801 | is still executed on all lines:
|
---|
2802 |
|
---|
2803 | $ printf '%s\n' a1 a2 a3 | sed -E '/1/bx ; s/a/z/ ; :x ; y/123/456/'
|
---|
2804 | a4
|
---|
2805 | z5
|
---|
2806 | z6
|
---|
2807 |
|
---|
2808 | $ printf '%s\n' a1 a2 a3 | sed -E '/1/!s/a/z/ ; y/123/456/'
|
---|
2809 | a4
|
---|
2810 | z5
|
---|
2811 | z6
|
---|
2812 |
|
---|
2813 | 6.4.1 Branching and Cycles
|
---|
2814 | --------------------------
|
---|
2815 |
|
---|
2816 | The ‘b’, ‘t’, and ‘T’ commands can be followed by a label (typically a
|
---|
2817 | single letter). Labels are defined with a colon followed by one or more
|
---|
2818 | letters (e.g. ‘:x’). If the label is omitted, the branch commands
|
---|
2819 | restart the cycle. Note the difference between branching to a label and
|
---|
2820 | restarting the cycle: when a cycle is restarted, âsedâ first prints the
|
---|
2821 | current content of the pattern space, then reads the next input line
|
---|
2822 | into the pattern space; jumping to a label (even if it is at the
|
---|
2823 | beginning of the program) does not print the pattern space and does not
|
---|
2824 | read the next input line.
|
---|
2825 |
|
---|
2826 | The following program is a no-op. The âbâ command (the only command
|
---|
2827 | in the program) does not have a label, and thus simply restarts the
|
---|
2828 | cycle. On each cycle, the pattern space is printed and the next input
|
---|
2829 | line is read:
|
---|
2830 |
|
---|
2831 | $ seq 3 | sed b
|
---|
2832 | 1
|
---|
2833 | 2
|
---|
2834 | 3
|
---|
2835 |
|
---|
2836 | The following example is an infinite loop: it doesn’t terminate and
|
---|
2837 | doesn’t print anything. The ‘b’ command jumps to the ‘x’ label, and a
|
---|
2838 | new cycle is never started:
|
---|
2839 |
|
---|
2840 | $ seq 3 | sed ':x ; bx'
|
---|
2841 |
|
---|
2842 | # The above command requires GNU sed (which supports additional
|
---|
2843 | # commands following a label, without a newline). A portable equivalent:
|
---|
2844 | # sed -e ':x' -e bx
|
---|
2845 |
|
---|
2846 | Branching is often complemented with the ânâ or âNâ commands: both
|
---|
2847 | commands read the next input line into the pattern space without waiting
|
---|
2848 | for the cycle to restart. Before reading the next input line, ânâ
|
---|
2849 | prints the current pattern space then empties it, while âNâ appends a
|
---|
2850 | newline and the next input line to the pattern space.
|
---|
2851 |
|
---|
2852 | Consider the following two examples:
|
---|
2853 |
|
---|
2854 | $ seq 3 | sed ':x ; n ; bx'
|
---|
2855 | 1
|
---|
2856 | 2
|
---|
2857 | 3
|
---|
2858 |
|
---|
2859 | $ seq 3 | sed ':x ; N ; bx'
|
---|
2860 | 1
|
---|
2861 | 2
|
---|
2862 | 3
|
---|
2863 |
|
---|
2864 | • Neither example loops forever, despite never starting a new cycle.
|
---|
2865 |
|
---|
2866 | • In the first example, the ‘n’ command first prints the content of
|
---|
2867 | the pattern space, empties the pattern space then reads the next
|
---|
2868 | input line.
|
---|
2869 |
|
---|
2870 | • In the second example, the ‘N’ command appends the next input line
|
---|
2871 | to the pattern space (with a newline). Lines are accumulated in
|
---|
2872 | the pattern space until there are no more input lines to read, then
|
---|
2873 | the ‘N’ command terminates the ‘sed’ program. When the program
|
---|
2874 | terminates, the end-of-cycle actions are performed, and the entire
|
---|
2875 | pattern space is printed.
|
---|
2876 |
|
---|
2877 | • The second example requires GNU ‘sed’, because it uses the
|
---|
2878 | non-POSIX-standard behavior of ‘N’. See the “‘N’ command on the
|
---|
2879 | last line” paragraph in *note Reporting Bugs::.
|
---|
2880 |
|
---|
2881 | • To further examine the difference between the two examples, try the
|
---|
2882 | following commands:
|
---|
2883 | printf '%s\n' aa bb cc dd | sed ':x ; n ; = ; bx'
|
---|
2884 | printf '%s\n' aa bb cc dd | sed ':x ; N ; = ; bx'
|
---|
2885 | printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx'
|
---|
2886 | printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx'
|
---|
2887 |
|
---|
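The contrast between ‘n’ and ‘N’ described above can be sketched as
follows. This is an illustrative variant rather than the manual's own
commands: the ‘N’ loop is written with ‘$!N’ and ‘t’ (an assumption made
so that it stops on the last line without relying on GNU-specific
behavior), and the variable names are hypothetical.

```shell
# 'n' prints the current line before loading the next one, so the
# pattern space never holds a newline and 's/\n/***/' never matches.
out_n=$(printf '%s\n' aa bb cc dd |
  sed -e ':x' -e 'n' -e 's/\n/***/' -e 'bx')

# 'N' keeps appending lines, so every embedded newline gets replaced.
# The loop ends once '$!N' is skipped on the last line and the
# substitution no longer succeeds.
out_N=$(printf '%s\n' aa bb cc dd |
  sed -e ':x' -e '$!N' -e 's/\n/***/' -e 'tx')

printf '%s\n' "$out_n" "$out_N"
```

The first pipeline should print the four lines unchanged; the second
should print them joined into a single line with ‘***’ separators.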
2888 | 6.4.2 Branching example: joining lines
|
---|
2889 | --------------------------------------
|
---|
2890 |
|
---|
2891 | As a real-world example of using branching, consider the case of
|
---|
2892 | quoted-printable (https://en.wikipedia.org/wiki/Quoted-printable) files,
|
---|
2893 | typically used to encode email messages. In these files long lines are
|
---|
2894 | split and marked with a “soft line break” consisting of a single ‘=’
|
---|
2895 | character at the end of the line:
|
---|
2896 |
|
---|
2897 | $ cat jaques.txt
|
---|
2898 | All the wor=
|
---|
2899 | ld's a stag=
|
---|
2900 | e,
|
---|
2901 | And all the=
|
---|
2902 | men and wo=
|
---|
2903 | men merely =
|
---|
2904 | players:
|
---|
2905 | They have t=
|
---|
2906 | heir exits =
|
---|
2907 | and their e=
|
---|
2908 | ntrances;
|
---|
2909 | And one man=
|
---|
2910 | in his tim=
|
---|
2911 | e plays man=
|
---|
2912 | y parts.
|
---|
2913 |
|
---|
2914 | The following program uses an address match ‘/=$/’ as a conditional:
|
---|
2915 | if the current pattern space ends with a ‘=’, it reads the next input
|
---|
2916 | line using ‘N’, deletes each ‘=’ that is followed by a newline (along
|
---|
2917 | with that newline), and unconditionally branches (‘b’) to the beginning
|
---|
2918 | of the program without starting a new cycle. If the pattern space does
|
---|
2919 | not end with ‘=’, the default action is performed: the pattern space is
|
---|
2920 | printed and a new cycle is started:
|
---|
2921 |
|
---|
2922 | $ sed ':x ; /=$/ { N ; s/=\n//g ; bx }' jaques.txt
|
---|
2923 | All the world's a stage,
|
---|
2924 | And all the men and women merely players:
|
---|
2925 | They have their exits and their entrances;
|
---|
2926 | And one man in his time plays many parts.
|
---|
2927 |
|
---|
2928 | Here’s an alternative program with a slightly different approach: on
|
---|
2929 | all lines except the last, ‘N’ appends the next line to the pattern space. A
|
---|
2930 | substitution command then removes soft line breaks (‘=’ at the end of a
|
---|
2931 | line, i.e. followed by a newline) by replacing them with an empty
|
---|
2932 | string. _If_ the substitution was successful (meaning the pattern space
|
---|
2933 | contained a line which should be joined), the conditional branch command
|
---|
2934 | ‘t’ jumps to the beginning of the program without completing or
|
---|
2935 | restarting the cycle. If the substitution failed (meaning there were no
|
---|
2936 | soft line breaks), the ‘t’ command will _not_ branch. Then, ‘P’ will
|
---|
2937 | print the pattern space content until the first newline, and ‘D’ will
|
---|
2938 | delete the pattern space content until the first newline. (To learn
|
---|
2939 | more about the ‘N’, ‘P’ and ‘D’ commands, *note Multiline techniques::.)
|
---|
2940 |
|
---|
2941 | $ sed ':x ; $!N ; s/=\n// ; tx ; P ; D' jaques.txt
|
---|
2942 | All the world's a stage,
|
---|
2943 | And all the men and women merely players:
|
---|
2944 | They have their exits and their entrances;
|
---|
2945 | And one man in his time plays many parts.
|
---|
2946 |
|
---|
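The two joining programs above can be compared side by side. The sketch
below rewrites both with one ‘-e’ fragment per command (an assumption made
for portability; the one-line forms shown above need GNU ‘sed’), and runs
them on a short hypothetical sample instead of ‘jaques.txt’. The function
names are illustrative only.

```shell
# Hypothetical sample with soft line breaks ('=' at end of line).
sample=$(printf '%s\n' 'All the wor=' 'ld is a stag=' 'e,' \
                       'And all the=' ' men and women')

# Address match plus unconditional branch ('b').
join_b() {
  sed -e ':x' -e '/=$/{' -e 'N' -e 's/=\n//' -e 'bx' -e '}'
}

# Always read ahead, then branch only if the substitution worked ('t').
join_t() {
  sed -e ':x' -e '$!N' -e 's/=\n//' -e 'tx' -e 'P' -e 'D'
}

printf '%s\n' "$sample" | join_b
printf '%s\n' "$sample" | join_t
```

Both functions should print the same two joined lines for this input.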
2947 | For more line-joining examples *note Joining lines::.
|
---|
2948 |
|
---|
2949 |
|
---|
2950 | File: sed.info, Node: Examples, Next: Limitations, Prev: advanced sed, Up: Top
|
---|
2951 |
|
---|
2952 | 7 Some Sample Scripts
|
---|
2953 | *********************
|
---|
2954 |
|
---|
2955 | Here are some ‘sed’ scripts to guide you in the art of mastering ‘sed’.
|
---|
2956 |
|
---|
2957 | * Menu:
|
---|
2958 |
|
---|
2959 |
|
---|
2960 | Useful one-liners:
|
---|
2961 | * Joining lines::
|
---|
2962 |
|
---|
2963 | Some exotic examples:
|
---|
2964 | * Centering lines::
|
---|
2965 | * Increment a number::
|
---|
2966 | * Rename files to lower case::
|
---|
2967 | * Print bash environment::
|
---|
2968 | * Reverse chars of lines::
|
---|
2969 | * Text search across multiple lines::
|
---|
2970 | * Line length adjustment::
|
---|
2971 | * Adding a header to multiple files::
|
---|
2972 |
|
---|
2973 | Emulating standard utilities:
|
---|
2974 | * tac:: Reverse lines of files
|
---|
2975 | * cat -n:: Numbering lines
|
---|
2976 | * cat -b:: Numbering non-blank lines
|
---|
2977 | * wc -c:: Counting chars
|
---|
2978 | * wc -w:: Counting words
|
---|
2979 | * wc -l:: Counting lines
|
---|
2980 | * head:: Printing the first lines
|
---|
2981 | * tail:: Printing the last lines
|
---|
2982 | * uniq:: Make duplicate lines unique
|
---|
2983 | * uniq -d:: Print duplicated lines of input
|
---|
2984 | * uniq -u:: Remove all duplicated lines
|
---|
2985 | * cat -s:: Squeezing blank lines
|
---|
2986 |
|
---|
2987 |
|
---|
2988 | File: sed.info, Node: Joining lines, Next: Centering lines, Up: Examples
|
---|
2989 |
|
---|
2990 | 7.1 Joining lines
|
---|
2991 | =================
|
---|
2992 |
|
---|
2993 | This section uses ‘N’, ‘D’ and ‘P’ commands to process multiple lines,
|
---|
2994 | and the ‘b’ and ‘t’ commands for branching. *Note Multiline
|
---|
2995 | techniques:: and *note Branching and flow control::.
|
---|
2996 |
|
---|
2997 | Join specific lines (e.g. if lines 2 and 3 need to be joined):
|
---|
2998 |
|
---|
2999 | $ cat lines.txt
|
---|
3000 | hello
|
---|
3001 | hel
|
---|
3002 | lo
|
---|
3003 | hello
|
---|
3004 |
|
---|
3005 | $ sed '2{N;s/\n//;}' lines.txt
|
---|
3006 | hello
|
---|
3007 | hello
|
---|
3008 | hello
|
---|
3009 |
|
---|
3010 | Join backslash-continued lines:
|
---|
3011 |
|
---|
3012 | $ cat 1.txt
|
---|
3013 | this \
|
---|
3014 | is \
|
---|
3015 | a \
|
---|
3016 | long \
|
---|
3017 | line
|
---|
3018 | and another \
|
---|
3019 | line
|
---|
3020 |
|
---|
3021 | $ sed -e ':x /\\$/ { N; s/\\\n//g ; bx }' 1.txt
|
---|
3022 | this is a long line
|
---|
3023 | and another line
|
---|
3024 |
|
---|
3025 |
|
---|
3026 | # Note: the above requires GNU sed.
|
---|
3027 | # Other seds need a newline after the ':x' label and after the 'bx' command.
|
---|
3028 |
|
---|
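A portable spelling of the backslash-joining program might look like the
sketch below. It assumes that placing the label, the opening brace, each
command and the closing brace in separate ‘-e’ fragments is enough to
satisfy non-GNU implementations; the function name is hypothetical.

```shell
# Join lines that end with a backslash to the line that follows.
join_continued() {
  sed -e ':x' -e '/\\$/{' -e 'N' -e 's/\\\n//' -e 'bx' -e '}'
}

printf '%s\n' 'this \' 'is \' 'a line' 'and another \' 'line' |
  join_continued
```

For this input the function should print ‘this is a line’ followed by
‘and another line’.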
3029 | Join lines that start with whitespace (e.g. SMTP headers):
|
---|
3030 |
|
---|
3031 | $ cat 2.txt
|
---|
3032 | Subject: Hello
|
---|
3033 | World
|
---|
3034 | Content-Type: multipart/alternative;
|
---|
3035 | boundary=94eb2c190cc6370f06054535da6a
|
---|
3036 | Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
|
---|
3037 | Authentication-Results: mx.gnu.org;
|
---|
3038 | dkim=pass header.i=@gnu.org;
|
---|
3039 | spf=pass
|
---|
3040 | Message-ID: <abcdef@gnu.org>
|
---|
3041 | From: John Doe <jdoe@gnu.org>
|
---|
3042 | To: Jane Smith <jsmith@gnu.org>
|
---|
3043 |
|
---|
3044 | $ sed -E ':a ; $!N ; s/\n\s+/ / ; ta ; P ; D' 2.txt
|
---|
3045 | Subject: Hello World
|
---|
3046 | Content-Type: multipart/alternative; boundary=94eb2c190cc6370f06054535da6a
|
---|
3047 | Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
|
---|
3048 | Authentication-Results: mx.gnu.org; dkim=pass header.i=@gnu.org; spf=pass
|
---|
3049 | Message-ID: <abcdef@gnu.org>
|
---|
3050 | From: John Doe <jdoe@gnu.org>
|
---|
3051 | To: Jane Smith <jsmith@gnu.org>
|
---|
3052 |
|
---|
3053 | # A portable (non-gnu) variation:
|
---|
3054 | # sed -e :a -e '$!N;s/\n */ /;ta' -e 'P;D'
|
---|
3055 |
|
---|
3056 |
|
---|
3057 | File: sed.info, Node: Centering lines, Next: Increment a number, Prev: Joining lines, Up: Examples
|
---|
3058 |
|
---|
3059 | 7.2 Centering Lines
|
---|
3060 | ===================
|
---|
3061 |
|
---|
3062 | This script centers all lines of a file within an 80-column width. To
|
---|
3063 | change that width, the number in ‘\{...\}’ must be replaced, and the
|
---|
3064 | number of added spaces must also be changed.
|
---|
3065 |
|
---|
3066 | Note how the buffer commands are used to separate parts in the
|
---|
3067 | regular expressions to be matched; this is a common technique.
|
---|
3068 |
|
---|
3069 | #!/usr/bin/sed -f
|
---|
3070 |
|
---|
3071 | # Put 80 spaces in the buffer
|
---|
3072 | 1 {
|
---|
3073 | x
|
---|
3074 | s/^$/          /
|
---|
3075 | s/^.*$/&&&&&&&&/
|
---|
3076 | x
|
---|
3077 | }
|
---|
3078 |
|
---|
3079 | # delete leading and trailing spaces
|
---|
3080 | y/<TAB>/ /
|
---|
3081 | s/^ *//
|
---|
3082 | s/ *$//
|
---|
3083 |
|
---|
3084 | # add a newline and 80 spaces to end of line
|
---|
3085 | G
|
---|
3086 |
|
---|
3087 | # keep first 81 chars (80 + a newline)
|
---|
3088 | s/^\(.\{81\}\).*$/\1/
|
---|
3089 |
|
---|
3090 | # \2 matches half of the spaces, which are moved to the beginning
|
---|
3091 | s/^\(.*\)\n\(.*\)\2/\2\1/
|
---|
3092 |
|
---|
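To make the remark about changing the width concrete, here is a sketch of
the same script adapted to a 20-column width. The numbers are assumptions
derived from the 80-column version: the hold space is primed with 5 spaces
repeated 4 times (20 in total), and the ‘\{81\}’ bound becomes ‘\{21\}’.
The tab-to-space ‘y’ step is left out for brevity, and the function name
is hypothetical.

```shell
center20() {
  sed -e '1{' -e 'x' -e 's/^$/     /' -e 's/^.*$/&&&&/' -e 'x' -e '}' \
      -e 's/^ *//' -e 's/ *$//' \
      -e 'G' \
      -e 's/^\(.\{21\}\).*$/\1/' \
      -e 's/^\(.*\)\n\(.*\)\2/\2\1/'
}

printf '%s\n' 'hello' 'ab' | center20
```

When the leftover padding is odd, as with ‘hello’, one unpaired space
stays at the end of the line, which is also how the 80-column script
behaves.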
3093 |
|
---|
3094 | File: sed.info, Node: Increment a number, Next: Rename files to lower case, Prev: Centering lines, Up: Examples
|
---|
3095 |
|
---|
3096 | 7.3 Increment a Number
|
---|
3097 | ======================
|
---|
3098 |
|
---|
3099 | This script is one of a few that demonstrate how to do arithmetic in
|
---|
3100 | ‘sed’. This is indeed possible,(1) but must be done manually.
|
---|
3101 |
|
---|
3102 | To increment a number you just add 1 to the last digit, replacing it with
|
---|
3103 | the following digit. There is one exception: when the digit is a nine,
|
---|
3104 | the preceding digits must also be incremented until you reach one that is not a
|
---|
3105 | nine.
|
---|
3106 |
|
---|
3107 | This solution by Bruno Haible is clever because it
|
---|
3108 | uses a single buffer; if you don’t have this limitation, the algorithm
|
---|
3109 | used in *note Numbering lines: cat -n, is faster. It works by replacing
|
---|
3110 | trailing nines with an underscore, then using multiple ‘s’ commands to
|
---|
3111 | increment the last digit, and then again substituting underscores with
|
---|
3112 | zeros.
|
---|
3113 |
|
---|
3114 | #!/usr/bin/sed -f
|
---|
3115 |
|
---|
3116 | /[^0-9]/ d
|
---|
3117 |
|
---|
3118 | # replace all trailing 9s by _ (any other character except digits, could
|
---|
3119 | # be used)
|
---|
3120 | :d
|
---|
3121 | s/9\(_*\)$/_\1/
|
---|
3122 | td
|
---|
3123 |
|
---|
3124 | # incr last digit only. The first line adds a most-significant
|
---|
3125 | # digit of 1 if we have to add a digit.
|
---|
3126 |
|
---|
3127 | s/^\(_*\)$/1\1/; tn
|
---|
3128 | s/8\(_*\)$/9\1/; tn
|
---|
3129 | s/7\(_*\)$/8\1/; tn
|
---|
3130 | s/6\(_*\)$/7\1/; tn
|
---|
3131 | s/5\(_*\)$/6\1/; tn
|
---|
3132 | s/4\(_*\)$/5\1/; tn
|
---|
3133 | s/3\(_*\)$/4\1/; tn
|
---|
3134 | s/2\(_*\)$/3\1/; tn
|
---|
3135 | s/1\(_*\)$/2\1/; tn
|
---|
3136 | s/0\(_*\)$/1\1/; tn
|
---|
3137 |
|
---|
3138 | :n
|
---|
3139 | y/_/0/
|
---|
3140 |
|
---|
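The script above can be exercised on a few values to watch the carry
handling at work. In this sketch the same commands are passed as ‘-e’
fragments inside a shell function; the function name and the sample
numbers are illustrative only.

```shell
increment() {
  sed -e '/[^0-9]/d' \
      -e ':d' -e 's/9\(_*\)$/_\1/' -e 'td' \
      -e 's/^\(_*\)$/1\1/' -e 'tn' \
      -e 's/8\(_*\)$/9\1/' -e 'tn' \
      -e 's/7\(_*\)$/8\1/' -e 'tn' \
      -e 's/6\(_*\)$/7\1/' -e 'tn' \
      -e 's/5\(_*\)$/6\1/' -e 'tn' \
      -e 's/4\(_*\)$/5\1/' -e 'tn' \
      -e 's/3\(_*\)$/4\1/' -e 'tn' \
      -e 's/2\(_*\)$/3\1/' -e 'tn' \
      -e 's/1\(_*\)$/2\1/' -e 'tn' \
      -e 's/0\(_*\)$/1\1/' -e 'tn' \
      -e ':n' -e 'y/_/0/'
}

# 199 becomes 1__ after the first loop, then 2__, then 200.
printf '%s\n' 41 199 999 | increment
```

The three inputs should come out as 42, 200 and 1000.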
3141 | ---------- Footnotes ----------
|
---|
3142 |
|
---|
3143 | (1) ‘sed’ guru Greg Ubben wrote an implementation of the ‘dc’ RPN
|
---|
3144 | calculator! It is distributed together with sed.
|
---|
3145 |
|
---|
3146 |
|
---|
3147 | File: sed.info, Node: Rename files to lower case, Next: Print bash environment, Prev: Increment a number, Up: Examples
|
---|
3148 |
|
---|
3149 | 7.4 Rename Files to Lower Case
|
---|
3150 | ==============================
|
---|
3151 |
|
---|
3152 | This is a pretty strange use of ‘sed’. We transform text into
|
---|
3153 | shell commands, then just feed them to the shell. Don’t worry,
|
---|
3154 | even worse hacks are done when using ‘sed’; I have seen a script
|
---|
3155 | converting the output of ‘date’ into a ‘bc’ program!
|
---|
3156 |
|
---|
3157 | The main body of this is the ‘sed’ script, which remaps the name from
|
---|
3158 | lower to upper (or vice-versa) and even checks whether the remapped name
|
---|
3159 | is the same as the original name. Note how the script is parameterized
|
---|
3160 | using shell variables and proper quoting.
|
---|
3161 |
|
---|
3162 | #! /bin/sh
|
---|
3163 | # rename files to lower/upper case...
|
---|
3164 | #
|
---|
3165 | # usage:
|
---|
3166 | # move-to-lower *
|
---|
3167 | # move-to-upper *
|
---|
3168 | # or
|
---|
3169 | # move-to-lower -R .
|
---|
3170 | # move-to-upper -R .
|
---|
3171 | #
|
---|
3172 |
|
---|
3173 | help()
|
---|
3174 | {
|
---|
3175 | cat << eof
|
---|
3176 | Usage: $0 [-n] [-R] [-h] files...
|
---|
3177 |
|
---|
3178 | -n do nothing, only see what would be done
|
---|
3179 | -R recursive (use find)
|
---|
3180 | -h this message
|
---|
3181 | files files to remap to lower case
|
---|
3182 |
|
---|
3183 | Examples:
|
---|
3184 | $0 -n * (see if everything is ok, then...)
|
---|
3185 | $0 *
|
---|
3186 |
|
---|
3187 | $0 -R .
|
---|
3188 |
|
---|
3189 | eof
|
---|
3190 | }
|
---|
3191 |
|
---|
3192 | apply_cmd='sh'
|
---|
3193 | finder='echo "$@" | tr " " "\n"'
|
---|
3194 | files_only=
|
---|
3195 |
|
---|
3196 | while :
|
---|
3197 | do
|
---|
3198 | case "$1" in
|
---|
3199 | -n) apply_cmd='cat' ;;
|
---|
3200 | -R) finder='find "$@" -type f';;
|
---|
3201 | -h) help ; exit 1 ;;
|
---|
3202 | *) break ;;
|
---|
3203 | esac
|
---|
3204 | shift
|
---|
3205 | done
|
---|
3206 |
|
---|
3207 | if [ -z "$1" ]; then
|
---|
3208 | echo Usage: $0 [-h] [-n] [-R] files...
|
---|
3209 | exit 1
|
---|
3210 | fi
|
---|
3211 |
|
---|
3212 | LOWER='abcdefghijklmnopqrstuvwxyz'
|
---|
3213 | UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
|
---|
3214 |
|
---|
3215 | case `basename $0` in
|
---|
3216 | *upper*) TO=$UPPER; FROM=$LOWER ;;
|
---|
3217 | *) FROM=$UPPER; TO=$LOWER ;;
|
---|
3218 | esac
|
---|
3219 |
|
---|
3220 | eval $finder | sed -n '
|
---|
3221 |
|
---|
3222 | # remove all trailing slashes
|
---|
3223 | s/\/*$//
|
---|
3224 |
|
---|
3225 | # add ./ if there is no path, only a filename
|
---|
3226 | /\//! s/^/.\//
|
---|
3227 |
|
---|
3228 | # save path+filename
|
---|
3229 | h
|
---|
3230 |
|
---|
3231 | # remove path
|
---|
3232 | s/.*\///
|
---|
3233 |
|
---|
3234 | # do conversion only on filename
|
---|
3235 | y/'$FROM'/'$TO'/
|
---|
3236 |
|
---|
3237 | # now line contains original path+file, while
|
---|
3238 | # hold space contains the new filename
|
---|
3239 | x
|
---|
3240 |
|
---|
3241 | # add converted file name to line, which now contains
|
---|
3242 | # path/file-name\nconverted-file-name
|
---|
3243 | G
|
---|
3244 |
|
---|
3245 | # check if converted file name is equal to original file name,
|
---|
3246 | # if it is, do not print anything
|
---|
3247 | /^.*\/\(.*\)\n\1/b
|
---|
3248 |
|
---|
3249 | # escape special characters for the shell
|
---|
3250 | s/["$`\\]/\\&/g
|
---|
3251 |
|
---|
3252 | # now, transform path/fromfile\n, into
|
---|
3253 | # mv path/fromfile path/tofile and print it
|
---|
3254 | s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
|
---|
3255 |
|
---|
3256 | ' | $apply_cmd
|
---|
3257 |
|
---|
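To see what the embedded ‘sed’ program produces without renaming
anything, its commands can be run on a few sample names. The sketch below
assumes the lower-casing direction and feeds hypothetical file names on
standard input; the function name is illustrative, and the output is only
printed, never executed.

```shell
FROM='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
TO='abcdefghijklmnopqrstuvwxyz'

# Turn each path on stdin into an 'mv' command, skipping names that
# are already in the target case.
to_mv_commands() {
  sed -n -e 's/\/*$//' \
         -e '/\//!s/^/.\//' \
         -e 'h' \
         -e 's/.*\///' \
         -e "y/$FROM/$TO/" \
         -e 'x' -e 'G' \
         -e '/^.*\/\(.*\)\n\1/b' \
         -e 's/["$`\\]/\\&/g' \
         -e 's/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p'
}

printf '%s\n' 'dir/README.TXT' 'notes.txt' 'Makefile' | to_mv_commands
```

Only the two names containing upper-case letters should yield an ‘mv’
line; ‘notes.txt’ is already lower case and is skipped.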
3258 |
|
---|
3259 | File: sed.info, Node: Print bash environment, Next: Reverse chars of lines, Prev: Rename files to lower case, Up: Examples
|
---|
3260 |
|
---|
3261 | 7.5 Print ‘bash’ Environment
|
---|
3262 | ============================
|
---|
3263 |
|
---|
3264 | This script strips the definition of the shell functions from the output
|
---|
3265 | of the ‘set’ Bourne-shell command.
|
---|
3266 |
|
---|
3267 | #!/bin/sh
|
---|
3268 |
|
---|
3269 | set | sed -n '
|
---|
3270 | :x
|
---|
3271 |
|
---|
3272 | # if no occurrence of "=()" print and load next line
|
---|
3273 | /=()/! { p; b; }
|
---|
3274 | / () $/! { p; b; }
|
---|
3275 |
|
---|
3276 | # possible start of functions section
|
---|
3277 | # save the line in case this is a var like FOO="() "
|
---|
3278 | h
|
---|
3279 |
|
---|
3280 | # if the next line has a brace, we quit because
|
---|
3281 | # nothing comes after functions
|
---|
3282 | n
|
---|
3283 | /^{/ q
|
---|
3284 |
|
---|
3285 | # print the old line
|
---|
3286 | x; p
|
---|
3287 |
|
---|
3288 | # work on the new line now
|
---|
3289 | x; bx
|
---|
3290 | '
|
---|
3291 |
|
---|
3292 |
|
---|
3293 | File: sed.info, Node: Reverse chars of lines, Next: Text search across multiple lines, Prev: Print bash environment, Up: Examples
|
---|
3294 |
|
---|
3295 | 7.6 Reverse Characters of Lines
|
---|
3296 | ===============================
|
---|
3297 |
|
---|
3298 | This script can be used to reverse the position of characters in lines.
|
---|
3299 | The technique moves two characters at a time, hence it is faster than
|
---|
3300 | more intuitive implementations.
|
---|
3301 |
|
---|
3302 | Note the ‘tx’ command before the definition of the label. This is
|
---|
3303 | often needed to reset the flag that is tested by the ‘t’ command.
|
---|
3304 |
|
---|
3305 | Imaginative readers will find uses for this script. An example is
|
---|
3306 | reversing the output of ‘banner’.(1)
|
---|
3307 |
|
---|
3308 | #!/usr/bin/sed -f
|
---|
3309 |
|
---|
3310 | /../! b
|
---|
3311 |
|
---|
3312 | # Reverse a line. Begin embedding the line between two newlines
|
---|
3313 | s/^.*$/\
|
---|
3314 | &\
|
---|
3315 | /
|
---|
3316 |
|
---|
3317 | # Swap the first and last characters between the markers. The regexp matches until
|
---|
3318 | # there are zero or one characters between the markers
|
---|
3319 | tx
|
---|
3320 | :x
|
---|
3321 | s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
|
---|
3322 | tx
|
---|
3323 |
|
---|
3324 | # Remove the newline markers
|
---|
3325 | s/\n//g
|
---|
3326 |
|
---|
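A quick way to try the script is to wrap its commands in a shell
function, as sketched below. The function name is hypothetical; the
commands are the ones shown above, passed as ‘-e’ fragments, with the
newline markers written as backslash-newline inside the quoted
replacement.

```shell
reverse_chars() {
  sed -e '/../!b' \
      -e 's/^.*$/\
&\
/' \
      -e 'tx' -e ':x' \
      -e 's/\(\n.\)\(.*\)\(.\n\)/\3\2\1/' \
      -e 'tx' \
      -e 's/\n//g'
}

# For "hello" the pattern space goes through
#   \nhello\n  ->  o\nell\nh  ->  ol\nl\neh  ->  olleh
printf '%s\n' 'hello' 'a' | reverse_chars
```

Single-character lines take the early ‘b’ exit and are printed as they
are.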
3327 | ---------- Footnotes ----------
|
---|
3328 |
|
---|
3329 | (1) This requires another script to pad the output of banner; for
|
---|
3330 | example
|
---|
3331 |
|
---|
3332 | #! /bin/sh
|
---|
3333 |
|
---|
3334 | banner -w $1 $2 $3 $4 |
|
---|
3335 | sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' |
|
---|
3336 | ~/sedscripts/reverseline.sed
|
---|
3337 |
|
---|
3338 |
|
---|
3339 | File: sed.info, Node: Text search across multiple lines, Next: Line length adjustment, Prev: Reverse chars of lines, Up: Examples
|
---|
3340 |
|
---|
3341 | 7.7 Text search across multiple lines
|
---|
3342 | =====================================
|
---|
3343 |
|
---|
3344 | This section uses ‘N’ and ‘D’ commands to search for consecutive words
|
---|
3345 | spanning multiple lines. *Note Multiline techniques::.
|
---|
3346 |
|
---|
3347 | These examples deal with finding doubled occurrences of words in a
|
---|
3348 | document.
|
---|
3349 |
|
---|
3350 | Finding doubled words in a single line is easy using GNU ‘grep’ and
|
---|
3351 | similarly with GNU ‘sed’:
|
---|
3352 |
|
---|
3353 | $ cat two-cities-dup1.txt
|
---|
3354 | It was the best of times,
|
---|
3355 | it was the worst of times,
|
---|
3356 | it was the the age of wisdom,
|
---|
3357 | it was the age of foolishness,
|
---|
3358 |
|
---|
3359 | $ grep -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
|
---|
3360 | it was the the age of wisdom,
|
---|
3361 |
|
---|
3362 | $ grep -n -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
|
---|
3363 | 3:it was the the age of wisdom,
|
---|
3364 |
|
---|
3365 | $ sed -En '/\b(\w+)\s+\1\b/p' two-cities-dup1.txt
|
---|
3366 | it was the the age of wisdom,
|
---|
3367 |
|
---|
3368 | $ sed -En '/\b(\w+)\s+\1\b/{=;p}' two-cities-dup1.txt
|
---|
3369 | 3
|
---|
3370 | it was the the age of wisdom,
|
---|
3371 |
|
---|
3372 | • The regular expression ‘\b\w+\s+’ searches for a word boundary
|
---|
3373 | (‘\b’), followed by one or more word characters (‘\w+’), followed
|
---|
3374 | by whitespace (‘\s+’). *Note regexp extensions::.
|
---|
3375 |
|
---|
3376 | • Adding parentheses around the ‘(\w+)’ expression creates a
|
---|
3377 | subexpression. The regular expression pattern ‘(PATTERN)\s+\1’
|
---|
3378 | defines a subexpression (in the parentheses) followed by a
|
---|
3379 | back-reference, separated by whitespace. A successful match means
|
---|
3380 | the PATTERN was repeated twice in succession. *Note
|
---|
3381 | Back-references and Subexpressions::.
|
---|
3382 |
|
---|
3383 | • The word-boundary expression (‘\b’) at both ends ensures partial
|
---|
3384 | words are not matched (e.g. ‘the then’ is not a desired match).
|
---|
3385 |
|
---|
3386 | • The ‘-E’ option enables extended regular expression syntax,
|
---|
3387 | alleviating the need to add backslashes before the parentheses.
|
---|
3388 | *Note ERE syntax::.
|
---|
3389 |
|
---|
3390 | When the doubled word spans two lines, the above regular expression
|
---|
3391 | will not find it, as ‘grep’ and ‘sed’ operate line by line.
|
---|
3392 |
|
---|
3393 | By using ‘N’ and ‘D’ commands, ‘sed’ can apply regular expressions on
|
---|
3394 | multiple lines (that is, multiple lines are stored in the pattern space,
|
---|
3395 | and the regular expression works on it):
|
---|
3396 |
|
---|
3397 | $ cat two-cities-dup2.txt
|
---|
3398 | It was the best of times, it was the
|
---|
3399 | worst of times, it was the
|
---|
3400 | the age of wisdom,
|
---|
3401 | it was the age of foolishness,
|
---|
3402 |
|
---|
3403 | $ sed -En '{N; /\b(\w+)\s+\1\b/{=;p} ; D}' two-cities-dup2.txt
|
---|
3404 | 3
|
---|
3405 | worst of times, it was the
|
---|
3406 | the age of wisdom,
|
---|
3407 |
|
---|
3408 | • The ‘N’ command appends the next line to the pattern space (thus
|
---|
3409 | ensuring it contains two consecutive lines in every cycle).
|
---|
3410 |
|
---|
3411 | • The regular expression uses ‘\s+’ as the word separator, which matches
|
---|
3412 | both spaces and newlines.
|
---|
3413 |
|
---|
3414 | • If the regular expression matches, the entire pattern space is printed
|
---|
3415 | with ‘p’. No lines are printed by default due to the ‘-n’ option.
|
---|
3416 |
|
---|
3417 | • The ‘D’ command removes the first line from the pattern space (up until the
|
---|
3418 | first newline), readying it for the next cycle.
|
---|
3419 |
|
---|
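The sliding two-line window described in the list above can be tried on a
very small input. This sketch reuses the same program on a hypothetical
three-line sample in which ‘two’ ends the first line and starts the
second. Like the original, it assumes GNU ‘sed’ (for ‘-E’ with ‘\b’,
‘\w’, ‘\s’, and for the behavior of ‘N’ on the last line).

```shell
# The window holds lines 1-2, then 2-3; only the first pair has a
# doubled word straddling the newline.
out=$(printf '%s\n' 'one two' 'two three' 'four' |
  sed -En '{N; /\b(\w+)\s+\1\b/{=;p} ; D}')
printf '%s\n' "$out"
```

The output should be the line number 2 (the current line when the match
is found) followed by the two lines that were in the pattern space.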
3420 | See the GNU ‘coreutils’ manual for an alternative solution using ‘tr
|
---|
3421 | -s’ and ‘uniq’ at
|
---|
3422 | <https://gnu.org/s/coreutils/manual/html_node/Squeezing-and-deleting.html>.
|
---|
3423 |
|
---|
3424 |
|
---|
3425 | File: sed.info, Node: Line length adjustment, Next: Adding a header to multiple files, Prev: Text search across multiple lines, Up: Examples
|
---|
3426 |
|
---|
3427 | 7.8 Line length adjustment
|
---|
3428 | ==========================
|
---|
3429 |
|
---|
3430 | This section uses ‘N’ and ‘P’ commands to read and write lines, and the
|
---|
3431 | ‘b’ command for branching. *Note Multiline techniques:: and *note
|
---|
3432 | Branching and flow control::.
|
---|
3433 |
|
---|
3434 | This (somewhat contrived) example deals with formatting and wrapping
|
---|
3435 | lines of text of the following input file:
|
---|
3436 |
|
---|
3437 | $ cat two-cities-mix.txt
|
---|
3438 | It was the best of times, it was
|
---|
3439 | the worst of times, it
|
---|
3440 | was the age of
|
---|
3441 | wisdom,
|
---|
3442 | it
|
---|
3443 | was
|
---|
3444 | the age
|
---|
3445 | of foolishness,
|
---|
3446 |
|
---|
3447 | The following sed program wraps lines at 40 characters:
|
---|
3448 | $ cat wrap40.sed
|
---|
3449 | # outer loop
|
---|
3450 | :x
|
---|
3451 |
|
---|
3452 | # Append a newline followed by the next input line to the pattern buffer
|
---|
3453 | N
|
---|
3454 |
|
---|
3455 | # Replace all newlines in the pattern buffer with spaces
|
---|
3456 | s/\n/ /g
|
---|
3457 |
|
---|
3458 |
|
---|
3459 | # Inner loop
|
---|
3460 | :y
|
---|
3461 |
|
---|
3462 | # Add a newline after the first 40 characters
|
---|
3463 | s/(.{40,40})/\1\n/
|
---|
3464 |
|
---|
3465 | # If there is a newline in the pattern buffer
|
---|
3466 | # (i.e. the previous substitution added a newline)
|
---|
3467 | /\n/ {
|
---|
3468 | # There are newlines in the pattern buffer -
|
---|
3469 | # print the content until the first newline.
|
---|
3470 | P
|
---|
3471 |
|
---|
3472 | # Remove the printed characters and the first newline
|
---|
3473 | s/.*\n//
|
---|
3474 |
|
---|
3475 | # branch to label 'y' - repeat inner loop
|
---|
3476 | by
|
---|
3477 | }
|
---|
3478 |
|
---|
3479 | # No newlines in the pattern buffer - Branch to label 'x' (outer loop)
|
---|
3480 | # and read the next input line
|
---|
3481 | bx
|
---|
3482 |
|
---|
3483 | The wrapped output:
|
---|
3484 | $ sed -E -f wrap40.sed two-cities-mix.txt
|
---|
3485 | It was the best of times, it was the wor
|
---|
3486 | st of times, it was the age of wisdom, i
|
---|
3487 | t was the age of foolishness,
|
---|
3488 |
|
---|
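The same two-loop structure can be parameterized by width, which makes it
easier to experiment with short inputs. The sketch below is an
illustrative adaptation, not the original ‘wrap40.sed’: the function name
and its argument are hypothetical, and like the original it assumes GNU
‘sed’ (for ‘\n’ in the replacement and for ‘N’ printing the pattern space
at end of input).

```shell
# wrap WIDTH: join all input lines with spaces, then cut every WIDTH
# characters.
wrap() {
  width=$1
  sed -E -e ':x' -e 'N' -e 's/\n/ /g' \
      -e ':y' -e "s/(.{${width}})/\\1\\n/" \
      -e '/\n/{' -e 'P' -e 's/.*\n//' -e 'by' -e '}' \
      -e 'bx'
}

printf '%s\n' 'abcdefgh' 'ijklmnop' 'qrs' | wrap 10
```

With a width of 10 the sample should come out as ‘abcdefgh i’,
‘jklmnop qr’ and ‘s’. As with the 40-column version, the cut is made
purely by character count, so words may be split.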
3489 |
|
---|
3490 | File: sed.info, Node: Adding a header to multiple files, Next: tac, Prev: Line length adjustment, Up: Examples
|
---|
3491 |
|
---|
3492 | 7.9 Adding a header to multiple files
|
---|
3493 | =====================================
|
---|
3494 |
|
---|
3495 | GNU ‘sed’ can be used to safely modify multiple files at once.
|
---|
3496 |
|
---|
3497 | Add a single line to the beginning of source code files:
|
---|
3498 |
|
---|
3499 | sed -i '1i/* Copyright (C) FOO BAR */' *.c
|
---|
3500 |
|
---|
3501 | Adding a few lines is possible using ‘\n’ in the text:
|
---|
3502 |
|
---|
3503 | sed -i '1i/*\n * Copyright (C) FOO BAR\n * Created by Jane Doe\n */' *.c
|
---|
3504 |
|
---|
3505 | To add multiple lines from another file, use ‘0rFILE’. A typical use
|
---|
3506 | case is adding a license notice header to all files:
|
---|
3507 |
|
---|
3508 | ## Create the header file:
|
---|
3509 | $ cat<<'EOF'>LIC.TXT
|
---|
3510 | /*
|
---|
3511 | Copyright (C) 1989-2021 FOO BAR
|
---|
3512 |
|
---|
3513 | This program is free software; you can redistribute it and/or modify
|
---|
3514 | it under the terms of the GNU General Public License as published by
|
---|
3515 | the Free Software Foundation; either version 3, or (at your option)
|
---|
3516 | any later version.
|
---|
3517 |
|
---|
3518 | This program is distributed in the hope that it will be useful,
|
---|
3519 | but WITHOUT ANY WARRANTY; without even the implied warranty of
|
---|
3520 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
---|
3521 | GNU General Public License for more details.
|
---|
3522 |
|
---|
3523 | You should have received a copy of the GNU General Public License
|
---|
3524 | along with this program; If not, see <https://www.gnu.org/licenses/>.
|
---|
3525 | */
|
---|
3526 | EOF
|
---|
3527 |
|
---|
3528 | ## Add the file at the beginning of all source code files:
|
---|
3529 | $ sed -i '0rLIC.TXT' *.cpp *.h
|
---|
3530 |
|
---|
3531 | With script files (e.g. ‘.sh’, ‘.py’, ‘.pl’ files) the license notice
|
---|
3532 | typically appears _after_ the first line (the “shebang” ‘#!’ line). The
|
---|
3533 | ‘1rFILE’ command will add ‘FILE’ _after_ the first line:
|
---|
3534 |
|
---|
3535 | ## Create the header file:
|
---|
3536 | $ cat<<'EOF'>LIC.TXT
|
---|
3537 | ##
|
---|
3538 | ## Copyright (C) 1989-2021 FOO BAR
|
---|
3539 | ##
|
---|
3540 | ## This program is free software; you can redistribute it and/or modify
|
---|
3541 | ## it under the terms of the GNU General Public License as published by
|
---|
3542 | ## the Free Software Foundation; either version 3, or (at your option)
|
---|
3543 | ## any later version.
|
---|
3544 | ##
|
---|
3545 | ## This program is distributed in the hope that it will be useful,
|
---|
3546 | ## but WITHOUT ANY WARRANTY; without even the implied warranty of
|
---|
3547 | ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
|
---|
3548 | ## GNU General Public License for more details.
|
---|
3549 | ##
|
---|
3550 | ## You should have received a copy of the GNU General Public License
|
---|
3551 | ## along with this program; If not, see <https://www.gnu.org/licenses/>.
|
---|
3552 | ##
|
---|
3553 | ##
|
---|
3554 | EOF
|
---|
3555 |
|
---|
3556 | ## Add the file after the first line of all script files:
|
---|
3557 | $ sed -i '1rLIC.TXT' *.py *.sh
|
---|
3558 |
|
---|
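The effect of ‘1rFILE’ is easy to check on a throwaway file before
running it over a source tree. The sketch below works in a temporary
directory; the file names and contents are hypothetical, and ‘sed -i’
without a suffix argument assumes GNU ‘sed’.

```shell
# Work in a scratch directory so no real files are touched.
tmp=$(mktemp -d)
printf '%s\n' '#!/bin/sh' 'echo hello' > "$tmp/demo.sh"
printf '%s\n' '## Copyright (C) FOO BAR' > "$tmp/LIC.TXT"

# Insert the notice after line 1, editing the file in place.
sed -i "1r $tmp/LIC.TXT" "$tmp/demo.sh"

result=$(cat "$tmp/demo.sh")
rm -rf "$tmp"
printf '%s\n' "$result"
```

The notice should land between the ‘#!’ line and the rest of the script.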
3559 | The above ‘sed’ commands can be combined with ‘find’ to locate files
|
---|
3560 | in all subdirectories, ‘xargs’ to run additional commands on selected
|
---|
3561 | files and ‘grep’ to filter out files that already contain a copyright
|
---|
3562 | notice:
|
---|
3563 |
|
---|
3564 | find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) \
|
---|
3565 | | xargs grep -Li copyright \
|
---|
3566 | | xargs -r sed -i '0rLIC.TXT'
|
---|
3567 |
|
---|
3568 | Or a slightly safer version (handling file names with spaces and newlines):
|
---|
3569 |
|
---|
3570 | find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) -print0 \
|
---|
3571 | | xargs -0 grep -Z -Li copyright \
|
---|
3572 | | xargs -0 -r sed -i '0rLIC.TXT'
|
---|
3573 |
|
---|
3574 | Note: using the ‘0’ address with the ‘r’ command requires GNU ‘sed’
|
---|
3575 | version 4.9 or later. *Note Zero Address::.
|
---|
3576 |
|
---|
3577 |
|
---|
3578 | File: sed.info, Node: tac, Next: cat -n, Prev: Adding a header to multiple files, Up: Examples
|
---|
3579 |
|
---|
3580 | 7.10 Reverse Lines of Files
|
---|
3581 | ===========================
|
---|
3582 |
|
---|
3583 | This one begins a series of totally useless (yet interesting) scripts
|
---|
3584 | emulating various Unix commands. This, in particular, is a ‘tac’
|
---|
3585 | workalike.
|
---|
3586 |
|
---|
3587 | Note that on implementations other than GNU ‘sed’ this script might
|
---|
3588 | easily overflow internal buffers.
|
---|
3589 |
|
---|
3590 | #!/usr/bin/sed -nf
|
---|
3591 |
|
---|
3592 | # reverse all lines of input, i.e. first line becomes last, ...
|
---|
3593 |
|
---|
3594 | # from the second line, the buffer (which contains all previous lines)
|
---|
3595 | # is *appended* to current line, so, the order will be reversed
|
---|
3596 | 1! G
|
---|
3597 |
|
---|
3598 | # on the last line we're done -- print everything
|
---|
3599 | $ p
|
---|
3600 |
|
---|
3601 | # store everything on the buffer again
|
---|
3602 | h
|
---|
3603 |
|
---|
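The script is short enough to run directly from the command line. In this
sketch the three commands are passed as ‘-e’ fragments inside a
hypothetical shell function:

```shell
reverse_lines() {
  sed -n -e '1!G' -e '$p' -e 'h'
}

# The hold space grows as: a  ->  b\na  ->  c\nb\na
printf '%s\n' a b c | reverse_lines
```

Since the whole input accumulates in the hold space, memory use grows
with the size of the input.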
3604 |
|
---|
3605 | File: sed.info, Node: cat -n, Next: cat -b, Prev: tac, Up: Examples
|
---|
3606 |
|
---|
3607 | 7.11 Numbering Lines
|
---|
3608 | ====================
|
---|
3609 |
|
---|
3610 | This script replaces ‘cat -n’; in fact it formats its output exactly
|
---|
3611 | like GNU ‘cat’ does.
|
---|
3612 |
|
---|
3613 | Of course this is completely useless, for two reasons: first,
|
---|
3614 | because somebody else did it in C, second, because the following
|
---|
3615 | Bourne-shell script could be used for the same purpose and would be much
|
---|
3616 | faster:
|
---|
3617 |
|
---|
3618 | #! /bin/sh
|
---|
3619 | sed -e "=" "$@" | sed -e '
|
---|
3620 | s/^/      /
|
---|
3621 | N
|
---|
3622 | s/^ *\(......\)\n/\1  /
|
---|
3623 | '
|
---|
3624 |
|
---|
3625 | It uses ‘sed’ to print the line number, then groups lines two by two
|
---|
3626 | using ‘N’. Of course, this script does not teach as much as the one
|
---|
3627 | presented below.
|
---|
3628 |
|
---|
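The two-stage pipeline can be tried on standard input as sketched below.
The padding widths are assumptions (six spaces prepended, two spaces
after the six-character number field), and the function name is
hypothetical.

```shell
# Stage 1 ('sed =') emits each line number on its own line; stage 2
# right-aligns it in six columns and joins it to the following line.
number_lines() {
  sed = | sed -e 's/^/      /' -e 'N' -e 's/^ *\(......\)\n/\1  /'
}

printf '%s\n' alpha beta | number_lines
```

Each output line should start with the number right-aligned in a
six-character field.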
3629 | The algorithm used for incrementing uses both buffers, so the line is
|
---|
3630 | printed as soon as possible and then discarded. The number is split so
|
---|
3631 | that changing digits go in a buffer and unchanged ones go in the other;
|
---|
3632 | the changed digits are modified in a single step (using a ‘y’ command).
|
---|
3633 | The line number for the next line is then composed and stored in the
|
---|
3634 | hold space, to be used in the next iteration.
|
---|
3635 |
|
---|
3636 | #!/usr/bin/sed -nf
|
---|
3637 |
|
---|
3638 | # Prime the pump on the first line
|
---|
3639 | x
|
---|
3640 | /^$/ s/^.*$/1/
|
---|
3641 |
|
---|
3642 | # Add the correct line number before the pattern
|
---|
3643 | G
|
---|
3644 | h
|
---|
3645 |
|
---|
3646 | # Format it and print it
|
---|
3647 | s/^/      /
|
---|
3648 | s/^ *\(......\)\n/\1  /p
|
---|
3649 |
|
---|
3650 | # Get the line number from hold space; add a zero
|
---|
3651 | # if we're going to add a digit on the next line
|
---|
3652 | g
|
---|
3653 | s/\n.*$//
|
---|
3654 | /^9*$/ s/^/0/
|
---|
3655 |
|
---|
3656 | # separate changing/unchanged digits with an x
|
---|
3657 | s/.9*$/x&/
|
---|
3658 |
|
---|
3659 | # keep changing digits in hold space
|
---|
3660 | h
|
---|
3661 | s/^.*x//
|
---|
3662 | y/0123456789/1234567890/
|
---|
3663 | x
|
---|
3664 |
|
---|
3665 | # keep unchanged digits in pattern space
|
---|
3666 | s/x.*$//
|
---|
3667 |
|
---|
3668 | # compose the new number, remove the newline implicitly added by G
|
---|
3669 | G
|
---|
3670 | s/\n//
|
---|
3671 | h
|
---|
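The increment step can be tried in isolation. The sketch below reuses only the last part of the script (from the "add a zero" test onward) and applies it to a number given on the command line; the function name `increment` is an assumption made for this illustration and is not part of the original script.

```shell
# increment: hypothetical helper that adds 1 to a decimal number using
# the digit-splitting commands from the numbering script.
increment() {
  printf '%s\n' "$1" | sed '
    # all nines: prepend a zero so that a new leading digit can appear
    /^9*$/ s/^/0/
    # mark the changing suffix: one digit followed by trailing nines
    s/.9*$/x&/
    # isolate the changing digits, rotate them, and park them in hold space
    h
    s/^.*x//
    y/0123456789/1234567890/
    x
    # pattern space keeps only the unchanged prefix
    s/x.*$//
    # join prefix and rotated suffix, dropping the newline added by G
    G
    s/\n//
  '
}

increment 7     # prints 8
increment 129   # prints 130
increment 999   # prints 1000
```

For 129 the marker lands before `29`, so `29` is rotated to `30` while the prefix `1` is left untouched.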
3672 |
|
---|
3673 |
|
---|
3674 | File: sed.info, Node: cat -b, Next: wc -c, Prev: cat -n, Up: Examples
|
---|
3675 |
|
---|
3676 | 7.12 Numbering Non-blank Lines
|
---|
3677 | ==============================
|
---|
3678 |
|
---|
3679 | Emulating ‘cat -b’ is almost the same as ‘cat -n’: we only have to select
|
---|
3680 | which lines are to be numbered and which are not.
|
---|
3681 |
|
---|
3682 | The part that is common to this script and the previous one is not
|
---|
3683 | commented to show how important it is to comment ‘sed’ scripts
|
---|
3684 | properly...
|
---|
3685 |
|
---|
3686 | #!/usr/bin/sed -nf
|
---|
3687 |
|
---|
3688 | /^$/ {
|
---|
3689 | p
|
---|
3690 | b
|
---|
3691 | }
|
---|
3692 |
|
---|
3693 | # Same as cat -n from now
|
---|
3694 | x
|
---|
3695 | /^$/ s/^.*$/1/
|
---|
3696 | G
|
---|
3697 | h
|
---|
3698 | s/^/      /
|
---|
3699 | s/^ *\(......\)\n/\1  /p
|
---|
3700 | x
|
---|
3701 | s/\n.*$//
|
---|
3702 | /^9*$/ s/^/0/
|
---|
3703 | s/.9*$/x&/
|
---|
3704 | h
|
---|
3705 | s/^.*x//
|
---|
3706 | y/0123456789/1234567890/
|
---|
3707 | x
|
---|
3708 | s/x.*$//
|
---|
3709 | G
|
---|
3710 | s/\n//
|
---|
3711 | h
|
---|
3712 |
|
---|
3713 |
|
---|
3714 | File: sed.info, Node: wc -c, Next: wc -w, Prev: cat -b, Up: Examples
|
---|
3715 |
|
---|
3716 | 7.13 Counting Characters
|
---|
3717 | ========================
|
---|
3718 |
|
---|
3719 | This script shows another way to do arithmetic with ‘sed’. In this case
|
---|
3720 | we have to add possibly large numbers, so implementing this by
|
---|
3721 | successive increments would not be feasible (and possibly even more
|
---|
3722 | complicated to contrive than this script).
|
---|
3723 |
|
---|
3724 | The approach is to map numbers to letters, kind of an abacus
|
---|
3725 | implemented with ‘sed’. ‘a’s are units, ‘b’s are tens and so on: we
|
---|
3726 | simply add the number of characters on the current line as units, and
|
---|
3727 | then propagate the carry to tens, hundreds, and so on.
|
---|
3728 |
|
---|
3729 | As usual, running totals are kept in hold space.
|
---|
3730 |
|
---|
3731 | On the last line, we convert the abacus form back to decimal. For
|
---|
3732 | the sake of variety, this is done with a loop rather than with some 80
|
---|
3733 | ‘s’ commands(1): first we convert units, removing ‘a’s from the number;
|
---|
3734 | then we rotate letters so that tens become ‘a’s, and so on until no more
|
---|
3735 | letters remain.
|
---|
3736 |
|
---|
3737 | #!/usr/bin/sed -nf
|
---|
3738 |
|
---|
3739 | # Add n+1 a's to hold space (+1 is for the newline)
|
---|
3740 | s/./a/g
|
---|
3741 | H
|
---|
3742 | x
|
---|
3743 | s/\n/a/
|
---|
3744 |
|
---|
3745 | # Do the carry. The t's and b's are not necessary,
|
---|
3746 | # but they do speed up the thing
|
---|
3747 | t a
|
---|
3748 | : a; s/aaaaaaaaaa/b/g; t b; b done
|
---|
3749 | : b; s/bbbbbbbbbb/c/g; t c; b done
|
---|
3750 | : c; s/cccccccccc/d/g; t d; b done
|
---|
3751 | : d; s/dddddddddd/e/g; t e; b done
|
---|
3752 | : e; s/eeeeeeeeee/f/g; t f; b done
|
---|
3753 | : f; s/ffffffffff/g/g; t g; b done
|
---|
3754 | : g; s/gggggggggg/h/g; t h; b done
|
---|
3755 | : h; s/hhhhhhhhhh//g
|
---|
3756 |
|
---|
3757 | : done
|
---|
3758 | $! {
|
---|
3759 | h
|
---|
3760 | b
|
---|
3761 | }
|
---|
3762 |
|
---|
3763 | # On the last line, convert back to decimal
|
---|
3764 |
|
---|
3765 | : loop
|
---|
3766 | /a/! s/[b-h]*/&0/
|
---|
3767 | s/aaaaaaaaa/9/
|
---|
3768 | s/aaaaaaaa/8/
|
---|
3769 | s/aaaaaaa/7/
|
---|
3770 | s/aaaaaa/6/
|
---|
3771 | s/aaaaa/5/
|
---|
3772 | s/aaaa/4/
|
---|
3773 | s/aaa/3/
|
---|
3774 | s/aa/2/
|
---|
3775 | s/a/1/
|
---|
3776 |
|
---|
3777 | : next
|
---|
3778 | y/bcdefgh/abcdefg/
|
---|
3779 | /[a-h]/ b loop
|
---|
3780 | p
|
---|
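To see the abacus notation at work, the conversion loop can be run on its own. In this sketch, `abacus_to_decimal` is a hypothetical name chosen for illustration; its body is the last part of the script above, fed with an abacus string instead of a running total.

```shell
# abacus_to_decimal: hypothetical helper; a = units, b = tens, c = hundreds...
abacus_to_decimal() {
  printf '%s\n' "$1" | sed '
    : loop
    # no units present: emit an explicit 0 for this position
    /a/! s/[b-h]*/&0/
    s/aaaaaaaaa/9/
    s/aaaaaaaa/8/
    s/aaaaaaa/7/
    s/aaaaaa/6/
    s/aaaaa/5/
    s/aaaa/4/
    s/aaa/3/
    s/aa/2/
    s/a/1/
    # shift every remaining letter down one position and repeat
    y/bcdefgh/abcdefg/
    /[a-h]/ b loop
  '
}

# Carry step: 23 units collapse into two tens and three units.
printf 'aaaaaaaaaaaaaaaaaaaaaaa\n' | sed 's/aaaaaaaaaa/b/g'   # prints bbaaa

abacus_to_decimal bbaaa   # prints 23
abacus_to_decimal cb      # prints 110
```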
3781 |
|
---|
3782 | ---------- Footnotes ----------
|
---|
3783 |
|
---|
3784 | (1) Some implementations have a limit of 199 commands per script
|
---|
3785 |
|
---|
3786 |
|
---|
3787 | File: sed.info, Node: wc -w, Next: wc -l, Prev: wc -c, Up: Examples
|
---|
3788 |
|
---|
3789 | 7.14 Counting Words
|
---|
3790 | ===================
|
---|
3791 |
|
---|
3792 | This script is almost the same as the previous one, once each of the
|
---|
3793 | words on the line is converted to a single ‘a’ (in the previous script
|
---|
3794 | each character was changed to an ‘a’).
|
---|
3795 |
|
---|
3796 | It is interesting that real ‘wc’ programs have optimized loops for
|
---|
3797 | ‘wc -c’, so they are much slower at counting words than
|
---|
3798 | characters. This script’s bottleneck, instead, is arithmetic, and hence
|
---|
3799 | the word-counting one is faster (it has to manage smaller numbers).
|
---|
3800 |
|
---|
3801 | Again, the common parts are not commented to show the importance of
|
---|
3802 | commenting ‘sed’ scripts.
|
---|
3803 |
|
---|
3804 | #!/usr/bin/sed -nf
|
---|
3805 |
|
---|
3806 | # Convert words to a's
|
---|
3807 | s/[ <TAB>][ <TAB>]*/ /g
|
---|
3808 | s/^/ /
|
---|
3809 | s/ [^ ][^ ]*/a /g
|
---|
3810 | s/ //g
|
---|
3811 |
|
---|
3812 | # Append them to hold space
|
---|
3813 | H
|
---|
3814 | x
|
---|
3815 | s/\n//
|
---|
3816 |
|
---|
3817 | # From here on it is the same as in wc -c.
|
---|
3818 | /aaaaaaaaaa/! bx; s/aaaaaaaaaa/b/g
|
---|
3819 | /bbbbbbbbbb/! bx; s/bbbbbbbbbb/c/g
|
---|
3820 | /cccccccccc/! bx; s/cccccccccc/d/g
|
---|
3821 | /dddddddddd/! bx; s/dddddddddd/e/g
|
---|
3822 | /eeeeeeeeee/! bx; s/eeeeeeeeee/f/g
|
---|
3823 | /ffffffffff/! bx; s/ffffffffff/g/g
|
---|
3824 | /gggggggggg/! bx; s/gggggggggg/h/g
|
---|
3825 | s/hhhhhhhhhh//g
|
---|
3826 | :x
|
---|
3827 | $! { h; b; }
|
---|
3828 | :y
|
---|
3829 | /a/! s/[b-h]*/&0/
|
---|
3830 | s/aaaaaaaaa/9/
|
---|
3831 | s/aaaaaaaa/8/
|
---|
3832 | s/aaaaaaa/7/
|
---|
3833 | s/aaaaaa/6/
|
---|
3834 | s/aaaaa/5/
|
---|
3835 | s/aaaa/4/
|
---|
3836 | s/aaa/3/
|
---|
3837 | s/aa/2/
|
---|
3838 | s/a/1/
|
---|
3839 | y/bcdefgh/abcdefg/
|
---|
3840 | /[a-h]/ by
|
---|
3841 | p
|
---|
3842 |
|
---|
3843 |
|
---|
3844 | File: sed.info, Node: wc -l, Next: head, Prev: wc -w, Up: Examples
|
---|
3845 |
|
---|
3846 | 7.15 Counting Lines
|
---|
3847 | ===================
|
---|
3848 |
|
---|
3849 | No strange things are done now, because ‘sed’ gives us ‘wc -l’
|
---|
3850 | functionality for free!!! Look:
|
---|
3851 |
|
---|
3852 | #!/usr/bin/sed -nf
|
---|
3853 | $=
|
---|
3854 |
|
---|
3855 |
|
---|
3856 | File: sed.info, Node: head, Next: tail, Prev: wc -l, Up: Examples
|
---|
3857 |
|
---|
3858 | 7.16 Printing the First Lines
|
---|
3859 | =============================
|
---|
3860 |
|
---|
3861 | This script is probably the simplest useful ‘sed’ script. It displays
|
---|
3862 | the first 10 lines of input; the number of displayed lines is right
|
---|
3863 | before the ‘q’ command.
|
---|
3864 |
|
---|
3865 | #!/usr/bin/sed -f
|
---|
3866 | 10q
|
---|
3867 |
|
---|
3868 |
|
---|
3869 | File: sed.info, Node: tail, Next: uniq, Prev: head, Up: Examples
|
---|
3870 |
|
---|
3871 | 7.17 Printing the Last Lines
|
---|
3872 | ============================
|
---|
3873 |
|
---|
3874 | Printing the last N lines rather than the first is more complex but
|
---|
3875 | indeed possible. N is encoded in the second line, before the bang
|
---|
3876 | character.
|
---|
3877 |
|
---|
3878 | This script is similar to the ‘tac’ script in that it keeps the final
|
---|
3879 | output in the hold space and prints it at the end:
|
---|
3880 |
|
---|
3881 | #!/usr/bin/sed -nf
|
---|
3882 |
|
---|
3883 | 1! {; H; g; }
|
---|
3884 | 1,10 !s/[^\n]*\n//
|
---|
3885 | $p
|
---|
3886 | h
|
---|
3887 |
|
---|
3888 | Mainly, the script keeps a window of 10 lines and slides it by
|
---|
3889 | adding a line and deleting the oldest (the substitution command on the
|
---|
3890 | second line works like a ‘D’ command but does not restart the loop).
|
---|
3891 |
|
---|
3892 | The “sliding window” technique is a very powerful way to write
|
---|
3893 | efficient and complex ‘sed’ scripts, because commands like ‘P’ would
|
---|
3894 | require a lot of work if implemented manually.
|
---|
3895 |
|
---|
3896 | To introduce the technique, which is fully demonstrated in the rest
|
---|
3897 | of this chapter and is based on the ‘N’, ‘P’ and ‘D’ commands, here is
|
---|
3898 | an implementation of ‘tail’ using a simple “sliding window.”
|
---|
3899 |
|
---|
3900 | This looks complicated but in fact the working is the same as the
|
---|
3901 | last script: after we have kicked in the appropriate number of lines,
|
---|
3902 | however, we stop using the hold space to keep inter-line state, and
|
---|
3903 | instead use ‘N’ and ‘D’ to slide pattern space by one line:
|
---|
3904 |
|
---|
3905 | #!/usr/bin/sed -f
|
---|
3906 |
|
---|
3907 | 1h
|
---|
3908 | 2,10 {; H; g; }
|
---|
3909 | $q
|
---|
3910 | 1,9d
|
---|
3911 | N
|
---|
3912 | D
|
---|
3913 |
|
---|
3914 | Note how the first, second and fourth lines are inactive after the
|
---|
3915 | first ten lines of input. After that, all the script does is exit
|
---|
3916 | on the last line of input, append the next input line to pattern
|
---|
3917 | space, and remove the first line.
|
---|
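The same sliding window is easier to follow with a smaller window. The sketch below is the script above with 10 replaced by 3 (and 9 by 2); the function name `last3` is an assumption made for this illustration.

```shell
# last3: hypothetical 3-line variant of the sliding-window tail script.
last3() {
  sed -e '1h' \
      -e '2,3{H;g;}' \
      -e '$q' \
      -e '1,2d' \
      -e 'N' \
      -e 'D'
}

printf '%s\n' 1 2 3 4 5 6 7 | last3   # prints 5, 6, 7 on separate lines
printf '%s\n' 1 2 | last3             # prints 1, 2 (input shorter than window)
```

Lines 1 to 3 fill the window through the hold space; from then on each cycle appends one line with `N` and drops the oldest with `D` until `$q` prints the window.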
3918 |
|
---|
3919 |
|
---|
3920 | File: sed.info, Node: uniq, Next: uniq -d, Prev: tail, Up: Examples
|
---|
3921 |
|
---|
3922 | 7.18 Make Duplicate Lines Unique
|
---|
3923 | ================================
|
---|
3924 |
|
---|
3925 | This is an example of the art of using the ‘N’, ‘P’ and ‘D’ commands,
|
---|
3926 | probably the most difficult to master.
|
---|
3927 |
|
---|
3928 | #!/usr/bin/sed -f
|
---|
3929 | h
|
---|
3930 |
|
---|
3931 | :b
|
---|
3932 | # On the last line, print and exit
|
---|
3933 | $b
|
---|
3934 | N
|
---|
3935 | /^\(.*\)\n\1$/ {
|
---|
3936 | # The two lines are identical. Undo the effect of
|
---|
3937 | # the N command.
|
---|
3938 | g
|
---|
3939 | bb
|
---|
3940 | }
|
---|
3941 |
|
---|
3942 | # If the N command had added the last line, print and exit
|
---|
3943 | $b
|
---|
3944 |
|
---|
3945 | # The lines are different; print the first and go
|
---|
3946 | # back working on the second.
|
---|
3947 | P
|
---|
3948 | D
|
---|
3949 |
|
---|
3950 | As you can see, we maintain a 2-line window using ‘P’ and ‘D’. This
|
---|
3951 | technique is often used in advanced ‘sed’ scripts.
|
---|
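The 2-line window can also be written far more compactly. The one-liner below is a commonly seen variant, not the script above: it skips the hold space and simply refrains from printing the first line of the window when both lines are equal. The function name `squeeze_dups` is made up for illustration.

```shell
# squeeze_dups: hypothetical compact variant of the 2-line-window technique.
#   $!N              append the next line unless we are on the last one
#   /^\(.*\)\n\1$/!P print the first line only if the two lines differ
#   D                drop the first line and restart with the second
squeeze_dups() {
  sed '$!N; /^\(.*\)\n\1$/!P; D'
}

printf 'a\na\nb\nc\nc\n' | squeeze_dups   # prints a, b, c on separate lines
```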
3952 |
|
---|
3953 |
|
---|
3954 | File: sed.info, Node: uniq -d, Next: uniq -u, Prev: uniq, Up: Examples
|
---|
3955 |
|
---|
3956 | 7.19 Print Duplicated Lines of Input
|
---|
3957 | ====================================
|
---|
3958 |
|
---|
3959 | This script prints only duplicated lines, like ‘uniq -d’.
|
---|
3960 |
|
---|
3961 | #!/usr/bin/sed -nf
|
---|
3962 |
|
---|
3963 | $b
|
---|
3964 | N
|
---|
3965 | /^\(.*\)\n\1$/ {
|
---|
3966 | # Print the first of the duplicated lines
|
---|
3967 | s/.*\n//
|
---|
3968 | p
|
---|
3969 |
|
---|
3970 | # Loop until we get a different line
|
---|
3971 | :b
|
---|
3972 | $b
|
---|
3973 | N
|
---|
3974 | /^\(.*\)\n\1$/ {
|
---|
3975 | s/.*\n//
|
---|
3976 | bb
|
---|
3977 | }
|
---|
3978 | }
|
---|
3979 |
|
---|
3980 | # The last line cannot be followed by duplicates
|
---|
3981 | $b
|
---|
3982 |
|
---|
3983 | # Found a different one. Leave it alone in the pattern space
|
---|
3984 | # and go back to the top, hunting its duplicates
|
---|
3985 | D
|
---|
3986 |
|
---|
3987 |
|
---|
3988 | File: sed.info, Node: uniq -u, Next: cat -s, Prev: uniq -d, Up: Examples
|
---|
3989 |
|
---|
3990 | 7.20 Remove All Duplicated Lines
|
---|
3991 | ================================
|
---|
3992 |
|
---|
3993 | This script prints only unique lines, like ‘uniq -u’.
|
---|
3994 |
|
---|
3995 | #!/usr/bin/sed -f
|
---|
3996 |
|
---|
3997 | # Search for a duplicate line --- until that, print what you find.
|
---|
3998 | $b
|
---|
3999 | N
|
---|
4000 | /^\(.*\)\n\1$/ ! {
|
---|
4001 | P
|
---|
4002 | D
|
---|
4003 | }
|
---|
4004 |
|
---|
4005 | :c
|
---|
4006 | # Got two equal lines in pattern space. At the
|
---|
4007 | # end of the file we simply exit
|
---|
4008 | $d
|
---|
4009 |
|
---|
4010 | # Else, we keep reading lines with N until we
|
---|
4011 | # find a different one
|
---|
4012 | s/.*\n//
|
---|
4013 | N
|
---|
4014 | /^\(.*\)\n\1$/ {
|
---|
4015 | bc
|
---|
4016 | }
|
---|
4017 |
|
---|
4018 | # Remove the last instance of the duplicate line
|
---|
4019 | # and go back to the top
|
---|
4020 | D
|
---|
4021 |
|
---|
4022 |
|
---|
4023 | File: sed.info, Node: cat -s, Prev: uniq -u, Up: Examples
|
---|
4024 |
|
---|
4025 | 7.21 Squeezing Blank Lines
|
---|
4026 | ==========================
|
---|
4027 |
|
---|
4028 | As a final example, here are three scripts, of increasing complexity and
|
---|
4029 | speed, that implement the same function as ‘cat -s’, that is squeezing
|
---|
4030 | blank lines.
|
---|
4031 |
|
---|
4032 | The first leaves a blank line at the beginning and end if there are
|
---|
4033 | some already.
|
---|
4034 |
|
---|
4035 | #!/usr/bin/sed -f
|
---|
4036 |
|
---|
4037 | # on empty lines, join with next
|
---|
4038 | # Note there is a star in the regexp
|
---|
4039 | :x
|
---|
4040 | /^\n*$/ {
|
---|
4041 | N
|
---|
4042 | bx
|
---|
4043 | }
|
---|
4044 |
|
---|
4045 | # now, squeeze all '\n', this can be also done by:
|
---|
4046 | # s/^\(\n\)*/\1/
|
---|
4047 | s/\n*/\
|
---|
4048 | /
|
---|
4049 |
|
---|
4050 | This one is a bit more complex and removes all empty lines at the
|
---|
4051 | beginning. It does leave a single blank line at end if one was there.
|
---|
4052 |
|
---|
4053 | #!/usr/bin/sed -f
|
---|
4054 |
|
---|
4055 | # delete all leading empty lines
|
---|
4056 | 1,/^./{
|
---|
4057 | /./!d
|
---|
4058 | }
|
---|
4059 |
|
---|
4060 | # on an empty line we remove it and all the following
|
---|
4061 | # empty lines, but one
|
---|
4062 | :x
|
---|
4063 | /./!{
|
---|
4064 | N
|
---|
4065 | s/^\n$//
|
---|
4066 | tx
|
---|
4067 | }
|
---|
4068 |
|
---|
4069 | This removes leading and trailing blank lines. It is also the
|
---|
4070 | fastest. Note that loops are completely done with ‘n’ and ‘b’, without
|
---|
4071 | relying on ‘sed’ to restart the script automatically at the end of a
|
---|
4072 | line.
|
---|
4073 |
|
---|
4074 | #!/usr/bin/sed -nf
|
---|
4075 |
|
---|
4076 | # delete all (leading) blanks
|
---|
4077 | /./!d
|
---|
4078 |
|
---|
4079 | # get here: so there is a non empty
|
---|
4080 | :x
|
---|
4081 | # print it
|
---|
4082 | p
|
---|
4083 | # get next
|
---|
4084 | n
|
---|
4085 | # got chars? print it again, etc...
|
---|
4086 | /./bx
|
---|
4087 |
|
---|
4088 | # no, don't have chars: got an empty line
|
---|
4089 | :z
|
---|
4090 | # get next, if last line we finish here so no trailing
|
---|
4091 | # empty lines are written
|
---|
4092 | n
|
---|
4093 | # also empty? then ignore it, and get next... this will
|
---|
4094 | # remove ALL empty lines
|
---|
4095 | /./!bz
|
---|
4096 |
|
---|
4097 | # all empty lines were deleted/ignored, but we have a non empty. As
|
---|
4098 | # what we want to do is to squeeze, insert a blank line artificially
|
---|
4099 | i\
|
---|
4100 |
|
---|
4101 | bx
|
---|
4102 |
|
---|
4103 |
|
---|
4104 | File: sed.info, Node: Limitations, Next: Other Resources, Prev: Examples, Up: Top
|
---|
4105 |
|
---|
4106 | 8 GNU ‘sed’’s Limitations and Non-limitations
|
---|
4107 | *********************************************
|
---|
4108 |
|
---|
4109 | For those who want to write portable ‘sed’ scripts, be aware that some
|
---|
4110 | implementations have been known to limit line lengths (for the pattern
|
---|
4111 | and hold spaces) to be no more than 4000 bytes. The POSIX standard
|
---|
4112 | specifies that conforming ‘sed’ implementations shall support at least
|
---|
4113 | 8192 byte line lengths. GNU ‘sed’ has no built-in limit on line length;
|
---|
4114 | as long as it can ‘malloc()’ more (virtual) memory, you can feed or
|
---|
4115 | construct lines as long as you like.
|
---|
4116 |
|
---|
4117 | However, recursion is used to handle subpatterns and indefinite
|
---|
4118 | repetition. This means that the available stack space may limit the
|
---|
4119 | size of the buffer that can be processed by certain patterns.
|
---|
4120 |
|
---|
4121 |
|
---|
4122 | File: sed.info, Node: Other Resources, Next: Reporting Bugs, Prev: Limitations, Up: Top
|
---|
4123 |
|
---|
4124 | 9 Other Resources for Learning About ‘sed’
|
---|
4125 | ******************************************
|
---|
4126 |
|
---|
4127 | For up-to-date information about GNU ‘sed’, please visit
|
---|
4128 | <https://www.gnu.org/software/sed/>.
|
---|
4129 |
|
---|
4130 | Send general questions and suggestions to <sed-devel@gnu.org>. Visit
|
---|
4131 | the mailing list archives for past discussions at
|
---|
4132 | <https://lists.gnu.org/archive/html/sed-devel/>.
|
---|
4133 |
|
---|
4134 | The following resources provide information about ‘sed’ (both GNU
|
---|
4135 | ‘sed’ and other variations). Note these are not maintained by GNU ‘sed’
|
---|
4136 | developers.
|
---|
4137 |
|
---|
4138 | • sed ‘$HOME’: <http://sed.sf.net>
|
---|
4139 |
|
---|
4140 | • sed FAQ: <http://sed.sf.net/sedfaq.html>
|
---|
4141 |
|
---|
4142 | • seder’s grabbag: <http://sed.sf.net/grabbag>
|
---|
4143 |
|
---|
4144 | • The ‘sed-users’ mailing list maintained by Sven Guckes:
|
---|
4145 | <http://groups.yahoo.com/group/sed-users/> (note this is _not_ the
|
---|
4146 | GNU ‘sed’ mailing list).
|
---|
4147 |
|
---|
4148 |
|
---|
4149 | File: sed.info, Node: Reporting Bugs, Next: GNU Free Documentation License, Prev: Other Resources, Up: Top
|
---|
4150 |
|
---|
4151 | 10 Reporting Bugs
|
---|
4152 | *****************
|
---|
4153 |
|
---|
4154 | Email bug reports to <bug-sed@gnu.org>. Also, please include the output
|
---|
4155 | of ‘sed --version’ in the body of your report if at all possible.
|
---|
4156 |
|
---|
4157 | Please do not send a bug report like this:
|
---|
4158 |
|
---|
4159 | while building frobme-1.3.4
|
---|
4160 | $ configure
|
---|
4161 | error→ sed: file sedscr line 1: Unknown option to 's'
|
---|
4162 |
|
---|
4163 | If GNU ‘sed’ doesn’t configure your favorite package, take a few
|
---|
4164 | extra minutes to identify the specific problem and make a stand-alone
|
---|
4165 | test case. Unlike other programs such as C compilers, making such test
|
---|
4166 | cases for ‘sed’ is quite simple.
|
---|
4167 |
|
---|
4168 | A stand-alone test case includes all the data necessary to perform
|
---|
4169 | the test, and the specific invocation of ‘sed’ that causes the problem.
|
---|
4170 | The smaller a stand-alone test case is, the better. A test case should
|
---|
4171 | not involve something as far removed from ‘sed’ as “try to configure
|
---|
4172 | frobme-1.3.4”. Yes, that is in principle enough information to look for
|
---|
4173 | the bug, but that is not a very practical prospect.
|
---|
4174 |
|
---|
4175 | Here are a few commonly reported bugs that are not bugs.
|
---|
4176 |
|
---|
4177 | ‘N’ command on the last line
|
---|
4178 |
|
---|
4179 | Most versions of ‘sed’ exit without printing anything when the ‘N’
|
---|
4180 | command is issued on the last line of a file. GNU ‘sed’ prints
|
---|
4181 | pattern space before exiting unless of course the ‘-n’ command
|
---|
4182 | switch has been specified. This choice is by design.
|
---|
4183 |
|
---|
4184 | Default behavior (gnu extension, non-POSIX conforming):
|
---|
4185 | $ seq 3 | sed N
|
---|
4186 | 1
|
---|
4187 | 2
|
---|
4188 | 3
|
---|
4189 | To force POSIX-conforming behavior:
|
---|
4190 | $ seq 3 | sed --posix N
|
---|
4191 | 1
|
---|
4192 | 2
|
---|
4193 |
|
---|
4194 | For example, the behavior of
|
---|
4195 | sed N foo bar
|
---|
4196 | would depend on whether foo has an even or an odd number of
|
---|
4197 | lines(1). Or, when writing a script to read the next few lines
|
---|
4198 | following a pattern match, traditional implementations of ‘sed’
|
---|
4199 | would force you to write something like
|
---|
4200 | /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }
|
---|
4201 | instead of just
|
---|
4202 | /foo/{ N;N;N;N;N;N;N;N;N; }
|
---|
4203 |
|
---|
4204 | In any case, the simplest workaround is to use ‘$d;N’ in scripts
|
---|
4205 | that rely on the traditional behavior, or to set the
|
---|
4206 | ‘POSIXLY_CORRECT’ variable to a non-empty value.
|
---|
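The difference between the spellings discussed in this item can be checked with a three-line input. This is a small sketch; the expected results in the comments assume GNU ‘sed’ for the first command, while the other two are written to behave the same on traditional implementations.

```shell
# GNU sed: N on the last line prints pattern space before exiting.
printf '1\n2\n3\n' | sed N        # GNU sed prints 1, 2 and 3

# Only run N when a next line exists; the last line is kept.
printf '1\n2\n3\n' | sed '$!N'    # prints 1, 2 and 3

# Workaround for scripts that rely on the traditional behavior:
# an unpaired last line is deleted instead of printed.
printf '1\n2\n3\n' | sed '$d;N'   # prints 1 and 2
```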
4207 |
|
---|
4208 | Regex syntax clashes (problems with backslashes)
|
---|
4209 | ‘sed’ uses the POSIX basic regular expression syntax. According to
|
---|
4210 | the standard, the meaning of some escape sequences is undefined in
|
---|
4211 | this syntax; notable in the case of ‘sed’ are ‘\|’, ‘\+’, ‘\?’,
|
---|
4212 | ‘\`’, ‘\'’, ‘\<’, ‘\>’, ‘\b’, ‘\B’, ‘\w’, and ‘\W’.
|
---|
4213 |
|
---|
4214 | As in all GNU programs that use POSIX basic regular expressions,
|
---|
4215 | ‘sed’ interprets these escape sequences as special characters. So,
|
---|
4216 | ‘x\+’ matches one or more occurrences of ‘x’. ‘abc\|def’ matches
|
---|
4217 | either ‘abc’ or ‘def’.
|
---|
4218 |
|
---|
4219 | This syntax may cause problems when running scripts written for
|
---|
4220 | other ‘sed’s. Some ‘sed’ programs have been written with the
|
---|
4221 | assumption that ‘\|’ and ‘\+’ match the literal characters ‘|’ and
|
---|
4222 | ‘+’. Such scripts must be modified by removing the spurious
|
---|
4223 | backslashes if they are to be used with modern implementations of
|
---|
4224 | ‘sed’, like GNU ‘sed’.
|
---|
4225 |
|
---|
4226 | On the other hand, some scripts use s|abc\|def||g to remove
|
---|
4227 | occurrences of _either_ ‘abc’ or ‘def’. While this worked until
|
---|
4228 | ‘sed’ 4.0.x, newer versions interpret this as removing the string
|
---|
4229 | ‘abc|def’. This is again undefined behavior according to POSIX,
|
---|
4230 | and this interpretation is arguably more robust: older ‘sed’s, for
|
---|
4231 | example, required that the regex matcher parsed ‘\/’ as ‘/’ in the
|
---|
4232 | common case of escaping a slash, which is again undefined behavior;
|
---|
4233 | the new behavior avoids this, and this is good because the regex
|
---|
4234 | matcher is only partially under our control.
|
---|
4235 |
|
---|
4236 | In addition, this version of ‘sed’ supports several escape
|
---|
4237 | characters (some of which are multi-character) to insert
|
---|
4238 | non-printable characters in scripts (‘\a’, ‘\c’, ‘\d’, ‘\o’, ‘\r’,
|
---|
4239 | ‘\t’, ‘\v’, ‘\x’). These can cause similar problems with scripts
|
---|
4240 | written for other ‘sed’s.
|
---|
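The GNU interpretation of ‘\+’ and ‘\|’ described in this item can be observed directly. The sketch below assumes GNU ‘sed’ for the first two commands; the last two are spellings that stay within what POSIX basic regular expressions define, for scripts that must run elsewhere.

```shell
# GNU sed treats \+ and \| as operators in basic regular expressions.
printf 'xxx y\n' | sed 's/x\+/X/'              # GNU sed prints: X y
printf 'abc def ghi\n' | sed 's/abc\|def/_/g'   # GNU sed prints: _ _ ghi

# Equivalent spellings that avoid the undefined escapes.
printf 'xxx y\n' | sed 's/xx*/X/'              # prints: X y
printf 'abc def ghi\n' | sed -e 's/abc/_/g' -e 's/def/_/g'   # prints: _ _ ghi
```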
4241 |
|
---|
4242 | ‘-i’ clobbers read-only files
|
---|
4243 |
|
---|
4244 | In short, ‘sed -i’ will let you delete the contents of a read-only
|
---|
4245 | file, and in general the ‘-i’ option (*note Invocation: Invoking
|
---|
4246 | sed.) lets you clobber protected files. This is not a bug, but
|
---|
4247 | rather a consequence of how the Unix file system works.
|
---|
4248 |
|
---|
4249 | The permissions on a file say what can happen to the data in that
|
---|
4250 | file, while the permissions on a directory say what can happen to
|
---|
4251 | the list of files in that directory. ‘sed -i’ will not ever open
|
---|
4252 | for writing a file that is already on disk. Rather, it will work
|
---|
4253 | on a temporary file that is finally renamed to the original name:
|
---|
4254 | if you rename or delete files, you’re actually modifying the
|
---|
4255 | contents of the directory, so the operation depends on the
|
---|
4256 | permissions of the directory, not of the file. For this same
|
---|
4257 | reason, ‘sed’ does not let you use ‘-i’ on a writable file in a
|
---|
4258 | read-only directory, and will break hard or symbolic links when
|
---|
4259 | ‘-i’ is used on such a file.
|
---|
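Both effects described in this item, editing a read-only file and breaking a hard link, can be reproduced in a scratch directory. This is a sketch that assumes GNU ‘sed’ (for ‘-i’ without a suffix argument) and a file system that supports hard links; the file names are arbitrary.

```shell
dir=$(mktemp -d)
printf 'hello\n' > "$dir/file"
ln "$dir/file" "$dir/link"          # second hard link to the same data
chmod a-w "$dir/file"               # the file is now read-only

# Succeeds: sed writes a temporary file and renames it over "file",
# which only requires write permission on the directory.
sed -i 's/hello/bye/' "$dir/file"

after=$(cat "$dir/file")            # new contents
linked=$(cat "$dir/link")           # old contents: the link was broken
printf '%s %s\n' "$after" "$linked" # prints: bye hello
rm -rf "$dir"
```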
4260 |
|
---|
4261 | ‘0a’ does not work (gives an error)
|
---|
4262 |
|
---|
4263 | There is no line 0. 0 is a special address that is only used to
|
---|
4264 | treat addresses like ‘0,/RE/’ as active when the script starts: if
|
---|
4265 | you write ‘1,/abc/d’ and the first line includes the string ‘abc’,
|
---|
4266 | then that match would be ignored because address ranges must span
|
---|
4267 | at least two lines (barring the end of the file); but what you
|
---|
4268 | probably wanted is to delete every line up to the first one
|
---|
4269 | including ‘abc’, and this is obtained with ‘0,/abc/d’.
|
---|
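The two address forms can be compared on an input whose first line already matches. A short sketch, assuming GNU ‘sed’ (the ‘0,/RE/’ form is a GNU extension):

```shell
# 1,/abc/: the match on line 1 is ignored, so the range runs to line 3.
printf 'abc\nx\nabc\ny\n' | sed '1,/abc/d'   # prints: y

# 0,/abc/: the range may end on line 1 itself, so only that line is deleted.
printf 'abc\nx\nabc\ny\n' | sed '0,/abc/d'   # prints: x, abc, y
```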
4270 |
|
---|
4271 | ‘[a-z]’ is case insensitive
|
---|
4272 |
|
---|
4273 | You are encountering problems with locales. POSIX mandates that
|
---|
4274 | ‘[a-z]’ uses the current locale’s collation order; in C parlance,
|
---|
4275 | that means using ‘strcoll(3)’ instead of ‘strcmp(3)’. Some locales
|
---|
4276 | have a case-insensitive collation order, others don’t.
|
---|
4277 |
|
---|
4278 | Another problem is that ‘[a-z]’ tries to use collation symbols.
|
---|
4279 | This only happens if you are on the GNU system, using GNU libc’s
|
---|
4280 | regular expression matcher instead of compiling the one supplied
|
---|
4281 | with GNU sed. In a Danish locale, for example, the regular
|
---|
4282 | expression ‘^[a-z]$’ matches the string ‘aa’, because this is a
|
---|
4283 | single collating symbol that comes after ‘a’ and before ‘b’; ‘ll’
|
---|
4284 | behaves similarly in Spanish locales, or ‘ij’ in Dutch locales.
|
---|
4285 |
|
---|
4286 | To work around these problems, which may cause bugs in shell
|
---|
4287 | scripts, set the ‘LC_COLLATE’ and ‘LC_CTYPE’ environment variables
|
---|
4288 | to ‘C’.
|
---|
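In a shell script the locale can be pinned for a single ‘sed’ invocation. The sketch below uses `LC_ALL=C`, which takes precedence over both `LC_COLLATE` and `LC_CTYPE`; the function name `strip_lowercase` is made up for illustration.

```shell
# strip_lowercase: hypothetical helper; in the C locale [a-z] is exactly
# the 26 ASCII lowercase letters, whatever the caller's locale is.
strip_lowercase() {
  LC_ALL=C sed 's/[a-z]//g'
}

printf 'aBcD\n' | strip_lowercase   # prints: BD
```

Setting the variable on the command itself, rather than exporting it, leaves the locale of the rest of the script unchanged.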
4289 |
|
---|
4290 | ‘s/.*//’ does not clear pattern space
|
---|
4291 |
|
---|
4292 | This happens if your input stream includes invalid multibyte
|
---|
4293 | sequences. POSIX mandates that such sequences are _not_ matched by
|
---|
4294 | ‘.’, so that ‘s/.*//’ will not clear pattern space as you would
|
---|
4295 | expect. In fact, there is no way to clear sed’s buffers in the
|
---|
4296 | middle of the script in most multibyte locales (including UTF-8
|
---|
4297 | locales). For this reason, GNU ‘sed’ provides a ‘z’ command (for
|
---|
4298 | “zap”) as an extension.
|
---|
4299 |
|
---|
4300 | To work around these problems, which may cause bugs in shell
|
---|
4301 | scripts, set the ‘LC_COLLATE’ and ‘LC_CTYPE’ environment variables
|
---|
4302 | to ‘C’.
|
---|
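The ‘z’ command can be tried on a line that contains a byte which is not valid UTF-8. A minimal sketch, assuming GNU ‘sed’ 4.2.2 or later (where ‘z’ is available); the byte `\377` is just an arbitrary invalid sequence chosen for the example.

```shell
# z empties pattern space regardless of invalid multibyte sequences;
# the following s command then sees an empty line and fills it in.
printf 'a\377b\n' | sed 'z;s/^$/cleared/'   # prints: cleared
```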
4303 |
|
---|
4304 | ---------- Footnotes ----------
|
---|
4305 |
|
---|
4306 | (1) which is the actual “bug” that prompted the change in behavior
|
---|
4307 |
|
---|
4308 |
|
---|
4309 | File: sed.info, Node: GNU Free Documentation License, Next: Concept Index, Prev: Reporting Bugs, Up: Top
|
---|
4310 |
|
---|
4311 | Appendix A GNU Free Documentation License
|
---|
4312 | *****************************************
|
---|
4313 |
|
---|
4314 | Version 1.3, 3 November 2008
|
---|
4315 |
|
---|
4316 | Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
|
---|
4317 | <https://fsf.org/>
|
---|
4318 |
|
---|
4319 | Everyone is permitted to copy and distribute verbatim copies
|
---|
4320 | of this license document, but changing it is not allowed.
|
---|
4321 |
|
---|
4322 | 0. PREAMBLE
|
---|
4323 |
|
---|
4324 | The purpose of this License is to make a manual, textbook, or other
|
---|
4325 | functional and useful document “free” in the sense of freedom: to
|
---|
4326 | assure everyone the effective freedom to copy and redistribute it,
|
---|
4327 | with or without modifying it, either commercially or
|
---|
4328 | noncommercially. Secondarily, this License preserves for the
|
---|
4329 | author and publisher a way to get credit for their work, while not
|
---|
4330 | being considered responsible for modifications made by others.
|
---|
4331 |
|
---|
4332 | This License is a kind of “copyleft”, which means that derivative
|
---|
4333 | works of the document must themselves be free in the same sense.
|
---|
4334 | It complements the GNU General Public License, which is a copyleft
|
---|
4335 | license designed for free software.
|
---|
4336 |
|
---|
4337 | We have designed this License in order to use it for manuals for
|
---|
4338 | free software, because free software needs free documentation: a
|
---|
4339 | free program should come with manuals providing the same freedoms
|
---|
4340 | that the software does. But this License is not limited to
|
---|
4341 | software manuals; it can be used for any textual work, regardless
|
---|
4342 | of subject matter or whether it is published as a printed book. We
|
---|
4343 | recommend this License principally for works whose purpose is
|
---|
4344 | instruction or reference.
|
---|
4345 |
|
---|
4346 | 1. APPLICABILITY AND DEFINITIONS
|
---|
4347 |
|
---|
4348 | This License applies to any manual or other work, in any medium,
|
---|
4349 | that contains a notice placed by the copyright holder saying it can
|
---|
4350 | be distributed under the terms of this License. Such a notice
|
---|
4351 | grants a world-wide, royalty-free license, unlimited in duration,
|
---|
4352 | to use that work under the conditions stated herein. The
|
---|
4353 | “Document”, below, refers to any such manual or work. Any member
|
---|
4354 | of the public is a licensee, and is addressed as “you”. You accept
|
---|
4355 | the license if you copy, modify or distribute the work in a way
|
---|
4356 | requiring permission under copyright law.
|
---|
4357 |
|
---|
4358 | A “Modified Version” of the Document means any work containing the
|
---|
4359 | Document or a portion of it, either copied verbatim, or with
|
---|
4360 | modifications and/or translated into another language.
|
---|
4361 |
|
---|
4362 | A “Secondary Section” is a named appendix or a front-matter section
|
---|
4363 | of the Document that deals exclusively with the relationship of the
|
---|
4364 | publishers or authors of the Document to the Document’s overall
|
---|
4365 | subject (or to related matters) and contains nothing that could
|
---|
4366 | fall directly within that overall subject. (Thus, if the Document
|
---|
4367 | is in part a textbook of mathematics, a Secondary Section may not
|
---|
4368 | explain any mathematics.) The relationship could be a matter of
|
---|
4369 | historical connection with the subject or with related matters, or
|
---|
4370 | of legal, commercial, philosophical, ethical or political position
|
---|
4371 | regarding them.
|
---|
4372 |
|
---|
4373 | The “Invariant Sections” are certain Secondary Sections whose
|
---|
4374 | titles are designated, as being those of Invariant Sections, in the
|
---|
4375 | notice that says that the Document is released under this License.
|
---|
4376 | If a section does not fit the above definition of Secondary then it
|
---|
4377 | is not allowed to be designated as Invariant. The Document may
|
---|
4378 | contain zero Invariant Sections. If the Document does not identify
|
---|
4379 | any Invariant Sections then there are none.
|
---|
4380 |
|
---|
4381 | The “Cover Texts” are certain short passages of text that are
|
---|
4382 | listed, as Front-Cover Texts or Back-Cover Texts, in the notice
|
---|
4383 | that says that the Document is released under this License. A
|
---|
4384 | Front-Cover Text may be at most 5 words, and a Back-Cover Text may
|
---|
4385 | be at most 25 words.
|
---|
4386 |
|
---|
4387 | A “Transparent” copy of the Document means a machine-readable copy,
|
---|
4388 | represented in a format whose specification is available to the
|
---|
4389 | general public, that is suitable for revising the document
|
---|
4390 | straightforwardly with generic text editors or (for images composed
|
---|
4391 | of pixels) generic paint programs or (for drawings) some widely
|
---|
4392 | available drawing editor, and that is suitable for input to text
|
---|
4393 | formatters or for automatic translation to a variety of formats
|
---|
4394 | suitable for input to text formatters. A copy made in an otherwise
|
---|
4395 | Transparent file format whose markup, or absence of markup, has
|
---|
4396 | been arranged to thwart or discourage subsequent modification by
|
---|
4397 | readers is not Transparent. An image format is not Transparent if
|
---|
4398 | used for any substantial amount of text. A copy that is not
|
---|
4399 | “Transparent” is called “Opaque”.
|
---|
4400 |
|
---|
4401 | Examples of suitable formats for Transparent copies include plain
|
---|
4402 | ASCII without markup, Texinfo input format, LaTeX input format,
|
---|
4403 | SGML or XML using a publicly available DTD, and standard-conforming
|
---|
4404 | simple HTML, PostScript or PDF designed for human modification.
|
---|
4405 | Examples of transparent image formats include PNG, XCF and JPG.
|
---|
4406 | Opaque formats include proprietary formats that can be read and
|
---|
4407 | edited only by proprietary word processors, SGML or XML for which
|
---|
4408 | the DTD and/or processing tools are not generally available, and
|
---|
4409 | the machine-generated HTML, PostScript or PDF produced by some word
|
---|
4410 | processors for output purposes only.
|
---|
4411 |
|
---|
4412 | The “Title Page” means, for a printed book, the title page itself,
|
---|
4413 | plus such following pages as are needed to hold, legibly, the
|
---|
4414 | material this License requires to appear in the title page. For
|
---|
4415 | works in formats which do not have any title page as such, “Title
|
---|
4416 | Page” means the text near the most prominent appearance of the
|
---|
4417 | work’s title, preceding the beginning of the body of the text.
|
---|
4418 |
|
---|
4419 | The “publisher” means any person or entity that distributes copies
|
---|
4420 | of the Document to the public.
|
---|
4421 |
|
---|
4422 | A section “Entitled XYZ” means a named subunit of the Document
|
---|
4423 | whose title either is precisely XYZ or contains XYZ in parentheses
|
---|
4424 | following text that translates XYZ in another language. (Here XYZ
|
---|
4425 | stands for a specific section name mentioned below, such as
|
---|
4426 | “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.)
|
---|
4427 | To “Preserve the Title” of such a section when you modify the
|
---|
4428 | Document means that it remains a section “Entitled XYZ” according
|
---|
4429 | to this definition.
|
---|
4430 |
|
---|
4431 | The Document may include Warranty Disclaimers next to the notice
|
---|
4432 | which states that this License applies to the Document. These
|
---|
4433 | Warranty Disclaimers are considered to be included by reference in
|
---|
4434 | this License, but only as regards disclaiming warranties: any other
|
---|
4435 | implication that these Warranty Disclaimers may have is void and
|
---|
4436 | has no effect on the meaning of this License.
|
---|
4437 |
|
---|
4438 | 2. VERBATIM COPYING
|
---|
4439 |
|
---|
4440 | You may copy and distribute the Document in any medium, either
|
---|
4441 | commercially or noncommercially, provided that this License, the
|
---|
4442 | copyright notices, and the license notice saying this License
|
---|
4443 | applies to the Document are reproduced in all copies, and that you
|
---|
4444 | add no other conditions whatsoever to those of this License. You
|
---|
4445 | may not use technical measures to obstruct or control the reading
|
---|
4446 | or further copying of the copies you make or distribute. However,
|
---|
4447 | you may accept compensation in exchange for copies. If you
|
---|
4448 | distribute a large enough number of copies you must also follow the
|
---|
4449 | conditions in section 3.
|
---|
4450 |
|
---|
4451 | You may also lend copies, under the same conditions stated above,
|
---|
4452 | and you may publicly display copies.
|
---|
4453 |
|
---|
4454 | 3. COPYING IN QUANTITY
|
---|
4455 |
|
---|
4456 | If you publish printed copies (or copies in media that commonly
|
---|
4457 | have printed covers) of the Document, numbering more than 100, and
|
---|
4458 | the Document’s license notice requires Cover Texts, you must
|
---|
4459 | enclose the copies in covers that carry, clearly and legibly, all
|
---|
4460 | these Cover Texts: Front-Cover Texts on the front cover, and
|
---|
4461 | Back-Cover Texts on the back cover. Both covers must also clearly
|
---|
4462 | and legibly identify you as the publisher of these copies. The
|
---|
4463 | front cover must present the full title with all words of the title
|
---|
4464 | equally prominent and visible. You may add other material on the
|
---|
4465 | covers in addition. Copying with changes limited to the covers, as
|
---|
4466 | long as they preserve the title of the Document and satisfy these
|
---|
4467 | conditions, can be treated as verbatim copying in other respects.
|
---|
4468 |
|
---|
4469 | If the required texts for either cover are too voluminous to fit
|
---|
4470 | legibly, you should put the first ones listed (as many as fit
|
---|
4471 | reasonably) on the actual cover, and continue the rest onto
|
---|
4472 | adjacent pages.
|
---|
4473 |
|
---|
4474 | If you publish or distribute Opaque copies of the Document
|
---|
4475 | numbering more than 100, you must either include a machine-readable
|
---|
4476 | Transparent copy along with each Opaque copy, or state in or with
|
---|
4477 | each Opaque copy a computer-network location from which the general
|
---|
4478 | network-using public has access to download using public-standard
|
---|
4479 | network protocols a complete Transparent copy of the Document, free
|
---|
4480 | of added material. If you use the latter option, you must take
|
---|
4481 | reasonably prudent steps, when you begin distribution of Opaque
|
---|
4482 | copies in quantity, to ensure that this Transparent copy will
|
---|
4483 | remain thus accessible at the stated location until at least one
|
---|
4484 | year after the last time you distribute an Opaque copy (directly or
|
---|
4485 | through your agents or retailers) of that edition to the public.
|
---|
4486 |
|
---|
4487 | It is requested, but not required, that you contact the authors of
|
---|
4488 | the Document well before redistributing any large number of copies,
|
---|
4489 | to give them a chance to provide you with an updated version of the
|
---|
4490 | Document.
|
---|
4491 |
|
---|
4492 | 4. MODIFICATIONS
|
---|
4493 |
|
---|
4494 | You may copy and distribute a Modified Version of the Document
|
---|
4495 | under the conditions of sections 2 and 3 above, provided that you
|
---|
4496 | release the Modified Version under precisely this License, with the
|
---|
4497 | Modified Version filling the role of the Document, thus licensing
|
---|
4498 | distribution and modification of the Modified Version to whoever
|
---|
4499 | possesses a copy of it. In addition, you must do these things in
|
---|
4500 | the Modified Version:
|
---|
4501 |
|
---|
4502 | A. Use in the Title Page (and on the covers, if any) a title
|
---|
4503 | distinct from that of the Document, and from those of previous
|
---|
4504 | versions (which should, if there were any, be listed in the
|
---|
4505 | History section of the Document). You may use the same title
|
---|
4506 | as a previous version if the original publisher of that
|
---|
4507 | version gives permission.
|
---|
4508 |
|
---|
4509 | B. List on the Title Page, as authors, one or more persons or
|
---|
4510 | entities responsible for authorship of the modifications in
|
---|
4511 | the Modified Version, together with at least five of the
|
---|
4512 | principal authors of the Document (all of its principal
|
---|
4513 | authors, if it has fewer than five), unless they release you
|
---|
4514 | from this requirement.
|
---|
4515 |
|
---|
4516 | C. State on the Title page the name of the publisher of the
|
---|
4517 | Modified Version, as the publisher.
|
---|
4518 |
|
---|
4519 | D. Preserve all the copyright notices of the Document.
|
---|
4520 |
|
---|
4521 | E. Add an appropriate copyright notice for your modifications
|
---|
4522 | adjacent to the other copyright notices.
|
---|
4523 |
|
---|
4524 | F. Include, immediately after the copyright notices, a license
|
---|
4525 | notice giving the public permission to use the Modified
|
---|
4526 | Version under the terms of this License, in the form shown in
|
---|
4527 | the Addendum below.
|
---|
4528 |
|
---|
4529 | G. Preserve in that license notice the full lists of Invariant
|
---|
4530 | Sections and required Cover Texts given in the Document’s
|
---|
4531 | license notice.
|
---|
4532 |
|
---|
4533 | H. Include an unaltered copy of this License.
|
---|
4534 |
|
---|
4535 | I. Preserve the section Entitled “History”, Preserve its Title,
|
---|
4536 | and add to it an item stating at least the title, year, new
|
---|
4537 | authors, and publisher of the Modified Version as given on the
|
---|
4538 | Title Page. If there is no section Entitled “History” in the
|
---|
4539 | Document, create one stating the title, year, authors, and
|
---|
4540 | publisher of the Document as given on its Title Page, then add
|
---|
4541 | an item describing the Modified Version as stated in the
|
---|
4542 | previous sentence.
|
---|
4543 |
|
---|
4544 | J. Preserve the network location, if any, given in the Document
|
---|
4545 | for public access to a Transparent copy of the Document, and
|
---|
4546 | likewise the network locations given in the Document for
|
---|
4547 | previous versions it was based on. These may be placed in the
|
---|
4548 | “History” section. You may omit a network location for a work
|
---|
4549 | that was published at least four years before the Document
|
---|
4550 | itself, or if the original publisher of the version it refers
|
---|
4551 | to gives permission.
|
---|
4552 |
|
---|
4553 | K. For any section Entitled “Acknowledgements” or “Dedications”,
|
---|
4554 | Preserve the Title of the section, and preserve in the section
|
---|
4555 | all the substance and tone of each of the contributor
|
---|
4556 | acknowledgements and/or dedications given therein.
|
---|
4557 |
|
---|
4558 | L. Preserve all the Invariant Sections of the Document, unaltered
|
---|
4559 | in their text and in their titles. Section numbers or the
|
---|
4560 | equivalent are not considered part of the section titles.
|
---|
4561 |
|
---|
4562 | M. Delete any section Entitled “Endorsements”. Such a section
|
---|
4563 | may not be included in the Modified Version.
|
---|
4564 |
|
---|
4565 | N. Do not retitle any existing section to be Entitled
|
---|
4566 | “Endorsements” or to conflict in title with any Invariant
|
---|
4567 | Section.
|
---|
4568 |
|
---|
4569 | O. Preserve any Warranty Disclaimers.
|
---|
4570 |
|
---|
4571 | If the Modified Version includes new front-matter sections or
|
---|
4572 | appendices that qualify as Secondary Sections and contain no
|
---|
4573 | material copied from the Document, you may at your option designate
|
---|
4574 | some or all of these sections as invariant. To do this, add their
|
---|
4575 | titles to the list of Invariant Sections in the Modified Version’s
|
---|
4576 | license notice. These titles must be distinct from any other
|
---|
4577 | section titles.
|
---|
4578 |
|
---|
4579 | You may add a section Entitled “Endorsements”, provided it contains
|
---|
4580 | nothing but endorsements of your Modified Version by various
|
---|
4581 | parties--for example, statements of peer review or that the text has
|
---|
4582 | been approved by an organization as the authoritative definition of
|
---|
4583 | a standard.
|
---|
4584 |
|
---|
4585 | You may add a passage of up to five words as a Front-Cover Text,
|
---|
4586 | and a passage of up to 25 words as a Back-Cover Text, to the end of
|
---|
4587 | the list of Cover Texts in the Modified Version. Only one passage
|
---|
4588 | of Front-Cover Text and one of Back-Cover Text may be added by (or
|
---|
4589 | through arrangements made by) any one entity. If the Document
|
---|
4590 | already includes a cover text for the same cover, previously added
|
---|
4591 | by you or by arrangement made by the same entity you are acting on
|
---|
4592 | behalf of, you may not add another; but you may replace the old
|
---|
4593 | one, on explicit permission from the previous publisher that added
|
---|
4594 | the old one.
|
---|
4595 |
|
---|
4596 | The author(s) and publisher(s) of the Document do not by this
|
---|
4597 | License give permission to use their names for publicity for or to
|
---|
4598 | assert or imply endorsement of any Modified Version.
|
---|
4599 |
|
---|
4600 | 5. COMBINING DOCUMENTS
|
---|
4601 |
|
---|
4602 | You may combine the Document with other documents released under
|
---|
4603 | this License, under the terms defined in section 4 above for
|
---|
4604 | modified versions, provided that you include in the combination all
|
---|
4605 | of the Invariant Sections of all of the original documents,
|
---|
4606 | unmodified, and list them all as Invariant Sections of your
|
---|
4607 | combined work in its license notice, and that you preserve all
|
---|
4608 | their Warranty Disclaimers.
|
---|
4609 |
|
---|
4610 | The combined work need only contain one copy of this License, and
|
---|
4611 | multiple identical Invariant Sections may be replaced with a single
|
---|
4612 | copy. If there are multiple Invariant Sections with the same name
|
---|
4613 | but different contents, make the title of each such section unique
|
---|
4614 | by adding at the end of it, in parentheses, the name of the
|
---|
4615 | original author or publisher of that section if known, or else a
|
---|
4616 | unique number. Make the same adjustment to the section titles in
|
---|
4617 | the list of Invariant Sections in the license notice of the
|
---|
4618 | combined work.
|
---|
4619 |
|
---|
4620 | In the combination, you must combine any sections Entitled
|
---|
4621 | “History” in the various original documents, forming one section
|
---|
4622 | Entitled “History”; likewise combine any sections Entitled
|
---|
4623 | “Acknowledgements”, and any sections Entitled “Dedications”. You
|
---|
4624 | must delete all sections Entitled “Endorsements.”
|
---|
4625 |
|
---|
4626 | 6. COLLECTIONS OF DOCUMENTS
|
---|
4627 |
|
---|
4628 | You may make a collection consisting of the Document and other
|
---|
4629 | documents released under this License, and replace the individual
|
---|
4630 | copies of this License in the various documents with a single copy
|
---|
4631 | that is included in the collection, provided that you follow the
|
---|
4632 | rules of this License for verbatim copying of each of the documents
|
---|
4633 | in all other respects.
|
---|
4634 |
|
---|
4635 | You may extract a single document from such a collection, and
|
---|
4636 | distribute it individually under this License, provided you insert
|
---|
4637 | a copy of this License into the extracted document, and follow this
|
---|
4638 | License in all other respects regarding verbatim copying of that
|
---|
4639 | document.
|
---|
4640 |
|
---|
4641 | 7. AGGREGATION WITH INDEPENDENT WORKS
|
---|
4642 |
|
---|
4643 | A compilation of the Document or its derivatives with other
|
---|
4644 | separate and independent documents or works, in or on a volume of a
|
---|
4645 | storage or distribution medium, is called an “aggregate” if the
|
---|
4646 | copyright resulting from the compilation is not used to limit the
|
---|
4647 | legal rights of the compilation’s users beyond what the individual
|
---|
4648 | works permit. When the Document is included in an aggregate, this
|
---|
4649 | License does not apply to the other works in the aggregate which
|
---|
4650 | are not themselves derivative works of the Document.
|
---|
4651 |
|
---|
4652 | If the Cover Text requirement of section 3 is applicable to these
|
---|
4653 | copies of the Document, then if the Document is less than one half
|
---|
4654 | of the entire aggregate, the Document’s Cover Texts may be placed
|
---|
4655 | on covers that bracket the Document within the aggregate, or the
|
---|
4656 | electronic equivalent of covers if the Document is in electronic
|
---|
4657 | form. Otherwise they must appear on printed covers that bracket
|
---|
4658 | the whole aggregate.
|
---|
4659 |
|
---|
4660 | 8. TRANSLATION
|
---|
4661 |
|
---|
4662 | Translation is considered a kind of modification, so you may
|
---|
4663 | distribute translations of the Document under the terms of section
|
---|
4664 | 4. Replacing Invariant Sections with translations requires special
|
---|
4665 | permission from their copyright holders, but you may include
|
---|
4666 | translations of some or all Invariant Sections in addition to the
|
---|
4667 | original versions of these Invariant Sections. You may include a
|
---|
4668 | translation of this License, and all the license notices in the
|
---|
4669 | Document, and any Warranty Disclaimers, provided that you also
|
---|
4670 | include the original English version of this License and the
|
---|
4671 | original versions of those notices and disclaimers. In case of a
|
---|
4672 | disagreement between the translation and the original version of
|
---|
4673 | this License or a notice or disclaimer, the original version will
|
---|
4674 | prevail.
|
---|
4675 |
|
---|
4676 | If a section in the Document is Entitled “Acknowledgements”,
|
---|
4677 | “Dedications”, or “History”, the requirement (section 4) to
|
---|
4678 | Preserve its Title (section 1) will typically require changing the
|
---|
4679 | actual title.
|
---|
4680 |
|
---|
4681 | 9. TERMINATION
|
---|
4682 |
|
---|
4683 | You may not copy, modify, sublicense, or distribute the Document
|
---|
4684 | except as expressly provided under this License. Any attempt
|
---|
4685 | otherwise to copy, modify, sublicense, or distribute it is void,
|
---|
4686 | and will automatically terminate your rights under this License.
|
---|
4687 |
|
---|
4688 | However, if you cease all violation of this License, then your
|
---|
4689 | license from a particular copyright holder is reinstated (a)
|
---|
4690 | provisionally, unless and until the copyright holder explicitly and
|
---|
4691 | finally terminates your license, and (b) permanently, if the
|
---|
4692 | copyright holder fails to notify you of the violation by some
|
---|
4693 | reasonable means prior to 60 days after the cessation.
|
---|
4694 |
|
---|
4695 | Moreover, your license from a particular copyright holder is
|
---|
4696 | reinstated permanently if the copyright holder notifies you of the
|
---|
4697 | violation by some reasonable means, this is the first time you have
|
---|
4698 | received notice of violation of this License (for any work) from
|
---|
4699 | that copyright holder, and you cure the violation prior to 30 days
|
---|
4700 | after your receipt of the notice.
|
---|
4701 |
|
---|
4702 | Termination of your rights under this section does not terminate
|
---|
4703 | the licenses of parties who have received copies or rights from you
|
---|
4704 | under this License. If your rights have been terminated and not
|
---|
4705 | permanently reinstated, receipt of a copy of some or all of the
|
---|
4706 | same material does not give you any rights to use it.
|
---|
4707 |
|
---|
4708 | 10. FUTURE REVISIONS OF THIS LICENSE
|
---|
4709 |
|
---|
4710 | The Free Software Foundation may publish new, revised versions of
|
---|
4711 | the GNU Free Documentation License from time to time. Such new
|
---|
4712 | versions will be similar in spirit to the present version, but may
|
---|
4713 | differ in detail to address new problems or concerns. See
|
---|
4714 | <https://www.gnu.org/licenses/>.
|
---|
4715 |
|
---|
4716 | Each version of the License is given a distinguishing version
|
---|
4717 | number. If the Document specifies that a particular numbered
|
---|
4718 | version of this License “or any later version” applies to it, you
|
---|
4719 | have the option of following the terms and conditions either of
|
---|
4720 | that specified version or of any later version that has been
|
---|
4721 | published (not as a draft) by the Free Software Foundation. If the
|
---|
4722 | Document does not specify a version number of this License, you may
|
---|
4723 | choose any version ever published (not as a draft) by the Free
|
---|
4724 | Software Foundation. If the Document specifies that a proxy can
|
---|
4725 | decide which future versions of this License can be used, that
|
---|
4726 | proxy’s public statement of acceptance of a version permanently
|
---|
4727 | authorizes you to choose that version for the Document.
|
---|
4728 |
|
---|
4729 | 11. RELICENSING
|
---|
4730 |
|
---|
4731 | “Massive Multiauthor Collaboration Site” (or “MMC Site”) means any
|
---|
4732 | World Wide Web server that publishes copyrightable works and also
|
---|
4733 | provides prominent facilities for anybody to edit those works. A
|
---|
4734 | public wiki that anybody can edit is an example of such a server.
|
---|
4735 | A “Massive Multiauthor Collaboration” (or “MMC”) contained in the
|
---|
4736 | site means any set of copyrightable works thus published on the MMC
|
---|
4737 | site.
|
---|
4738 |
|
---|
4739 | “CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0
|
---|
4740 | license published by Creative Commons Corporation, a not-for-profit
|
---|
4741 | corporation with a principal place of business in San Francisco,
|
---|
4742 | California, as well as future copyleft versions of that license
|
---|
4743 | published by that same organization.
|
---|
4744 |
|
---|
4745 | “Incorporate” means to publish or republish a Document, in whole or
|
---|
4746 | in part, as part of another Document.
|
---|
4747 |
|
---|
4748 | An MMC is “eligible for relicensing” if it is licensed under this
|
---|
4749 | License, and if all works that were first published under this
|
---|
4750 | License somewhere other than this MMC, and subsequently
|
---|
4751 | incorporated in whole or in part into the MMC, (1) had no cover
|
---|
4752 | texts or invariant sections, and (2) were thus incorporated prior
|
---|
4753 | to November 1, 2008.
|
---|
4754 |
|
---|
4755 | The operator of an MMC Site may republish an MMC contained in the
|
---|
4756 | site under CC-BY-SA on the same site at any time before August 1,
|
---|
4757 | 2009, provided the MMC is eligible for relicensing.
|
---|
4758 |
|
---|
4759 | ADDENDUM: How to use this License for your documents
|
---|
4760 | ====================================================
|
---|
4761 |
|
---|
4762 | To use this License in a document you have written, include a copy of
|
---|
4763 | the License in the document and put the following copyright and license
|
---|
4764 | notices just after the title page:
|
---|
4765 |
|
---|
4766 | Copyright (C) YEAR YOUR NAME.
|
---|
4767 | Permission is granted to copy, distribute and/or modify this document
|
---|
4768 | under the terms of the GNU Free Documentation License, Version 1.3
|
---|
4769 | or any later version published by the Free Software Foundation;
|
---|
4770 | with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
|
---|
4771 | Texts. A copy of the license is included in the section entitled ``GNU
|
---|
4772 | Free Documentation License''.
|
---|
4773 |
|
---|
4774 | If you have Invariant Sections, Front-Cover Texts and Back-Cover
|
---|
4775 | Texts, replace the “with...Texts.” line with this:
|
---|
4776 |
|
---|
4777 | with the Invariant Sections being LIST THEIR TITLES, with
|
---|
4778 | the Front-Cover Texts being LIST, and with the Back-Cover Texts
|
---|
4779 | being LIST.
|
---|
4780 |
|
---|
4781 | If you have Invariant Sections without Cover Texts, or some other
|
---|
4782 | combination of the three, merge those two alternatives to suit the
|
---|
4783 | situation.
|
---|
4784 |
|
---|
4785 | If your document contains nontrivial examples of program code, we
|
---|
4786 | recommend releasing these examples in parallel under your choice of free
|
---|
4787 | software license, such as the GNU General Public License, to permit
|
---|
4788 | their use in free software.
|
---|
4789 |
|
---|
4790 |
|
---|
4791 | File: sed.info, Node: Concept Index, Next: Command and Option Index, Prev: GNU Free Documentation License, Up: Top
|
---|
4792 |
|
---|
4793 | Concept Index
|
---|
4794 | *************
|
---|
4795 |
|
---|
4796 | This is a general index of all issues discussed in this manual, with the
|
---|
4797 | exception of the ‘sed’ commands and command-line options.
|
---|
4798 |
|
---|
4799 | [index]
|
---|
4800 | * Menu:
|
---|
4801 |
|
---|
4802 | * -e, example: Overview. (line 46)
|
---|
4803 | * -e, example <1>: sed script overview. (line 37)
|
---|
4804 | * --expression, example: Overview. (line 46)
|
---|
4805 | * -f, example: Overview. (line 46)
|
---|
4806 | * -f, example <1>: sed script overview. (line 37)
|
---|
4807 | * --file, example: Overview. (line 46)
|
---|
4808 | * -i, example: Overview. (line 26)
|
---|
4809 | * -n, example: Overview. (line 33)
|
---|
4810 | * -s, example: Overview. (line 40)
|
---|
4811 | * 0 address: Reporting Bugs. (line 114)
|
---|
4812 | * ;, command separator: sed script overview. (line 37)
|
---|
4813 | * a, and semicolons: sed script overview. (line 56)
|
---|
4814 | * Additional reading about sed: Other Resources. (line 13)
|
---|
4815 | * ADDR1,+N: Range Addresses. (line 31)
|
---|
4816 | * ADDR1,~N: Range Addresses. (line 31)
|
---|
4817 | * address range, example: sed script overview. (line 23)
|
---|
4818 | * Address, as a regular expression: Regexp Addresses. (line 13)
|
---|
4819 | * Address, last line: Numeric Addresses. (line 13)
|
---|
4820 | * Address, numeric: Numeric Addresses. (line 8)
|
---|
4821 | * addresses, excluding: Addresses overview. (line 33)
|
---|
4822 | * Addresses, in sed scripts: Numeric Addresses. (line 6)
|
---|
4823 | * addresses, negating: Addresses overview. (line 33)
|
---|
4824 | * addresses, numeric: Addresses overview. (line 6)
|
---|
4825 | * addresses, range: Addresses overview. (line 26)
|
---|
4826 | * addresses, regular expression: Addresses overview. (line 20)
|
---|
4827 | * addresses, syntax: sed script overview. (line 13)
|
---|
4828 | * alphabetic characters: Character Classes and Bracket Expressions.
|
---|
4829 | (line 49)
|
---|
4830 | * alphanumeric characters: Character Classes and Bracket Expressions.
|
---|
4831 | (line 44)
|
---|
4832 | * Append hold space to pattern space: Other Commands. (line 288)
|
---|
4833 | * Append next input line to pattern space: Other Commands. (line 261)
|
---|
4834 | * Append pattern space to hold space: Other Commands. (line 280)
|
---|
4835 | * Appending text after a line: Other Commands. (line 45)
|
---|
4836 | * b, joining lines with: Branching and flow control.
|
---|
4837 | (line 150)
|
---|
4838 | * b, versus t: Branching and flow control.
|
---|
4839 | (line 150)
|
---|
4840 | * back-reference: Back-references and Subexpressions.
|
---|
4841 | (line 6)
|
---|
4842 | * Backreferences, in regular expressions: The "s" Command. (line 18)
|
---|
4843 | * blank characters: Character Classes and Bracket Expressions.
|
---|
4844 | (line 54)
|
---|
4845 | * bracket expression: Character Classes and Bracket Expressions.
|
---|
4846 | (line 6)
|
---|
4847 | * Branch to a label, if s/// failed: Extended Commands. (line 63)
|
---|
4848 | * Branch to a label, if s/// succeeded: Programming Commands.
|
---|
4849 | (line 22)
|
---|
4850 | * Branch to a label, unconditionally: Programming Commands.
|
---|
4851 | (line 18)
|
---|
4852 | * branching and n, N: Branching and flow control.
|
---|
4853 | (line 105)
|
---|
4854 | * branching, infinite loop: Branching and flow control.
|
---|
4855 | (line 95)
|
---|
4856 | * branching, joining lines: Branching and flow control.
|
---|
4857 | (line 150)
|
---|
4858 | * Buffer spaces, pattern and hold: Execution Cycle. (line 6)
|
---|
4859 | * Bugs, reporting: Reporting Bugs. (line 6)
|
---|
4860 | * c, and semicolons: sed script overview. (line 56)
|
---|
4861 | * case insensitive, regular expression: Regexp Addresses. (line 47)
|
---|
4862 | * Case-insensitive matching: The "s" Command. (line 117)
|
---|
4863 | * Caveat -- #n on first line: Common Commands. (line 20)
|
---|
4864 | * character class: Character Classes and Bracket Expressions.
|
---|
4865 | (line 6)
|
---|
4866 | * character classes: Character Classes and Bracket Expressions.
|
---|
4867 | (line 43)
|
---|
4868 | * classes of characters: Character Classes and Bracket Expressions.
|
---|
4869 | (line 43)
|
---|
4870 | * Command groups: Common Commands. (line 91)
|
---|
4871 | * Comments, in scripts: Common Commands. (line 12)
|
---|
4872 | * Conditional branch: Programming Commands.
|
---|
4873 | (line 22)
|
---|
4874 | * Conditional branch <1>: Extended Commands. (line 63)
|
---|
4875 | * control characters: Character Classes and Bracket Expressions.
|
---|
4876 | (line 57)
|
---|
4877 | * Copy hold space into pattern space: Other Commands. (line 284)
|
---|
4878 | * Copy pattern space into hold space: Other Commands. (line 276)
|
---|
4879 | * cycle, restarting: Branching and flow control.
|
---|
4880 | (line 75)
|
---|
4881 | * d, example: sed script overview. (line 23)
|
---|
4882 | * Delete first line from pattern space: Other Commands. (line 255)
|
---|
4883 | * digit characters: Character Classes and Bracket Expressions.
|
---|
4884 | (line 62)
|
---|
4885 | * Disabling autoprint, from command line: Command-Line Options.
|
---|
4886 | (line 23)
|
---|
4887 | * empty regular expression: Regexp Addresses. (line 22)
|
---|
4888 | * Emptying pattern space: Extended Commands. (line 85)
|
---|
4889 | * Emptying pattern space <1>: Reporting Bugs. (line 143)
|
---|
4890 | * Evaluate Bourne-shell commands: Extended Commands. (line 12)
|
---|
4891 | * Evaluate Bourne-shell commands, after substitution: The "s" Command.
|
---|
4892 | (line 108)
|
---|
4893 | * example, address range: sed script overview. (line 23)
|
---|
4894 | * example, regular expression: sed script overview. (line 28)
|
---|
4895 | * Exchange hold space with pattern space: Other Commands. (line 292)
|
---|
4896 | * Excluding lines: Addresses overview. (line 33)
|
---|
4897 | * exit status: Exit status. (line 6)
|
---|
4898 | * exit status, example: Exit status. (line 25)
|
---|
4899 | * Extended regular expressions, choosing: Command-Line Options.
|
---|
4900 | (line 135)
|
---|
4901 | * Extended regular expressions, syntax: ERE syntax. (line 6)
|
---|
4902 | * File name, printing: Extended Commands. (line 30)
|
---|
4903 | * Files to be processed as input: Command-Line Options.
|
---|
4904 | (line 181)
|
---|
4905 | * Flow of control in scripts: Programming Commands.
|
---|
4906 | (line 11)
|
---|
4907 | * Global substitution: The "s" Command. (line 74)
|
---|
4908 | * GNU extensions, /dev/stderr file: The "s" Command. (line 101)
|
---|
4909 | * GNU extensions, /dev/stderr file <1>: Other Commands. (line 244)
|
---|
4910 | * GNU extensions, /dev/stdin file: Other Commands. (line 227)
|
---|
4911 | * GNU extensions, /dev/stdin file <1>: Extended Commands. (line 53)
|
---|
4912 | * GNU extensions, /dev/stdout file: Command-Line Options.
|
---|
4913 | (line 189)
|
---|
4914 | * GNU extensions, /dev/stdout file <1>: The "s" Command. (line 101)
|
---|
4915 | * GNU extensions, /dev/stdout file <2>: Other Commands. (line 244)
|
---|
4916 | * GNU extensions, 0 address: Range Addresses. (line 31)
|
---|
4917 | * GNU extensions, 0 address <1>: Reporting Bugs. (line 114)
|
---|
4918 | * GNU extensions, 0,ADDR2 addressing: Range Addresses. (line 31)
|
---|
4919 | * GNU extensions, ADDR1,+N addressing: Range Addresses. (line 31)
|
---|
4920 | * GNU extensions, ADDR1,~N addressing: Range Addresses. (line 31)
|
---|
4921 | * GNU extensions, branch if s/// failed: Extended Commands. (line 63)
|
---|
4922 | * GNU extensions, case modifiers in s commands: The "s" Command.
|
---|
4923 | (line 29)
|
---|
4924 | * GNU extensions, checking for their presence: Extended Commands.
|
---|
4925 | (line 69)
|
---|
4926 | * GNU extensions, debug: Command-Line Options.
|
---|
4927 | (line 29)
|
---|
4928 | * GNU extensions, disabling: Command-Line Options.
|
---|
4929 | (line 102)
|
---|
4930 | * GNU extensions, emptying pattern space: Extended Commands. (line 85)
|
---|
4931 | * GNU extensions, emptying pattern space <1>: Reporting Bugs. (line 143)
|
---|
4932 | * GNU extensions, evaluating Bourne-shell commands: The "s" Command.
|
---|
4933 | (line 108)
|
---|
4934 | * GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands.
|
---|
4935 | (line 12)
|
---|
4936 | * GNU extensions, extended regular expressions: Command-Line Options.
|
---|
4937 | (line 135)
|
---|
4938 | * GNU extensions, g and NUMBER modifier: The "s" Command. (line 80)
|
---|
4939 | * GNU extensions, I modifier: The "s" Command. (line 117)
|
---|
4940 | * GNU extensions, I modifier <1>: Regexp Addresses. (line 47)
|
---|
4941 | * GNU extensions, in-place editing: Command-Line Options.
|
---|
4942 | (line 56)
|
---|
4943 | * GNU extensions, in-place editing <1>: Reporting Bugs. (line 95)
|
---|
4944 | * GNU extensions, M modifier: The "s" Command. (line 122)
|
---|
4945 | * GNU extensions, M modifier <1>: Regexp Addresses. (line 75)
|
---|
4946 | * GNU extensions, modifiers and the empty regular expression: Regexp Addresses.
|
---|
4947 | (line 22)
|
---|
4948 | * GNU extensions, N~M addresses: Numeric Addresses. (line 18)
|
---|
4949 | * GNU extensions, quitting silently: Extended Commands. (line 36)
|
---|
4950 | * GNU extensions, R command: Extended Commands. (line 53)
|
---|
4951 | * GNU extensions, reading a file a line at a time: Extended Commands.
|
---|
4952 | (line 53)
|
---|
4953 | * GNU extensions, returning an exit code: Common Commands. (line 28)
|
---|
4954 | * GNU extensions, returning an exit code <1>: Extended Commands.
|
---|
4955 | (line 36)
|
---|
4956 | * GNU extensions, setting line length: Other Commands. (line 207)
|
---|
4957 | * GNU extensions, special escapes: Escapes. (line 6)
|
---|
4958 | * GNU extensions, special escapes <1>: Reporting Bugs. (line 88)
|
---|
4959 | * GNU extensions, special two-address forms: Range Addresses. (line 31)
|
---|
4960 | * GNU extensions, subprocesses: The "s" Command. (line 108)
|
---|
4961 | * GNU extensions, subprocesses <1>: Extended Commands. (line 12)
|
---|
4962 | * GNU extensions, to basic regular expressions: BRE syntax. (line 13)
|
---|
4963 | * GNU extensions, to basic regular expressions <1>: BRE syntax.
|
---|
4964 | (line 59)
|
---|
4965 | * GNU extensions, to basic regular expressions <2>: BRE syntax.
|
---|
4966 | (line 62)
|
---|
4967 | * GNU extensions, to basic regular expressions <3>: BRE syntax.
|
---|
4968 | (line 77)
|
---|
4969 | * GNU extensions, to basic regular expressions <4>: BRE syntax.
|
---|
4970 | (line 87)
|
---|
4971 | * GNU extensions, to basic regular expressions <5>: Reporting Bugs.
|
---|
4972 | (line 61)
|
---|
4973 | * GNU extensions, two addresses supported by most commands: Other Commands.
|
---|
4974 | (line 61)
|
---|
4975 | * GNU extensions, two addresses supported by most commands <1>: Other Commands.
|
---|
4976 | (line 115)
|
---|
4977 | * GNU extensions, two addresses supported by most commands <2>: Other Commands.
|
---|
4978 | (line 204)
|
---|
4979 | * GNU extensions, two addresses supported by most commands <3>: Other Commands.
|
---|
4980 | (line 236)
|
---|
4981 | * GNU extensions, unlimited line length: Limitations. (line 6)
|
---|
4982 | * GNU extensions, writing first line to a file: Extended Commands.
|
---|
4983 | (line 80)
|
---|
4984 | * Goto, in scripts: Programming Commands.
|
---|
4985 | (line 18)
|
---|
4986 | * graphic characters: Character Classes and Bracket Expressions.
|
---|
4987 | (line 65)
|
---|
4988 | * Greedy regular expression matching: BRE syntax. (line 113)
|
---|
4989 | * Grouping commands: Common Commands. (line 91)
|
---|
4990 | * hexadecimal digits: Character Classes and Bracket Expressions.
|
---|
4991 | (line 88)
|
---|
4992 | * Hold space, appending from pattern space: Other Commands. (line 280)
|
---|
4993 | * Hold space, appending to pattern space: Other Commands. (line 288)
|
---|
4994 | * Hold space, copy into pattern space: Other Commands. (line 284)
|
---|
4995 | * Hold space, copying pattern space into: Other Commands. (line 276)
|
---|
4996 | * Hold space, definition: Execution Cycle. (line 6)
|
---|
4997 | * Hold space, exchange with pattern space: Other Commands. (line 292)
|
---|
4998 | * i, and semicolons: sed script overview. (line 56)
|
---|
4999 | * In-place editing: Reporting Bugs. (line 95)
|
---|
5000 | * In-place editing, activating: Command-Line Options.
|
---|
5001 | (line 56)
|
---|
5002 | * In-place editing, Perl-style backup file names: Command-Line Options.
|
---|
5003 | (line 67)
|
---|
5004 | * infinite loop, branching: Branching and flow control.
|
---|
5005 | (line 95)
|
---|
5006 | * Inserting text before a line: Other Commands. (line 104)
|
---|
5007 | * joining lines with branching: Branching and flow control.
|
---|
5008 | (line 150)
|
---|
5009 | * joining quoted-printable lines: Branching and flow control.
|
---|
5010 | (line 150)
|
---|
5011 | * labels: Branching and flow control.
|
---|
5012 | (line 75)
|
---|
5013 | * Labels, in scripts: Programming Commands.
|
---|
5014 | (line 14)
|
---|
5015 | * Last line, selecting: Numeric Addresses. (line 13)
|
---|
5016 | * Line length, setting: Command-Line Options.
|
---|
5017 | (line 97)
|
---|
5018 | * Line length, setting <1>: Other Commands. (line 207)
|
---|
5019 | * Line number, printing: Other Commands. (line 194)
|
---|
5020 | * Line selection: Numeric Addresses. (line 6)
|
---|
5021 | * Line, selecting by number: Numeric Addresses. (line 8)
|
---|
5022 | * Line, selecting by regular expression match: Regexp Addresses.
|
---|
5023 | (line 13)
|
---|
5024 | * Line, selecting last: Numeric Addresses. (line 13)
|
---|
5025 | * List pattern space: Other Commands. (line 207)
|
---|
5026 | * lower-case letters: Character Classes and Bracket Expressions.
|
---|
5027 | (line 68)
|
---|
5028 | * Mixing g and NUMBER modifiers in the s command: The "s" Command.
|
---|
5029 | (line 80)
|
---|
5030 | * multiple files: Overview. (line 40)
|
---|
5031 | * multiple sed commands: sed script overview. (line 37)
|
---|
5032 | * n, and branching: Branching and flow control.
|
---|
5033 | (line 105)
|
---|
5034 | * N, and branching: Branching and flow control.
|
---|
5035 | (line 105)
|
---|
5036 | * named character classes: Character Classes and Bracket Expressions.
|
---|
5037 | (line 43)
|
---|
5038 | * newline, command separator: sed script overview. (line 37)
|
---|
5039 | * Next input line, append to pattern space: Other Commands. (line 261)
|
---|
5040 | * Next input line, replace pattern space with: Common Commands.
|
---|
5041 | (line 61)
|
---|
5042 | * Non-bugs, 0 address: Reporting Bugs. (line 114)
|
---|
5043 | * Non-bugs, in-place editing: Reporting Bugs. (line 95)
|
---|
5044 | * Non-bugs, localization-related: Reporting Bugs. (line 124)
|
---|
5045 | * Non-bugs, localization-related <1>: Reporting Bugs. (line 143)
|
---|
5046 | * Non-bugs, N command on the last line: Reporting Bugs. (line 30)
|
---|
5047 | * Non-bugs, regex syntax clashes: Reporting Bugs. (line 61)
|
---|
5048 | * numeric addresses: Addresses overview. (line 6)
|
---|
5049 | * numeric characters: Character Classes and Bracket Expressions.
|
---|
5050 | (line 62)
|
---|
5051 | * omitting labels: Branching and flow control.
|
---|
5052 | (line 75)
|
---|
5053 | * output: Overview. (line 26)
|
---|
5054 | * output, suppressing: Overview. (line 33)
|
---|
5055 | * p, example: Overview. (line 33)
|
---|
5056 | * paragraphs, processing: Multiline techniques.
|
---|
5057 | (line 53)
|
---|
5058 | * parameters, script: Overview. (line 46)
|
---|
5059 | * Parenthesized substrings: The "s" Command. (line 18)
|
---|
5060 | * Pattern space, definition: Execution Cycle. (line 6)
|
---|
5061 | * Portability, comments: Common Commands. (line 15)
|
---|
5062 | * Portability, line length limitations: Limitations. (line 6)
|
---|
5063 | * Portability, N command on the last line: Reporting Bugs. (line 30)
|
---|
5064 | * POSIXLY_CORRECT behavior, bracket expressions: Character Classes and Bracket Expressions.
|
---|
5065 | (line 112)
|
---|
5066 | * POSIXLY_CORRECT behavior, enabling: Command-Line Options.
|
---|
5067 | (line 105)
|
---|
5068 | * POSIXLY_CORRECT behavior, escapes: Escapes. (line 11)
|
---|
5069 | * POSIXLY_CORRECT behavior, N command: Reporting Bugs. (line 56)
|
---|
5070 | * Print first line from pattern space: Other Commands. (line 273)
|
---|
5071 | * printable characters: Character Classes and Bracket Expressions.
|
---|
5072 | (line 72)
|
---|
5073 | * Printing file name: Extended Commands. (line 30)
|
---|
5074 | * Printing line number: Other Commands. (line 194)
|
---|
5075 | * Printing text unambiguously: Other Commands. (line 207)
|
---|
5076 | * processing paragraphs: Multiline techniques.
|
---|
5077 | (line 53)
|
---|
5078 | * punctuation characters: Character Classes and Bracket Expressions.
|
---|
5079 | (line 75)
|
---|
5080 | * Q, example: Exit status. (line 25)
|
---|
5081 | * q, example: sed script overview. (line 28)
|
---|
5082 | * Quitting: Common Commands. (line 28)
|
---|
5083 | * Quitting <1>: Extended Commands. (line 36)
|
---|
5084 | * quoted-printable lines, joining: Branching and flow control.
|
---|
5085 | (line 150)
|
---|
5086 | * range addresses: Addresses overview. (line 26)
|
---|
5087 | * range expression: Character Classes and Bracket Expressions.
|
---|
5088 | (line 18)
|
---|
5089 | * Range of lines: Range Addresses. (line 6)
|
---|
5090 | * Range with start address of zero: Range Addresses. (line 31)
|
---|
5091 | * Read next input line: Common Commands. (line 61)
|
---|
5092 | * Read text from a file: Other Commands. (line 219)
|
---|
5093 | * Read text from a file <1>: Extended Commands. (line 53)
|
---|
5094 | * regex addresses and input lines: Regexp Addresses. (line 84)
|
---|
5095 | * regex addresses and pattern space: Regexp Addresses. (line 84)
|
---|
5096 | * regular expression addresses: Addresses overview. (line 20)
|
---|
5097 | * regular expression, example: sed script overview. (line 28)
|
---|
5098 | * Replace hold space with copy of pattern space: Other Commands.
|
---|
5099 | (line 276)
|
---|
5100 | * Replace pattern space with copy of hold space: Other Commands.
|
---|
5101 | (line 284)
|
---|
5102 | * Replacing all text matching regexp in a line: The "s" Command.
|
---|
5103 | (line 74)
|
---|
5104 | * Replacing only Nth match of regexp in a line: The "s" Command.
|
---|
5105 | (line 78)
|
---|
5106 | * Replacing selected lines with other text: Other Commands. (line 157)
|
---|
5107 | * Requiring GNU sed: Extended Commands. (line 69)
|
---|
5108 | * restarting a cycle: Branching and flow control.
|
---|
5109 | (line 75)
|
---|
5110 | * Sandbox mode: Command-Line Options.
|
---|
5111 | (line 157)
|
---|
5112 | * script parameter: Overview. (line 46)
|
---|
5113 | * Script structure: sed script overview. (line 6)
|
---|
5114 | * Script, from a file: Command-Line Options.
|
---|
5115 | (line 51)
|
---|
5116 | * Script, from command line: Command-Line Options.
|
---|
5117 | (line 46)
|
---|
5118 | * sed commands syntax: sed script overview. (line 13)
|
---|
5119 | * sed commands, multiple: sed script overview. (line 37)
|
---|
5120 | * sed script structure: sed script overview. (line 6)
|
---|
5121 | * Selecting lines to process: Numeric Addresses. (line 6)
|
---|
5122 | * Selecting non-matching lines: Addresses overview. (line 33)
|
---|
5123 | * semicolons, command separator: sed script overview. (line 37)
|
---|
5124 | * Several lines, selecting: Range Addresses. (line 6)
|
---|
5125 | * Slash character, in regular expressions: Regexp Addresses. (line 32)
|
---|
5126 | * space characters: Character Classes and Bracket Expressions.
|
---|
5127 | (line 80)
|
---|
5128 | * Spaces, pattern and hold: Execution Cycle. (line 6)
|
---|
5129 | * Special addressing forms: Range Addresses. (line 31)
|
---|
5130 | * standard input: Overview. (line 18)
|
---|
5131 | * Standard input, processing as input: Command-Line Options.
|
---|
5132 | (line 183)
|
---|
5133 | * standard output: Overview. (line 26)
|
---|
5134 | * stdin: Overview. (line 18)
|
---|
5135 | * stdout: Overview. (line 26)
|
---|
5136 | * Stream editor: Introduction. (line 6)
|
---|
5137 | * subexpression: Back-references and Subexpressions.
|
---|
5138 | (line 6)
|
---|
5139 | * Subprocesses: The "s" Command. (line 108)
|
---|
5140 | * Subprocesses <1>: Extended Commands. (line 12)
|
---|
5141 | * Substitution of text, options: The "s" Command. (line 70)
|
---|
5142 | * suppressing output: Overview. (line 33)
|
---|
5143 | * syntax, addresses: sed script overview. (line 13)
|
---|
5144 | * syntax, sed commands: sed script overview. (line 13)
|
---|
5145 | * t, joining lines with: Branching and flow control.
|
---|
5146 | (line 150)
|
---|
5147 | * t, versus b: Branching and flow control.
|
---|
5148 | (line 150)
|
---|
5149 | * Text, appending: Other Commands. (line 45)
|
---|
5150 | * Text, deleting: Common Commands. (line 44)
|
---|
5151 | * Text, insertion: Other Commands. (line 104)
|
---|
5152 | * Text, printing: Common Commands. (line 52)
|
---|
5153 | * Text, printing after substitution: The "s" Command. (line 88)
|
---|
5154 | * Text, writing to a file after substitution: The "s" Command.
|
---|
5155 | (line 101)
|
---|
5156 | * Transliteration: Other Commands. (line 11)
|
---|
5157 | * Unbuffered I/O, choosing: Command-Line Options.
|
---|
5158 | (line 164)
|
---|
5159 | * upper-case letters: Character Classes and Bracket Expressions.
|
---|
5160 | (line 84)
|
---|
5161 | * Usage summary, printing: Command-Line Options.
|
---|
5162 | (line 17)
|
---|
5163 | * Version, printing: Command-Line Options.
|
---|
5164 | (line 13)
|
---|
5165 | * whitespace characters: Character Classes and Bracket Expressions.
|
---|
5166 | (line 80)
|
---|
5167 | * Working on separate files: Command-Line Options.
|
---|
5168 | (line 148)
|
---|
5169 | * Write first line to a file: Extended Commands. (line 80)
|
---|
5170 | * Write to a file: Other Commands. (line 244)
|
---|
5171 | * xdigit class: Character Classes and Bracket Expressions.
|
---|
5172 | (line 88)
|
---|
5173 | * Zero Address: Zero Address. (line 6)
|
---|
5174 | * Zero, as range start address: Range Addresses. (line 31)
|
---|
5175 |
|
---|
5176 |
|
---|
5177 | File: sed.info, Node: Command and Option Index, Prev: Concept Index, Up: Top
|
---|
5178 |
|
---|
5179 | Command and Option Index
|
---|
5180 | ************************
|
---|
5181 |
|
---|
5182 | This is an alphabetical list of all ‘sed’ commands and command-line
|
---|
5183 | options.
|
---|
5184 |
|
---|
5185 | [index]
|
---|
5186 | * Menu:
|
---|
5187 |
|
---|
5188 | * # (comments): Common Commands. (line 12)
|
---|
5189 | * --binary: Command-Line Options.
|
---|
5190 | (line 114)
|
---|
5191 | * --debug: Command-Line Options.
|
---|
5192 | (line 29)
|
---|
5193 | * --expression: Command-Line Options.
|
---|
5194 | (line 46)
|
---|
5195 | * --file: Command-Line Options.
|
---|
5196 | (line 51)
|
---|
5197 | * --follow-symlinks: Command-Line Options.
|
---|
5198 | (line 125)
|
---|
5199 | * --help: Command-Line Options.
|
---|
5200 | (line 17)
|
---|
5201 | * --in-place: Command-Line Options.
|
---|
5202 | (line 56)
|
---|
5203 | * --line-length: Command-Line Options.
|
---|
5204 | (line 97)
|
---|
5205 | * --null-data: Command-Line Options.
|
---|
5206 | (line 172)
|
---|
5207 | * --posix: Command-Line Options.
|
---|
5208 | (line 102)
|
---|
5209 | * --quiet: Command-Line Options.
|
---|
5210 | (line 23)
|
---|
5211 | * --regexp-extended: Command-Line Options.
|
---|
5212 | (line 135)
|
---|
5213 | * --sandbox: Command-Line Options.
|
---|
5214 | (line 157)
|
---|
5215 | * --separate: Command-Line Options.
|
---|
5216 | (line 148)
|
---|
5217 | * --silent: Command-Line Options.
|
---|
5218 | (line 23)
|
---|
5219 | * --unbuffered: Command-Line Options.
|
---|
5220 | (line 164)
|
---|
5221 | * --version: Command-Line Options.
|
---|
5222 | (line 13)
|
---|
5223 | * --zero-terminated: Command-Line Options.
|
---|
5224 | (line 172)
|
---|
5225 | * -b: Command-Line Options.
|
---|
5226 | (line 114)
|
---|
5227 | * -e: Command-Line Options.
|
---|
5228 | (line 46)
|
---|
5229 | * -E: Command-Line Options.
|
---|
5230 | (line 135)
|
---|
5231 | * -f: Command-Line Options.
|
---|
5232 | (line 51)
|
---|
5233 | * -i: Command-Line Options.
|
---|
5234 | (line 56)
|
---|
5235 | * -l: Command-Line Options.
|
---|
5236 | (line 97)
|
---|
5237 | * -n: Command-Line Options.
|
---|
5238 | (line 23)
|
---|
5239 | * -n, forcing from within a script: Common Commands. (line 20)
|
---|
5240 | * -r: Command-Line Options.
|
---|
5241 | (line 135)
|
---|
5242 | * -s: Command-Line Options.
|
---|
5243 | (line 148)
|
---|
5244 | * -u: Command-Line Options.
|
---|
5245 | (line 164)
|
---|
5246 | * -z: Command-Line Options.
|
---|
5247 | (line 172)
|
---|
5248 | * : (label) command: Programming Commands.
|
---|
5249 | (line 14)
|
---|
5250 | * = (print line number) command: Other Commands. (line 194)
|
---|
5251 | * {} command grouping: Common Commands. (line 91)
|
---|
5252 | * a (append text lines) command: Other Commands. (line 45)
|
---|
5253 | * alnum character class: Character Classes and Bracket Expressions.
|
---|
5254 | (line 44)
|
---|
5255 | * alpha character class: Character Classes and Bracket Expressions.
|
---|
5256 | (line 49)
|
---|
5257 | * b (branch) command: Programming Commands.
|
---|
5258 | (line 18)
|
---|
5259 | * blank character class: Character Classes and Bracket Expressions.
|
---|
5260 | (line 54)
|
---|
5261 | * c (change to text lines) command: Other Commands. (line 157)
|
---|
5262 | * cntrl character class: Character Classes and Bracket Expressions.
|
---|
5263 | (line 57)
|
---|
5264 | * D (delete first line) command: Other Commands. (line 255)
|
---|
5265 | * d (delete) command: Common Commands. (line 44)
|
---|
5266 | * digit character class: Character Classes and Bracket Expressions.
|
---|
5267 | (line 62)
|
---|
5268 | * e (evaluate) command: Extended Commands. (line 12)
|
---|
5269 | * F (File name) command: Extended Commands. (line 30)
|
---|
5270 | * G (appending Get) command: Other Commands. (line 288)
|
---|
5271 | * g (get) command: Other Commands. (line 284)
|
---|
5272 | * graph character class: Character Classes and Bracket Expressions.
|
---|
5273 | (line 65)
|
---|
5274 | * H (append Hold) command: Other Commands. (line 280)
|
---|
5275 | * h (hold) command: Other Commands. (line 276)
|
---|
5276 | * i (insert text lines) command: Other Commands. (line 104)
|
---|
5277 | * l (list unambiguously) command: Other Commands. (line 207)
|
---|
5278 | * lower character class: Character Classes and Bracket Expressions.
|
---|
5279 | (line 68)
|
---|
5280 | * N (append Next line) command: Other Commands. (line 261)
|
---|
5281 | * n (next-line) command: Common Commands. (line 61)
|
---|
5282 | * P (print first line) command: Other Commands. (line 273)
|
---|
5283 | * p (print) command: Common Commands. (line 52)
|
---|
5284 | * print character class: Character Classes and Bracket Expressions.
|
---|
5285 | (line 72)
|
---|
5286 | * punct character class: Character Classes and Bracket Expressions.
|
---|
5287 | (line 75)
|
---|
5288 | * q (quit) command: Common Commands. (line 28)
|
---|
5289 | * Q (silent Quit) command: Extended Commands. (line 36)
|
---|
5290 | * r (read file) command: Other Commands. (line 219)
|
---|
5291 | * R (read line) command: Extended Commands. (line 53)
|
---|
5292 | * s command, option flags: The "s" Command. (line 70)
|
---|
5293 | * space character class: Character Classes and Bracket Expressions.
|
---|
5294 | (line 80)
|
---|
5295 | * T (test and branch if failed) command: Extended Commands. (line 63)
|
---|
5296 | * t (test and branch if successful) command: Programming Commands.
|
---|
5297 | (line 22)
|
---|
5298 | * upper character class: Character Classes and Bracket Expressions.
|
---|
5299 | (line 84)
|
---|
5300 | * v (version) command: Extended Commands. (line 69)
|
---|
5301 | * w (write file) command: Other Commands. (line 244)
|
---|
5302 | * W (write first line) command: Extended Commands. (line 80)
|
---|
5303 | * x (eXchange) command: Other Commands. (line 292)
|
---|
5304 | * xdigit character class: Character Classes and Bracket Expressions.
|
---|
5305 | (line 88)
|
---|
5306 | * y (transliterate) command: Other Commands. (line 11)
|
---|
5307 | * z (Zap) command: Extended Commands. (line 85)
|
---|
5308 |
|
---|
5309 |
|
---|
5310 |
|
---|
5311 | Tag Table:
|
---|
5312 | Node: Top738
|
---|
5313 | Node: Introduction2217
|
---|
5314 | Node: Invoking sed2789
|
---|
5315 | Node: Overview3114
|
---|
5316 | Node: Command-Line Options5561
|
---|
5317 | Ref: Command-Line Options-Footnote-113530
|
---|
5318 | Ref: Command-Line Options-Footnote-213758
|
---|
5319 | Node: Exit status13861
|
---|
5320 | Node: sed scripts14795
|
---|
5321 | Node: sed script overview15394
|
---|
5322 | Node: sed commands list18057
|
---|
5323 | Node: The "s" Command23070
|
---|
5324 | Ref: The "s" Command-Footnote-128889
|
---|
5325 | Node: Common Commands28969
|
---|
5326 | Node: Other Commands32106
|
---|
5327 | Ref: insert command35324
|
---|
5328 | Ref: Other Commands-Footnote-141629
|
---|
5329 | Node: Programming Commands41709
|
---|
5330 | Node: Extended Commands42649
|
---|
5331 | Node: Multiple commands syntax46675
|
---|
5332 | Node: sed addresses51217
|
---|
5333 | Node: Addresses overview51706
|
---|
5334 | Node: Numeric Addresses53705
|
---|
5335 | Node: Regexp Addresses55116
|
---|
5336 | Ref: Regexp Addresses-Footnote-159252
|
---|
5337 | Node: Range Addresses59392
|
---|
5338 | Ref: Zero Address Regex Range60294
|
---|
5339 | Node: Zero Address61753
|
---|
5340 | Node: sed regular expressions62318
|
---|
5341 | Node: Regular Expressions Overview63172
|
---|
5342 | Node: BRE vs ERE64733
|
---|
5343 | Node: BRE syntax66484
|
---|
5344 | Node: ERE syntax73304
|
---|
5345 | Node: Character Classes and Bracket Expressions74878
|
---|
5346 | Node: regexp extensions80030
|
---|
5347 | Node: Back-references and Subexpressions82506
|
---|
5348 | Node: Escapes84958
|
---|
5349 | Ref: Escapes-Footnote-188105
|
---|
5350 | Node: Locale Considerations88304
|
---|
5351 | Ref: Locale Considerations-Footnote-193067
|
---|
5352 | Node: advanced sed93239
|
---|
5353 | Node: Execution Cycle93606
|
---|
5354 | Ref: Execution Cycle-Footnote-194845
|
---|
5355 | Node: Hold and Pattern Buffers95162
|
---|
5356 | Node: Multiline techniques95350
|
---|
5357 | Node: Branching and flow control98704
|
---|
5358 | Node: Examples107029
|
---|
5359 | Node: Joining lines108275
|
---|
5360 | Node: Centering lines110082
|
---|
5361 | Node: Increment a number111006
|
---|
5362 | Ref: Increment a number-Footnote-1112495
|
---|
5363 | Node: Rename files to lower case112623
|
---|
5364 | Node: Print bash environment115418
|
---|
5365 | Node: Reverse chars of lines116181
|
---|
5366 | Ref: Reverse chars of lines-Footnote-1117224
|
---|
5367 | Node: Text search across multiple lines117441
|
---|
5368 | Node: Line length adjustment120786
|
---|
5369 | Node: Adding a header to multiple files122533
|
---|
5370 | Node: tac125986
|
---|
5371 | Node: cat -n126774
|
---|
5372 | Node: cat -b128616
|
---|
5373 | Node: wc -c129378
|
---|
5374 | Ref: wc -c-Footnote-1131316
|
---|
5375 | Node: wc -w131385
|
---|
5376 | Node: wc -l132875
|
---|
5377 | Node: head133128
|
---|
5378 | Node: tail133467
|
---|
5379 | Node: uniq135196
|
---|
5380 | Node: uniq -d136007
|
---|
5381 | Node: uniq -u136722
|
---|
5382 | Node: cat -s137435
|
---|
5383 | Node: Limitations139298
|
---|
5384 | Node: Other Resources140161
|
---|
5385 | Node: Reporting Bugs141104
|
---|
5386 | Ref: N_command_last_line142294
|
---|
5387 | Ref: Reporting Bugs-Footnote-1148805
|
---|
5388 | Node: GNU Free Documentation License148880
|
---|
5389 | Node: Concept Index174239
|
---|
5390 | Node: Command and Option Index201600
|
---|
5391 |
|
---|
5392 | End Tag Table
|
---|
5393 |
|
---|
5394 |
|
---|
5395 | Local Variables:
|
---|
5396 | coding: utf-8
|
---|
5397 | End:
|
---|