| 1 | This is ../../doc/sed.info, produced by makeinfo version 4.5 from
|
|---|
| 2 | ../../doc/sed.texi.
|
|---|
| 3 |
|
|---|
| 4 | INFO-DIR-SECTION Text creation and manipulation
|
|---|
| 5 | START-INFO-DIR-ENTRY
|
|---|
| 6 | * sed: (sed). Stream EDitor.
|
|---|
| 7 |
|
|---|
| 8 | END-INFO-DIR-ENTRY
|
|---|
| 9 |
|
|---|
| 10 | This file documents version 4.1.5 of GNU `sed', a stream editor.
|
|---|
| 11 |
|
|---|
| 12 | Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
|
|---|
| 13 | Foundation, Inc.
|
|---|
| 14 |
|
|---|
| 15 | This document is released under the terms of the GNU Free
|
|---|
| 16 | Documentation License as published by the Free Software Foundation;
|
|---|
| 17 | either version 1.1, or (at your option) any later version.
|
|---|
| 18 |
|
|---|
| 19 | You should have received a copy of the GNU Free Documentation
|
|---|
| 20 | License along with GNU `sed'; see the file `COPYING.DOC'. If not,
|
|---|
| 21 | write to the Free Software Foundation, 59 Temple Place - Suite 330,
|
|---|
| 22 | Boston, MA 02110-1301, USA.
|
|---|
| 23 |
|
|---|
| 24 | There are no Cover Texts and no Invariant Sections; this text, along
|
|---|
| 25 | with its equivalent in the printed manual, constitutes the Title Page.
|
|---|
| 26 |
|
|---|
| 27 | File: sed.info, Node: Top, Next: Introduction, Up: (dir)
|
|---|
| 28 |
|
|---|
| 29 |
|
|---|
| 30 |
|
|---|
| 47 | * Menu:
|
|---|
| 48 |
|
|---|
| 49 | * Introduction:: Introduction
|
|---|
| 50 | * Invoking sed:: Invocation
|
|---|
| 51 | * sed Programs:: `sed' programs
|
|---|
| 52 | * Examples:: Some sample scripts
|
|---|
| 53 | * Limitations:: Limitations and (non-)limitations of GNU `sed'
|
|---|
| 54 | * Other Resources:: Other resources for learning about `sed'
|
|---|
| 55 | * Reporting Bugs:: Reporting bugs
|
|---|
| 56 |
|
|---|
| 57 | * Extended regexps:: `egrep'-style regular expressions
|
|---|
| 58 |
|
|---|
| 59 | * Concept Index:: A menu with all the topics in this manual.
|
|---|
| 60 | * Command and Option Index:: A menu with all `sed' commands and
|
|---|
| 61 | command-line options.
|
|---|
| 62 |
|
|---|
| 63 | --- The detailed node listing ---
|
|---|
| 64 |
|
|---|
| 65 | sed Programs:
|
|---|
| 66 | * Execution Cycle:: How `sed' works
|
|---|
| 67 | * Addresses:: Selecting lines with `sed'
|
|---|
| 68 | * Regular Expressions:: Overview of regular expression syntax
|
|---|
| 69 | * Common Commands:: Often used commands
|
|---|
| 70 | * The "s" Command:: `sed''s Swiss Army Knife
|
|---|
| 71 | * Other Commands:: Less frequently used commands
|
|---|
| 72 | * Programming Commands:: Commands for `sed' gurus
|
|---|
| 73 | * Extended Commands:: Commands specific to GNU `sed'
|
|---|
| 74 | * Escapes:: Specifying special characters
|
|---|
| 75 |
|
|---|
| 76 | Examples:
|
|---|
| 77 | * Centering lines::
|
|---|
| 78 | * Increment a number::
|
|---|
| 79 | * Rename files to lower case::
|
|---|
| 80 | * Print bash environment::
|
|---|
| 81 | * Reverse chars of lines::
|
|---|
| 82 | * tac:: Reverse lines of files
|
|---|
| 83 | * cat -n:: Numbering lines
|
|---|
| 84 | * cat -b:: Numbering non-blank lines
|
|---|
| 85 | * wc -c:: Counting chars
|
|---|
| 86 | * wc -w:: Counting words
|
|---|
| 87 | * wc -l:: Counting lines
|
|---|
| 88 | * head:: Printing the first lines
|
|---|
| 89 | * tail:: Printing the last lines
|
|---|
| 90 | * uniq:: Make duplicate lines unique
|
|---|
| 91 | * uniq -d:: Print duplicated lines of input
|
|---|
| 92 | * uniq -u:: Remove all duplicated lines
|
|---|
| 93 | * cat -s:: Squeezing blank lines
|
|---|
| 94 |
|
|---|
| 95 |
|
|---|
| 96 | File: sed.info, Node: Introduction, Next: Invoking sed, Prev: Top, Up: Top
|
|---|
| 97 |
|
|---|
| 98 | Introduction
|
|---|
| 99 | ************
|
|---|
| 100 |
|
|---|
| 101 | `sed' is a stream editor. A stream editor is used to perform basic
|
|---|
| 102 | text transformations on an input stream (a file or input from a
|
|---|
| 103 | pipeline). While in some ways similar to an editor which permits
|
|---|
| 104 | scripted edits (such as `ed'), `sed' works by making only one pass over
|
|---|
| 105 | the input(s), and is consequently more efficient. But it is `sed''s
|
|---|
| 106 | ability to filter text in a pipeline which particularly distinguishes
|
|---|
| 107 | it from other types of editors.
|
|---|
| 108 |
|
|---|
| 109 |
|
|---|
| 110 | File: sed.info, Node: Invoking sed, Next: sed Programs, Prev: Introduction, Up: Top
|
|---|
| 111 |
|
|---|
| 112 | Invocation
|
|---|
| 113 | **********
|
|---|
| 114 |
|
|---|
| 115 | Normally `sed' is invoked like this:
|
|---|
| 116 |
|
|---|
| 117 | sed SCRIPT INPUTFILE...
|
|---|
| 118 |
|
|---|
| 119 | The full format for invoking `sed' is:
|
|---|
| 120 |
|
|---|
| 121 | sed OPTIONS... [SCRIPT] [INPUTFILE...]
|
|---|
| 122 |
|
|---|
| 123 | If you do not specify INPUTFILE, or if INPUTFILE is `-', `sed'
|
|---|
| 124 | filters the contents of the standard input. The SCRIPT is actually the
|
|---|
| 125 | first non-option parameter, which `sed' specially considers a script
|
|---|
| 126 | and not an input file if (and only if) none of the other OPTIONS
|
|---|
| 127 | specifies a script to be executed, that is if neither of the `-e' and
|
|---|
| 128 | `-f' options is specified.
|
|---|
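As a minimal sketch of the two invocation styles just described, assuming GNU `sed' and a POSIX shell (the sample text and the variable names `first' and `second' are made up for illustration):

```shell
# The script is the first non-option argument; input is the standard input.
first=$(printf 'hello world\n' | sed 's/hello/goodbye/')

# The same script given with -e. Every non-option argument is now an
# input file, so "-" is used to name the standard input explicitly.
second=$(printf 'hello world\n' | sed -e 's/hello/goodbye/' -)

printf '%s\n%s\n' "$first" "$second"
```

Both commands should produce the same text, because only the way the script is passed differs.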
| 129 |
|
|---|
| 130 | `sed' may be invoked with the following command-line options:
|
|---|
| 131 |
|
|---|
| 132 | `--version'
|
|---|
| 133 | Print out the version of `sed' that is being run and a copyright
|
|---|
| 134 | notice, then exit.
|
|---|
| 135 |
|
|---|
| 136 | `--help'
|
|---|
| 137 | Print a usage message briefly summarizing these command-line
|
|---|
| 138 | options and the bug-reporting address, then exit.
|
|---|
| 139 |
|
|---|
| 140 | `-n'
|
|---|
| 141 | `--quiet'
|
|---|
| 142 | `--silent'
|
|---|
| 143 | By default, `sed' prints out the pattern space at the end of each
|
|---|
| 144 | cycle through the script. These options disable this automatic
|
|---|
| 145 | printing, and `sed' only produces output when explicitly told to
|
|---|
| 146 | via the `p' command.
|
|---|
| 147 |
|
|---|
| 148 | `-i[SUFFIX]'
|
|---|
| 149 | `--in-place[=SUFFIX]'
|
|---|
| 150 | This option specifies that files are to be edited in-place. GNU
|
|---|
| 151 | `sed' does this by creating a temporary file and sending output to
|
|---|
| 152 | this file rather than to the standard output.(1)
|
|---|
| 153 |
|
|---|
| 154 | This option implies `-s'.
|
|---|
| 155 |
|
|---|
| 156 | When the end of the file is reached, the temporary file is renamed
|
|---|
| 157 | to the output file's original name. The extension, if supplied,
|
|---|
| 158 | is used to modify the name of the old file before renaming the
|
|---|
| 159 | temporary file, thereby making a backup copy.(2)
|
|---|
| 160 |
|
|---|
| 161 | This rule is followed: if the extension doesn't contain a `*',
|
|---|
| 162 | then it is appended to the end of the current filename as a
|
|---|
| 163 | suffix; if the extension does contain one or more `*' characters,
|
|---|
| 164 | then _each_ asterisk is replaced with the current filename. This
|
|---|
| 165 | allows you to add a prefix to the backup file, instead of (or in
|
|---|
| 166 | addition to) a suffix, or even to place backup copies of the
|
|---|
| 167 | original files into another directory (provided the directory
|
|---|
| 168 | already exists).
|
|---|
| 169 |
|
|---|
| 170 | If no extension is supplied, the original file is overwritten
|
|---|
| 171 | without making a backup.
|
|---|
| 172 |
|
|---|
| 173 | `-l N'
|
|---|
| 174 | `--line-length=N'
|
|---|
| 175 | Specify the default line-wrap length for the `l' command. A
|
|---|
| 176 | length of 0 (zero) means to never wrap long lines. If not
|
|---|
| 177 | specified, it is taken to be 70.
|
|---|
| 178 |
|
|---|
| 179 | `--posix'
|
|---|
| 180 | GNU `sed' includes several extensions to POSIX sed. In order to
|
|---|
| 181 | simplify writing portable scripts, this option disables all the
|
|---|
| 182 | extensions that this manual documents, including additional
|
|---|
| 183 | commands. Most of the extensions accept `sed' programs that are
|
|---|
| 184 | outside the syntax mandated by POSIX, but some of them (such as
|
|---|
| 185 | the behavior of the `N' command described in *note Reporting
|
|---|
| 186 | Bugs::) actually violate the standard. If you want to disable
|
|---|
| 187 | only the latter kind of extension, you can set the
|
|---|
| 188 | `POSIXLY_CORRECT' variable to a non-empty value.
|
|---|
| 189 |
|
|---|
| 190 | `-r'
|
|---|
| 191 | `--regexp-extended'
|
|---|
| 192 | Use extended regular expressions rather than basic regular
|
|---|
| 193 | expressions. Extended regexps are those that `egrep' accepts;
|
|---|
| 194 | they can be clearer because they usually have fewer backslashes,
|
|---|
| 195 | but are a GNU extension and hence scripts that use them are not
|
|---|
| 196 | portable. *Note Extended regular expressions: Extended regexps.
|
|---|
| 197 |
|
|---|
| 198 | `-s'
|
|---|
| 199 | `--separate'
|
|---|
| 200 | By default, `sed' will consider the files specified on the command
|
|---|
| 201 | line as a single continuous long stream. This GNU `sed' extension
|
|---|
| 202 | allows the user to consider them as separate files: range
|
|---|
| 203 | addresses (such as `/abc/,/def/') are not allowed to span several
|
|---|
| 204 | files, line numbers are relative to the start of each file, `$'
|
|---|
| 205 | refers to the last line of each file, and files invoked from the
|
|---|
| 206 | `R' commands are rewound at the start of each file.
|
|---|
| 207 |
|
|---|
| 208 | `-u'
|
|---|
| 209 | `--unbuffered'
|
|---|
| 210 | Buffer both input and output as minimally as practical. (This is
|
|---|
| 211 | particularly useful if the input is coming from the likes of `tail
|
|---|
| 212 | -f', and you wish to see the transformed output as soon as
|
|---|
| 213 | possible.)
|
|---|
| 214 |
|
|---|
| 215 | `-e SCRIPT'
|
|---|
| 216 | `--expression=SCRIPT'
|
|---|
| 217 | Add the commands in SCRIPT to the set of commands to be run while
|
|---|
| 218 | processing the input.
|
|---|
| 219 |
|
|---|
| 220 | `-f SCRIPT-FILE'
|
|---|
| 221 | `--file=SCRIPT-FILE'
|
|---|
| 222 | Add the commands contained in the file SCRIPT-FILE to the set of
|
|---|
| 223 | commands to be run while processing the input.
|
|---|
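The backup-naming rule for `-i[SUFFIX]' described above can be tried out as follows. This is only a sketch, assuming GNU `sed' and the `mktemp' utility; the file name `notes.txt' and the suffixes `.bak' and `old_*' are arbitrary choices for illustration.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'color\n' > notes.txt

# Suffix without an asterisk: the original is kept as notes.txt.bak.
sed -i.bak 's/color/colour/' notes.txt
edited=$(cat notes.txt)
backup=$(cat notes.txt.bak)

# Suffix with an asterisk: each asterisk becomes the file name, so the
# copy made before this second edit is saved as old_notes.txt.
sed -i'old_*' 's/colour/COLOUR/' notes.txt
prefixed=$(cat old_notes.txt)
final=$(cat notes.txt)

printf '%s %s %s %s\n' "$edited" "$backup" "$prefixed" "$final"
cd / && rm -r "$dir"
```

The suffix is quoted so that the shell does not expand the asterisk before `sed' sees it.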
| 224 |
|
|---|
| 225 |
|
|---|
| 226 | If no `-e', `-f', `--expression', or `--file' options are given on
|
|---|
| 227 | the command-line, then the first non-option argument on the command
|
|---|
| 228 | line is taken to be the SCRIPT to be executed.
|
|---|
| 229 |
|
|---|
| 230 | If any command-line parameters remain after processing the above,
|
|---|
| 231 | these parameters are interpreted as the names of input files to be
|
|---|
| 232 | processed. A file name of `-' refers to the standard input stream.
|
|---|
| 233 | The standard input will be processed if no file names are specified.
|
|---|
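The following sketch contrasts the default single-stream treatment of several input files with `-s', and shows `-' standing for the standard input. It assumes GNU `sed' and `mktemp'; the file names `a.txt' and `b.txt' are hypothetical.

```shell
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/a.txt"
printf 'b1\nb2\n' > "$dir/b.txt"

# Default: both files form one long stream, so $ is the last line overall.
joined=$(sed -n '$p' "$dir/a.txt" "$dir/b.txt")

# With -s each file is separate, so $ matches the last line of each file.
separate=$(sed -s -n '$p' "$dir/a.txt" "$dir/b.txt")

# "-" names the standard input; listed first, it supplies line 1.
mixed=$(printf 'from stdin\n' | sed -n '1p' - "$dir/a.txt")

printf '%s\n---\n%s\n---\n%s\n' "$joined" "$separate" "$mixed"
rm -r "$dir"
```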
| 234 |
|
|---|
| 235 | ---------- Footnotes ----------
|
|---|
| 236 |
|
|---|
| 237 | (1) This applies to commands such as `=', `a', `c', `i', `l', `p'.
|
|---|
| 238 | You can still write to the standard output by using the `w' or `W'
|
|---|
| 239 | commands together with the `/dev/stdout' special file.
|
|---|
| 240 |
|
|---|
| 241 | (2) Note that GNU `sed' creates the backup file whether or not
|
|---|
| 242 | any output is actually changed.
|
|---|
| 243 |
|
|---|
| 244 |
|
|---|
| 245 | File: sed.info, Node: sed Programs, Next: Examples, Prev: Invoking sed, Up: Top
|
|---|
| 246 |
|
|---|
| 247 | `sed' Programs
|
|---|
| 248 | **************
|
|---|
| 249 |
|
|---|
| 250 | A `sed' program consists of one or more `sed' commands, passed in by
|
|---|
| 251 | one or more of the `-e', `-f', `--expression', and `--file' options, or
|
|---|
| 252 | the first non-option argument if none of these options is used. This
|
|---|
| 253 | document will refer to "the" `sed' script; this is understood to mean
|
|---|
| 254 | the in-order catenation of all of the SCRIPTs and SCRIPT-FILEs passed
|
|---|
| 255 | in.
|
|---|
| 256 |
|
|---|
| 257 | Each `sed' command consists of an optional address or address range,
|
|---|
| 258 | followed by a one-character command name and any additional
|
|---|
| 259 | command-specific code.
|
|---|
| 260 |
|
|---|
| 261 | * Menu:
|
|---|
| 262 |
|
|---|
| 263 | * Execution Cycle:: How `sed' works
|
|---|
| 264 | * Addresses:: Selecting lines with `sed'
|
|---|
| 265 | * Regular Expressions:: Overview of regular expression syntax
|
|---|
| 266 | * Common Commands:: Often used commands
|
|---|
| 267 | * The "s" Command:: `sed''s Swiss Army Knife
|
|---|
| 268 | * Other Commands:: Less frequently used commands
|
|---|
| 269 | * Programming Commands:: Commands for `sed' gurus
|
|---|
| 270 | * Extended Commands:: Commands specific to GNU `sed'
|
|---|
| 271 | * Escapes:: Specifying special characters
|
|---|
| 272 |
|
|---|
| 273 |
|
|---|
| 274 | File: sed.info, Node: Execution Cycle, Next: Addresses, Up: sed Programs
|
|---|
| 275 |
|
|---|
| 276 | How `sed' Works
|
|---|
| 277 | ===============
|
|---|
| 278 |
|
|---|
| 279 | `sed' maintains two data buffers: the active _pattern_ space, and
|
|---|
| 280 | the auxiliary _hold_ space. Both are initially empty.
|
|---|
| 281 |
|
|---|
| 282 | `sed' operates by performing the following cycle on each line of
|
|---|
| 283 | input: first, `sed' reads one line from the input stream, removes any
|
|---|
| 284 | trailing newline, and places it in the pattern space. Then commands
|
|---|
| 285 | are executed; each command can have an address associated with it:
|
|---|
| 286 | addresses are a kind of condition code, and a command is only executed
|
|---|
| 287 | if the condition is verified before the command is to be executed.
|
|---|
| 288 |
|
|---|
| 289 | When the end of the script is reached, unless the `-n' option is in
|
|---|
| 290 | use, the contents of pattern space are printed out to the output
|
|---|
| 291 | stream, adding back the trailing newline if it was removed.(1) Then the
|
|---|
| 292 | next cycle starts for the next input line.
|
|---|
| 293 |
|
|---|
| 294 | Unless special commands (like `D') are used, the pattern space is
|
|---|
| 295 | deleted between two cycles. The hold space, on the other hand, keeps
|
|---|
| 296 | its data between cycles (see commands `h', `H', `x', `g', `G' to move
|
|---|
| 297 | data between both buffers).
|
|---|
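To make the roles of the two buffers concrete, here is a small sketch (assuming GNU `sed'; the three-line input is invented) that reverses the order of its input lines by accumulating them in the hold space:

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space (the lines so far, newest first) to the hold space
# $p   on the last line, print what has been accumulated
reversed=$(printf 'one\ntwo\nthree\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```

The pattern space is replaced by a fresh input line at each cycle, so the accumulated text survives only because `h' saves it in the hold space before the cycle ends.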
| 298 |
|
|---|
| 299 | ---------- Footnotes ----------
|
|---|
| 300 |
|
|---|
| 301 | (1) Actually, if `sed' prints a line without the terminating
|
|---|
| 302 | newline, it will nevertheless print the missing newline as soon as
|
|---|
| 303 | more text is sent to the same output stream, which gives the "least
|
|---|
| 304 | expected surprise" even though it does not make commands like `sed -n
|
|---|
| 305 | p' exactly identical to `cat'.
|
|---|
| 306 |
|
|---|
| 307 |
|
|---|
| 308 | File: sed.info, Node: Addresses, Next: Regular Expressions, Prev: Execution Cycle, Up: sed Programs
|
|---|
| 309 |
|
|---|
| 310 | Selecting lines with `sed'
|
|---|
| 311 | ==========================
|
|---|
| 312 |
|
|---|
| 313 | Addresses in a `sed' script can be in any of the following forms:
|
|---|
| 314 | `NUMBER'
|
|---|
| 315 | Specifying a line number will match only that line in the input.
|
|---|
| 316 | (Note that `sed' counts lines continuously across all input files
|
|---|
| 317 | unless `-i' or `-s' options are specified.)
|
|---|
| 318 |
|
|---|
| 319 | `FIRST~STEP'
|
|---|
| 320 | This GNU extension matches every STEPth line starting with line
|
|---|
| 321 | FIRST. In particular, lines will be selected when there exists a
|
|---|
| 322 | non-negative N such that the current line-number equals FIRST + (N
|
|---|
| 323 | * STEP). Thus, to select the odd-numbered lines, one would use
|
|---|
| 324 | `1~2'; to pick every third line starting with the second, `2~3'
|
|---|
| 325 | would be used; to pick every fifth line starting with the tenth,
|
|---|
| 326 | use `10~5'; and `50~0' is just an obscure way of saying `50'.
|
|---|
| 327 |
|
|---|
| 328 | `$'
|
|---|
| 329 | This address matches the last line of the last file of input, or
|
|---|
| 330 | the last line of each file when the `-i' or `-s' options are
|
|---|
| 331 | specified.
|
|---|
| 332 |
|
|---|
| 333 | `/REGEXP/'
|
|---|
| 334 | This will select any line which matches the regular expression
|
|---|
| 335 | REGEXP. If REGEXP itself includes any `/' characters, each must
|
|---|
| 336 | be escaped by a backslash (`\').
|
|---|
| 337 |
|
|---|
| 338 | The empty regular expression `//' repeats the last regular
|
|---|
| 339 | expression match (the same holds if the empty regular expression is
|
|---|
| 340 | passed to the `s' command). Note that modifiers to regular
|
|---|
| 341 | expressions are evaluated when the regular expression is compiled,
|
|---|
| 342 | thus it is invalid to specify them together with the empty regular
|
|---|
| 343 | expression.
|
|---|
| 344 |
|
|---|
| 345 | `\%REGEXP%'
|
|---|
| 346 | (The `%' may be replaced by any other single character.)
|
|---|
| 347 |
|
|---|
| 348 | This also matches the regular expression REGEXP, but allows one to
|
|---|
| 349 | use a different delimiter than `/'. This is particularly useful
|
|---|
| 350 | if the REGEXP itself contains a lot of slashes, since it avoids
|
|---|
| 351 | the tedious escaping of every `/'. If REGEXP itself includes any
|
|---|
| 352 | delimiter characters, each must be escaped by a backslash (`\').
|
|---|
| 353 |
|
|---|
| 354 | `/REGEXP/I'
|
|---|
| 355 | `\%REGEXP%I'
|
|---|
| 356 | The `I' modifier to regular-expression matching is a GNU extension
|
|---|
| 357 | which causes the REGEXP to be matched in a case-insensitive manner.
|
|---|
| 358 |
|
|---|
| 359 | `/REGEXP/M'
|
|---|
| 360 | `\%REGEXP%M'
|
|---|
| 361 | The `M' modifier to regular-expression matching is a GNU `sed'
|
|---|
| 362 | extension which causes `^' and `$' to match respectively (in
|
|---|
| 363 | addition to the normal behavior) the empty string after a newline,
|
|---|
| 364 | and the empty string before a newline. There are special character
|
|---|
| 365 | sequences (`\`' and `\'') which always match the beginning or the
|
|---|
| 366 | end of the buffer. `M' stands for `multi-line'.
|
|---|
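The single-address forms listed above can be sketched as follows, assuming GNU `sed' (the `FIRST~STEP' form and the `I' modifier are GNU extensions). The sample input and variable names are invented for illustration.

```shell
input=$(printf 'alpha\nBeta\ngamma\ndelta\nepsilon\n')

# NUMBER: only line 2.
second=$(printf '%s\n' "$input" | sed -n '2p')

# FIRST~STEP: every second line starting with line 1 (the odd lines).
odd=$(printf '%s\n' "$input" | sed -n '1~2p')

# $: the last line of input.
last=$(printf '%s\n' "$input" | sed -n '$p')

# /REGEXP/I: case-insensitive match, so ^b also selects "Beta".
nocase=$(printf '%s\n' "$input" | sed -n '/^b/Ip')

# \%REGEXP%: a different delimiter avoids escaping the slashes.
altdelim=$(printf '/usr/bin\n/opt\n' | sed -n '\%^/usr/%p')

printf '%s\n' "$second" "$odd" "$last" "$nocase" "$altdelim"
```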
| 367 |
|
|---|
| 368 |
|
|---|
| 369 | If no addresses are given, then all lines are matched; if one
|
|---|
| 370 | address is given, then only lines matching that address are matched.
|
|---|
| 371 |
|
|---|
| 372 | An address range can be specified by specifying two addresses
|
|---|
| 373 | separated by a comma (`,'). An address range matches lines starting
|
|---|
| 374 | from where the first address matches, and continues until the second
|
|---|
| 375 | address matches (inclusively).
|
|---|
| 376 |
|
|---|
| 377 | If the second address is a REGEXP, then checking for the ending
|
|---|
| 378 | match will start with the line _following_ the line which matched the
|
|---|
| 379 | first address: a range will always span at least two lines (except of
|
|---|
| 380 | course if the input stream ends).
|
|---|
| 381 |
|
|---|
| 382 | If the second address is a NUMBER less than (or equal to) the line
|
|---|
| 383 | matching the first address, then only the one line is matched.
|
|---|
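A short sketch of the range rules above, assuming GNU `sed' (the sample inputs are invented):

```shell
# An ordinary range: from the first /start/ line through the next /end/ line.
range=$(printf 'x\nstart\nmid\nend\ny\n' | sed -n '/start/,/end/p')

# The ending REGEXP is first checked on the line after the starting match,
# so /a/,/a/ spans from the first "a" line to the next one.
spans=$(printf 'a\nb\na\nc\n' | sed -n '/a/,/a/p')

# A NUMBER second address that is not beyond the starting line selects
# just that one line.
single=$(printf 'x\nstart\nmid\n' | sed -n '/start/,1p')

printf '%s\n---\n%s\n---\n%s\n' "$range" "$spans" "$single"
```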
| 384 |
|
|---|
| 385 | GNU `sed' also supports some special two-address forms; all these
|
|---|
| 386 | are GNU extensions:
|
|---|
| 387 | `0,/REGEXP/'
|
|---|
| 388 | A line number of `0' can be used in an address specification like
|
|---|
| 389 | `0,/REGEXP/' so that `sed' will try to match REGEXP in the first
|
|---|
| 390 | input line too. In other words, `0,/REGEXP/' is similar to
|
|---|
| 391 | `1,/REGEXP/', except that if ADDR2 matches the very first line of
|
|---|
| 392 | input the `0,/REGEXP/' form will consider it to end the range,
|
|---|
| 393 | whereas the `1,/REGEXP/' form will match the beginning of its
|
|---|
| 394 | range and hence make the range span up to the _second_ occurrence
|
|---|
| 395 | of the regular expression.
|
|---|
| 396 |
|
|---|
| 397 | Note that this is the only place where the `0' address makes
|
|---|
| 398 | sense; there is no 0-th line and commands which are given the `0'
|
|---|
| 399 | address in any other way will give an error.
|
|---|
| 400 |
|
|---|
| 401 | `ADDR1,+N'
|
|---|
| 402 | Matches ADDR1 and the N lines following ADDR1.
|
|---|
| 403 |
|
|---|
| 404 | `ADDR1,~N'
|
|---|
| 405 | Matches ADDR1 and the lines following ADDR1 until the next line
|
|---|
| 406 | whose input line number is a multiple of N.
|
|---|
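The special two-address forms can be compared side by side. This sketch assumes GNU `sed', since all three forms are GNU extensions; the inputs are invented.

```shell
input=$(printf 'hit\nmiss\nhit\nmiss\n')

# 0,/REGEXP/ may end on the very first line.
zero=$(printf '%s\n' "$input" | sed -n '0,/hit/p')

# 1,/REGEXP/ starts on line 1 and looks for the end from line 2 onwards.
one=$(printf '%s\n' "$input" | sed -n '1,/hit/p')

# ADDR1,+N: line 2 and the 2 lines after it.
plus=$(printf '1\n2\n3\n4\n5\n' | sed -n '2,+2p')

# ADDR1,~N: line 2 through the next line whose number is a multiple of 4.
mult=$(printf '1\n2\n3\n4\n5\n6\n' | sed -n '2,~4p')

printf '%s\n---\n%s\n---\n%s\n---\n%s\n' "$zero" "$one" "$plus" "$mult"
```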
| 407 |
|
|---|
| 408 | Appending the `!' character to the end of an address specification
|
|---|
| 409 | negates the sense of the match. That is, if the `!' character follows
|
|---|
| 410 | an address range, then only lines which do _not_ match the address range
|
|---|
| 411 | will be selected. This also works for singleton addresses, and,
|
|---|
| 412 | perhaps perversely, for the null address.
|
|---|
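As an illustration of `!', the following sketch (assuming GNU `sed'; the input is invented) negates first a single address and then a range:

```shell
# Print only the lines that do NOT start with "#".
nocomment=$(printf 'keep\n# comment\nkeep too\n' | sed -n '/^#/!p')

# Delete every line that is NOT in the range 2,3.
middle=$(printf '1\n2\n3\n4\n' | sed '2,3!d')

printf '%s\n---\n%s\n' "$nocomment" "$middle"
```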
| 413 |
|
|---|
| 414 |
|
|---|
| 415 | File: sed.info, Node: Regular Expressions, Next: Common Commands, Prev: Addresses, Up: sed Programs
|
|---|
| 416 |
|
|---|
| 417 | Overview of Regular Expression Syntax
|
|---|
| 418 | =====================================
|
|---|
| 419 |
|
|---|
| 420 | To know how to use `sed', people should understand regular
|
|---|
| 421 | expressions ("regexp" for short). A regular expression is a pattern
|
|---|
| 422 | that is matched against a subject string from left to right. Most
|
|---|
| 423 | characters are "ordinary": they stand for themselves in a pattern, and
|
|---|
| 424 | match the corresponding characters in the subject. As a trivial
|
|---|
| 425 | example, the pattern
|
|---|
| 426 |
|
|---|
| 427 | The quick brown fox
|
|---|
| 428 |
|
|---|
| 429 | matches a portion of a subject string that is identical to itself. The
|
|---|
| 430 | power of regular expressions comes from the ability to include
|
|---|
| 431 | alternatives and repetitions in the pattern. These are encoded in the
|
|---|
| 432 | pattern by the use of "special characters", which do not stand for
|
|---|
| 433 | themselves but instead are interpreted in some special way. Here is a
|
|---|
| 434 | brief description of regular expression syntax as used in `sed'.
|
|---|
| 435 |
|
|---|
| 436 | `CHAR'
|
|---|
| 437 | A single ordinary character matches itself.
|
|---|
| 438 |
|
|---|
| 439 | `*'
|
|---|
| 440 | Matches a sequence of zero or more instances of matches for the
|
|---|
| 441 | preceding regular expression, which must be an ordinary character,
|
|---|
| 442 | a special character preceded by `\', a `.', a grouped regexp (see
|
|---|
| 443 | below), or a bracket expression. As a GNU extension, a postfixed
|
|---|
| 444 | regular expression can also be followed by `*'; for example, `a**'
|
|---|
| 445 | is equivalent to `a*'. POSIX 1003.1-2001 says that `*' stands for
|
|---|
| 446 | itself when it appears at the start of a regular expression or
|
|---|
| 447 | subexpression, but many non-GNU implementations do not support this
|
|---|
| 448 | and portable scripts should instead use `\*' in these contexts.
|
|---|
| 449 |
|
|---|
| 450 | `\+'
|
|---|
| 451 | As `*', but matches one or more. It is a GNU extension.
|
|---|
| 452 |
|
|---|
| 453 | `\?'
|
|---|
| 454 | As `*', but only matches zero or one. It is a GNU extension.
|
|---|
| 455 |
|
|---|
| 456 | `\{I\}'
|
|---|
| 457 | As `*', but matches exactly I sequences (I is a decimal integer;
|
|---|
| 458 | for portability, keep it between 0 and 255 inclusive).
|
|---|
| 459 |
|
|---|
| 460 | `\{I,J\}'
|
|---|
| 461 | Matches between I and J sequences, inclusive.
|
|---|
| 462 |
|
|---|
| 463 | `\{I,\}'
|
|---|
| 464 | Matches I or more sequences.
|
|---|
| 465 |
|
|---|
| 466 | `\(REGEXP\)'
|
|---|
| 467 | Groups the inner REGEXP as a whole; this is used to:
|
|---|
| 468 |
|
|---|
| 469 | * Apply postfix operators, like `\(abcd\)*': this will search
|
|---|
| 470 | for zero or more whole sequences of `abcd', while `abcd*'
|
|---|
| 471 | would search for `abc' followed by zero or more occurrences
|
|---|
| 472 | of `d'. Note that support for `\(abcd\)*' is required by
|
|---|
| 473 | POSIX 1003.1-2001, but many non-GNU implementations do not
|
|---|
| 474 | support it and hence it is not universally portable.
|
|---|
| 475 |
|
|---|
| 476 | * Use back references (see below).
|
|---|
| 477 |
|
|---|
| 478 | `.'
|
|---|
| 479 | Matches any character, including newline.
|
|---|
| 480 |
|
|---|
| 481 | `^'
|
|---|
| 482 | Matches the null string at beginning of line, i.e. what appears
|
|---|
| 483 | after the circumflex must appear at the beginning of line.
|
|---|
| 484 | `^#include' will match only lines where `#include' is the first
|
|---|
| 485 | thing on line--if there are spaces before, for example, the match
|
|---|
| 486 | fails. `^' acts as a special character only at the beginning of
|
|---|
| 487 | the regular expression or subexpression (that is, after `\(' or
|
|---|
| 488 | `\|'). Portable scripts should avoid `^' at the beginning of a
|
|---|
| 489 | subexpression, though, as POSIX allows implementations that treat
|
|---|
| 490 | `^' as an ordinary character in that context.
|
|---|
| 491 |
|
|---|
| 492 | `$'
|
|---|
| 493 | It is the same as `^', but refers to end of line. `$' also acts
|
|---|
| 494 | as a special character only at the end of the regular expression
|
|---|
| 495 | or subexpression (that is, before `\)' or `\|'), and its use at
|
|---|
| 496 | the end of a subexpression is not portable.
|
|---|
| 497 |
|
|---|
| 498 | `[LIST]'
|
|---|
| 499 | `[^LIST]'
|
|---|
| 500 | Matches any single character in LIST: for example, `[aeiou]'
|
|---|
| 501 | matches all vowels. A list may include sequences like
|
|---|
| 502 | `CHAR1-CHAR2', which matches any character between (inclusive)
|
|---|
| 503 | CHAR1 and CHAR2.
|
|---|
| 504 |
|
|---|
| 505 | A leading `^' reverses the meaning of LIST, so that it matches any
|
|---|
| 506 | single character _not_ in LIST. To include `]' in the list, make
|
|---|
| 507 | it the first character (after the `^' if needed); to include `-'
|
|---|
| 508 | in the list, make it the first or last; to include `^' put it
|
|---|
| 509 | after the first character.
|
|---|
| 510 |
|
|---|
| 511 | The characters `$', `*', `.', `[', and `\' are normally not
|
|---|
| 512 | special within LIST. For example, `[\*]' matches either `\' or
|
|---|
| 513 | `*', because the `\' is not special here. However, strings like
|
|---|
| 514 | `[.ch.]', `[=a=]', and `[:space:]' are special within LIST and
|
|---|
| 515 | represent collating symbols, equivalence classes, and character
|
|---|
| 516 | classes, respectively, and `[' is therefore special within LIST
|
|---|
| 517 | when it is followed by `.', `=', or `:'. Also, when not in
|
|---|
| 518 | `POSIXLY_CORRECT' mode, special escapes like `\n' and `\t' are
|
|---|
| 519 | recognized within LIST. *Note Escapes::.
|
|---|
| 520 |
|
|---|
| 521 | `REGEXP1\|REGEXP2'
|
|---|
| 522 | Matches either REGEXP1 or REGEXP2. Use parentheses to use complex
|
|---|
| 523 | alternative regular expressions. The matching process tries each
|
|---|
| 524 | alternative in turn, from left to right, and the first one that
|
|---|
| 525 | succeeds is used. It is a GNU extension.
|
|---|
| 526 |
|
|---|
| 527 | `REGEXP1REGEXP2'
|
|---|
| 528 | Matches the concatenation of REGEXP1 and REGEXP2. Concatenation
|
|---|
| 529 | binds more tightly than `\|', `^', and `$', but less tightly than
|
|---|
| 530 | the other regular expression operators.
|
|---|
| 531 |
|
|---|
| 532 | `\DIGIT'
|
|---|
| 533 | Matches the DIGIT-th `\(...\)' parenthesized subexpression in the
|
|---|
| 534 | regular expression. This is called a "back reference".
|
|---|
| 535 | Subexpressions are implicitly numbered by counting occurrences of
|
|---|
| 536 | `\(' left-to-right.
|
|---|
| 537 |
|
|---|
| 538 | `\n'
|
|---|
| 539 | Matches the newline character.
|
|---|
| 540 |
|
|---|
| 541 | `\CHAR'
|
|---|
| 542 | Matches CHAR, where CHAR is one of `$', `*', `.', `[', `\', or `^'.
|
|---|
| 543 | Note that the only C-like backslash sequences that you can
|
|---|
| 544 | portably assume to be interpreted are `\n' and `\\'; in particular
|
|---|
| 545 | `\t' is not portable, and matches a `t' under most implementations
|
|---|
| 546 | of `sed', rather than a tab character.
|
|---|
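The bracket-expression rules above are easy to get wrong, so here is a small sketch of them, assuming GNU `sed'. The `s' command is used only to make the matched character visible; the inputs are invented.

```shell
# Inside [LIST] the backslash is an ordinary character, so [\*] matches
# either a backslash or an asterisk.
star=$(printf 'a*b\n' | sed 's/[\*]/X/')
back=$(printf 'a\\b\n' | sed 's/[\*]/X/')

# [:space:] is a character class and must itself appear inside brackets.
space=$(printf 'a b\n' | sed 's/[[:space:]]/_/')

# A leading ^ negates the list: replace the first character that is not a digit.
nondigit=$(printf '12x34\n' | sed 's/[^0-9]/_/')

printf '%s\n' "$star" "$back" "$space" "$nondigit"
```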
| 547 |
|
|---|
| 548 |
|
|---|
| 549 | Note that the regular expression matcher is greedy, i.e., matches
|
|---|
| 550 | are attempted from left to right and, if two or more matches are
|
|---|
| 551 | possible starting at the same character, it selects the longest.
|
|---|
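A common consequence of greedy matching is sketched below, assuming GNU `sed'; the input string is invented and the `s' command is used only to show what was matched.

```shell
# .* takes the longest match, so everything from the first < to the
# last > is replaced.
greedy=$(printf 'a<b>c<d>e\n' | sed 's/<.*>/[]/')

# Excluding > from the repeated part stops the match at the first >.
narrow=$(printf 'a<b>c<d>e\n' | sed 's/<[^>]*>/[]/')

printf '%s\n%s\n' "$greedy" "$narrow"
```

`sed' regular expressions have no non-greedy operator, so a negated bracket expression such as `[^>]*' is the usual way to limit a match.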
| 552 |
|
|---|
| 553 | Examples:
|
|---|
| 554 | `abcdef'
|
|---|
| 555 | Matches `abcdef'.
|
|---|
| 556 |
|
|---|
| 557 | `a*b'
|
|---|
| 558 | Matches zero or more `a's followed by a single `b'. For example,
|
|---|
| 559 | `b' or `aaaaab'.
|
|---|
| 560 |
|
|---|
| 561 | `a\?b'
|
|---|
| 562 | Matches `b' or `ab'.
|
|---|
| 563 |
|
|---|
| 564 | `a\+b\+'
|
|---|
| 565 | Matches one or more `a's followed by one or more `b's: `ab' is the
|
|---|
| 566 | shortest possible match, but other examples are `aaaab' or
|
|---|
| 567 | `abbbbb' or `aaaaaabbbbbbb'.
|
|---|
| 568 |
|
|---|
| 569 | `.*'
|
|---|
| 570 | `.\+'
|
|---|
| 571 | These two both match all the characters in a string; however, the
|
|---|
| 572 | first matches every string (including the empty string), while the
|
|---|
| 573 | second matches only strings containing at least one character.
|
|---|
| 574 |
|
|---|
| 575 | `^main.*(.*)'
|
|---|
| 576 | This matches a string starting with `main', followed by an opening
|
|---|
| 577 | and closing parenthesis. The `n', `(' and `)' need not be
|
|---|
| 578 | adjacent.
|
|---|
| 579 |
|
|---|
| 580 | `^#'
|
|---|
| 581 | This matches a string beginning with `#'.
|
|---|
| 582 |
|
|---|
| 583 | `\\$'
|
|---|
| 584 | This matches a string ending with a single backslash. The regexp
|
|---|
| 585 | contains two backslashes for escaping.
|
|---|
| 586 |
|
|---|
| 587 | `\$'
|
|---|
| 588 | Instead, this matches a string consisting of a single dollar sign,
|
|---|
| 589 | because it is escaped.
|
|---|
| 590 |
|
|---|
| 591 | `[a-zA-Z0-9]'
|
|---|
| 592 | In the C locale, this matches any ASCII letters or digits.
|
|---|
| 593 |
|
|---|
| 594 | `[^ tab]\+'
|
|---|
| 595 | (Here `tab' stands for a single tab character.) This matches a
|
|---|
| 596 | string of one or more characters, none of which is a space or a
|
|---|
| 597 | tab. Usually this means a word.
|
|---|
| 598 |
|
|---|
| 599 | `^\(.*\)\n\1$'
|
|---|
| 600 | This matches a string consisting of two equal substrings separated
|
|---|
| 601 | by a newline.
|
|---|
| 602 |
|
|---|
| 603 | `.\{9\}A$'
|
|---|
| 604 | This matches nine characters followed by an `A' at the end of a line.
|
|---|
| 605 |
|
|---|
| 606 | `^.\{15\}A'
|
|---|
| 607 | This matches the start of a string that contains 16 characters,
|
|---|
| 608 | the last of which is an `A'.
|
|---|
| 609 |
|
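The expressions above can be tried directly from a shell.  The following is a minimal sketch, assuming GNU `sed' and a POSIX shell; the sample input lines and the variable names are invented for this illustration.

```shell
# Sample input, invented for this illustration.
input='main(argc, argv)
# a comment line
ab
aaabbb
price$'

# `^main.*(.*)': lines starting with `main' that contain `(' ... `)'.
r1=$(printf '%s\n' "$input" | sed -n '/^main.*(.*)/p')

# `^#': lines beginning with `#'.
r2=$(printf '%s\n' "$input" | sed -n '/^#/p')

# `a\+b\+', anchored here with `^' and `$' so the whole line must match
# (`\+' is a GNU extension).
r3=$(printf '%s\n' "$input" | sed -n '/^a\+b\+$/p')

# `\$': a literal dollar sign anywhere in the line.
r4=$(printf '%s\n' "$input" | sed -n '/\$/p')

printf '%s\n' "$r1" "$r2" "$r3" "$r4"
```

Using `-n' together with the `p' command prints only the lines that the address matches, which makes it a convenient way to test a regular expression.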
|---|
| 610 |
|
|---|
| 611 |
|
|---|
| 612 | File: sed.info, Node: Common Commands, Next: The "s" Command, Prev: Regular Expressions, Up: sed Programs
|
|---|
| 613 |
|
|---|
| 614 | Often-Used Commands
|
|---|
| 615 | ===================
|
|---|
| 616 |
|
|---|
| 617 | If you use `sed' at all, you will quite likely want to know these
|
|---|
| 618 | commands.
|
|---|
| 619 |
|
|---|
| 620 | `#'
|
|---|
| 621 | [No addresses allowed.]
|
|---|
| 622 |
|
|---|
| 623 | The `#' character begins a comment; the comment continues until
|
|---|
| 624 | the next newline.
|
|---|
| 625 |
|
|---|
| 626 | If you are concerned about portability, be aware that some
|
|---|
| 627 | implementations of `sed' (which are not POSIX conformant) may only
|
|---|
| 628 | support a single one-line comment, and then only when the very
|
|---|
| 629 | first character of the script is a `#'.
|
|---|
| 630 |
|
|---|
| 631 | Warning: if the first two characters of the `sed' script are `#n',
|
|---|
| 632 | then the `-n' (no-autoprint) option is forced. If you want to put
|
|---|
| 633 | a comment in the first line of your script and that comment begins
|
|---|
| 634 | with the letter `n' and you do not want this behavior, then be
|
|---|
| 635 | sure to either use a capital `N', or place at least one space
|
|---|
| 636 | before the `n'.
|
|---|
| 637 |
|
|---|
| 638 | `q [EXIT-CODE]'
|
|---|
| 639 | This command only accepts a single address.
|
|---|
| 640 |
|
|---|
| 641 | Exit `sed' without processing any more commands or input. Note
|
|---|
| 642 | that the current pattern space is printed if auto-print is not
|
|---|
| 643 | disabled with the `-n' option. The ability to return an exit
|
|---|
| 644 | code from the `sed' script is a GNU `sed' extension.
|
|---|
| 645 |
|
|---|
| 646 | `d'
|
|---|
| 647 | Delete the pattern space; immediately start next cycle.
|
|---|
| 648 |
|
|---|
| 649 | `p'
|
|---|
| 650 | Print out the pattern space (to the standard output). This
|
|---|
| 651 | command is usually only used in conjunction with the `-n'
|
|---|
| 652 | command-line option.
|
|---|
| 653 |
|
|---|
| 654 | `n'
|
|---|
| 655 | If auto-print is not disabled, print the pattern space, then,
|
|---|
| 656 | regardless, replace the pattern space with the next line of input.
|
|---|
| 657 | If there is no more input then `sed' exits without processing any
|
|---|
| 658 | more commands.
|
|---|
| 659 |
|
|---|
| 660 | `{ COMMANDS }'
|
|---|
| 661 | A group of commands may be enclosed between `{' and `}' characters.
|
|---|
| 662 | This is particularly useful when you want a group of commands to
|
|---|
| 663 | be triggered by a single address (or address-range) match.
|
|---|
| 664 |
|
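As a rough illustration of the commands in this section, the sketch below runs each of them on a few invented input lines.  It assumes GNU `sed' (the exit code given to `q' is a GNU extension) and a POSIX shell; the variable names are hypothetical.

```shell
# `q': print lines up to and including line 2, then quit.
first_two=$(printf 'one\ntwo\nthree\n' | sed 2q)

# `q' with an exit code (GNU extension); `-n' suppresses the auto-print.
status=0
printf 'x\n' | sed -n 'q5' || status=$?

# `d': delete every line containing `two'.
no_two=$(printf 'one\ntwo\nthree\n' | sed '/two/d')

# `p' with `-n': only explicitly printed lines appear.
only_two=$(printf 'one\ntwo\nthree\n' | sed -n '/two/p')

# `n' prints the current line (auto-print is on) and fetches the next
# one, so the `d' that follows deletes every second line.
odd_lines=$(printf '1\n2\n3\n4\n' | sed 'n;d')

# `{ ... }': apply two commands to lines 2 through 3 only.
grouped=$(printf 'a\nb\nc\nd\n' | sed -n '2,3{s/^/> /;p;}')

printf '%s\n' "$first_two" "$status" "$no_two" "$only_two" "$odd_lines" "$grouped"
```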
|---|
| 665 |
|
|---|
| 666 |
|
|---|
| 667 | File: sed.info, Node: The "s" Command, Next: Other Commands, Prev: Common Commands, Up: sed Programs
|
|---|
| 668 |
|
|---|
| 669 | The `s' Command
|
|---|
| 670 | ===============
|
|---|
| 671 |
|
|---|
| 672 | The syntax of the `s' (as in substitute) command is
|
|---|
| 673 | `s/REGEXP/REPLACEMENT/FLAGS'. The `/' characters may be uniformly
|
|---|
| 674 | replaced by any other single character within any given `s' command.
|
|---|
| 675 | The `/' character (or whatever other character is used in its stead)
|
|---|
| 676 | can appear in the REGEXP or REPLACEMENT only if it is preceded by a `\'
|
|---|
| 677 | character.
|
|---|
| 678 |
|
|---|
| 679 | The `s' command is probably the most important in `sed' and has a
|
|---|
| 680 | lot of different options. Its basic concept is simple: the `s' command
|
|---|
| 681 | attempts to match the pattern space against the supplied REGEXP; if the
|
|---|
| 682 | match is successful, then that portion of the pattern space which was
|
|---|
| 683 | matched is replaced with REPLACEMENT.
|
|---|
| 684 |
|
|---|
| 685 | The REPLACEMENT can contain `\N' (N being a number from 1 to 9,
|
|---|
| 686 | inclusive) references, which refer to the portion of the match which is
|
|---|
| 687 | contained between the Nth `\(' and its matching `\)'. Also, the
|
|---|
| 688 | REPLACEMENT can contain unescaped `&' characters which reference the
|
|---|
| 689 | whole matched portion of the pattern space. Finally, as a GNU `sed'
|
|---|
| 690 | extension, you can include a special sequence made of a backslash and
|
|---|
| 691 | one of the letters `L', `l', `U', `u', or `E'. The meaning is as
|
|---|
| 692 | follows:
|
|---|
| 693 |
|
|---|
| 694 | `\L'
|
|---|
| 695 | Turn the replacement to lowercase until a `\U' or `\E' is found,
|
|---|
| 696 |
|
|---|
| 697 | `\l'
|
|---|
| 698 | Turn the next character to lowercase,
|
|---|
| 699 |
|
|---|
| 700 | `\U'
|
|---|
| 701 | Turn the replacement to uppercase until a `\L' or `\E' is found,
|
|---|
| 702 |
|
|---|
| 703 | `\u'
|
|---|
| 704 | Turn the next character to uppercase,
|
|---|
| 705 |
|
|---|
| 706 | `\E'
|
|---|
| 707 | Stop case conversion started by `\L' or `\U'.
|
|---|
| 708 |
|
|---|
| 709 | To include a literal `\', `&', or newline in the final replacement,
|
|---|
| 710 | be sure to precede the desired `\', `&', or newline in the REPLACEMENT
|
|---|
| 711 | with a `\'.
|
|---|
| 712 |
|
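The replacement features described above can be sketched as follows.  This assumes GNU `sed' for the case-conversion sequences; the input strings and variable names are invented for illustration.

```shell
# `\1' and `\2' refer to the first and second `\(...\)' groups.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped `&' stands for the whole matched text.
bracketed=$(echo 'hello world' | sed 's/w[a-z]*/[&]/')

# GNU extension: `\U' upper-cases until `\E'; `\u' upper-cases one character.
shouted=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\U\1\E \u\2/')

# A literal `&' in the replacement must be preceded by `\'.
amp=$(echo 'salt and pepper' | sed 's/ and / \& /')

# Another delimiter avoids escaping every `/' in a path.
path=$(echo '/usr/bin/sed' | sed 's|/usr/bin|/usr/local/bin|')

printf '%s\n' "$swapped" "$bracketed" "$shouted" "$amp" "$path"
```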
|---|
| 713 | The `s' command can be followed by zero or more of the following
|
|---|
| 714 | FLAGS:
|
|---|
| 715 |
|
|---|
| 716 | `g'
|
|---|
| 717 | Apply the replacement to _all_ matches to the REGEXP, not just the
|
|---|
| 718 | first.
|
|---|
| 719 |
|
|---|
| 720 | `NUMBER'
|
|---|
| 721 | Only replace the NUMBERth match of the REGEXP.
|
|---|
| 722 |
|
|---|
| 723 | Note: the POSIX standard does not specify what should happen when
|
|---|
| 724 | you mix the `g' and NUMBER modifiers, and currently there is no
|
|---|
| 725 | widely agreed upon meaning across `sed' implementations. For GNU
|
|---|
| 726 | `sed', the interaction is defined to be: ignore matches before the
|
|---|
| 727 | NUMBERth, and then match and replace all matches from the NUMBERth
|
|---|
| 728 | on.
|
|---|
| 729 |
|
|---|
| 730 | `p'
|
|---|
| 731 | If the substitution was made, then print the new pattern space.
|
|---|
| 732 |
|
|---|
| 733 | Note: when both the `p' and `e' options are specified, the
|
|---|
| 734 | relative ordering of the two produces very different results. In
|
|---|
| 735 | general, `ep' (evaluate then print) is what you want, but
|
|---|
| 736 | operating the other way round can be useful for debugging. For
|
|---|
| 737 | this reason, the current version of GNU `sed' interprets specially
|
|---|
| 738 | the presence of `p' options both before and after `e', printing
|
|---|
| 739 | the pattern space before and after evaluation, while in general
|
|---|
| 740 | flags for the `s' command show their effect just once. This
|
|---|
| 741 | behavior, although documented, might change in future versions.
|
|---|
| 742 |
|
|---|
| 743 | `w FILE-NAME'
|
|---|
| 744 | If the substitution was made, then write out the result to the
|
|---|
| 745 | named file. As a GNU `sed' extension, two special values of
|
|---|
| 746 | FILE-NAME are supported: `/dev/stderr', which writes the result to
|
|---|
| 747 | the standard error, and `/dev/stdout', which writes to the standard
|
|---|
| 748 | output.(1)
|
|---|
| 749 |
|
|---|
| 750 | `e'
|
|---|
| 751 | This command allows one to pipe input from a shell command into
|
|---|
| 752 | pattern space. If a substitution was made, the command that is
|
|---|
| 753 | found in pattern space is executed and pattern space is replaced
|
|---|
| 754 | with its output. A trailing newline is suppressed; results are
|
|---|
| 755 | undefined if the command to be executed contains a NUL character.
|
|---|
| 756 | This is a GNU `sed' extension.
|
|---|
| 757 |
|
|---|
| 758 | `I'
|
|---|
| 759 | `i'
|
|---|
| 760 | The `I' modifier to regular-expression matching is a GNU extension
|
|---|
| 761 | which makes `sed' match REGEXP in a case-insensitive manner.
|
|---|
| 762 |
|
|---|
| 763 | `M'
|
|---|
| 764 | `m'
|
|---|
| 765 | The `M' modifier to regular-expression matching is a GNU `sed'
|
|---|
| 766 | extension which causes `^' and `$' to match respectively (in
|
|---|
| 767 | addition to the normal behavior) the empty string after a newline,
|
|---|
| 768 | and the empty string before a newline. There are special character
|
|---|
| 769 | sequences (`\`' and `\'') which always match the beginning or the
|
|---|
| 770 | end of the buffer. `M' stands for `multi-line'.
|
|---|
| 771 |
|
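The effect of the flags listed above is easiest to see side by side.  The sketch below assumes GNU `sed', since the `NUMBERg' combination, `I', `M' and the special `/dev/stdout' file name are GNU behavior; the inputs and variable names are invented.

```shell
# `g': replace every match.
all=$(echo 'a-a-a-a' | sed 's/a/X/g')

# NUMBER: replace only the second match.
second=$(echo 'a-a-a-a' | sed 's/a/X/2')

# NUMBER plus `g' (GNU behavior): replace the second match and all later ones.
from2=$(echo 'a-a-a-a' | sed 's/a/X/2g')

# `p': print the pattern space only when a substitution was made.
printed=$(printf 'foo\nbar\n' | sed -n 's/o/0/p')

# `w /dev/stdout': write the result of a successful substitution to stdout.
written=$(printf 'foo\nbar\n' | sed -n 's/foo/FOO/w /dev/stdout')

# `I': match case-insensitively.
nocase=$(echo 'Hello' | sed 's/hello/bye/I')

# `M': after `N' joins two lines, `^' also matches right after the newline.
multi=$(printf 'x\ny\n' | sed 'N;s/^y/Y/M')

printf '%s\n' "$all" "$second" "$from2" "$printed" "$written" "$nocase" "$multi"
```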
|---|
| 772 |
|
|---|
| 773 | ---------- Footnotes ----------
|
|---|
| 774 |
|
|---|
| 775 | (1) This is equivalent to `p' unless the `-i' option is being used.
|
|---|
| 776 |
|
|---|
| 777 |
|
|---|
| 778 | File: sed.info, Node: Other Commands, Next: Programming Commands, Prev: The "s" Command, Up: sed Programs
|
|---|
| 779 |
|
|---|
| 780 | Less Frequently-Used Commands
|
|---|
| 781 | =============================
|
|---|
| 782 |
|
|---|
| 783 | Though perhaps less frequently used than those in the previous
|
|---|
| 784 | section, some very small yet useful `sed' scripts can be built with
|
|---|
| 785 | these commands.
|
|---|
| 786 |
|
|---|
| 787 | `y/SOURCE-CHARS/DEST-CHARS/'
|
|---|
| 788 | (The `/' characters may be uniformly replaced by any other single
|
|---|
| 789 | character within any given `y' command.)
|
|---|
| 790 |
|
|---|
| 791 | Transliterate any characters in the pattern space which match any
|
|---|
| 792 | of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
|
|---|
| 793 |
|
|---|
| 794 | Instances of the `/' (or whatever other character is used in its
|
|---|
| 795 | stead), `\', or newlines can appear in the SOURCE-CHARS or
|
|---|
| 796 | DEST-CHARS lists, provided that each instance is escaped by a `\'.
|
|---|
| 797 | The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
|
|---|
| 798 | number of characters (after de-escaping).
|
|---|
| 799 |
|
|---|
| 800 | `a\'
|
|---|
| 801 | `TEXT'
|
|---|
| 802 | As a GNU extension, this command accepts two addresses.
|
|---|
| 803 |
|
|---|
| 804 | Queue the lines of text which follow this command (each but the
|
|---|
| 805 | last ending with a `\', which are removed from the output) to be
|
|---|
| 806 | output at the end of the current cycle, or when the next input
|
|---|
| 807 | line is read.
|
|---|
| 808 |
|
|---|
| 809 | Escape sequences in TEXT are processed, so you should use `\\' in
|
|---|
| 810 | TEXT to print a single backslash.
|
|---|
| 811 |
|
|---|
| 812 | As a GNU extension, if between the `a' and the newline there is
|
|---|
| 813 | other than a whitespace-`\' sequence, then the text of this line,
|
|---|
| 814 | starting at the first non-whitespace character after the `a', is
|
|---|
| 815 | taken as the first line of the TEXT block. (This enables a
|
|---|
| 816 | simplification in scripting a one-line add.) This extension also
|
|---|
| 817 | works with the `i' and `c' commands.
|
|---|
| 818 |
|
|---|
| 819 | `i\'
|
|---|
| 820 | `TEXT'
|
|---|
| 821 | As a GNU extension, this command accepts two addresses.
|
|---|
| 822 |
|
|---|
| 823 | Immediately output the lines of text which follow this command
|
|---|
| 824 | (each but the last ending with a `\', which are removed from the
|
|---|
| 825 | output).
|
|---|
| 826 |
|
|---|
| 827 | `c\'
|
|---|
| 828 | `TEXT'
|
|---|
| 829 | Delete the lines matching the address or address-range, and output
|
|---|
| 830 | the lines of text which follow this command (each but the last
|
|---|
| 831 | ending with a `\', which are removed from the output) in place of
|
|---|
| 832 | the last line (or in place of each line, if no addresses were
|
|---|
| 833 | specified). A new cycle is started after this command is done,
|
|---|
| 834 | since the pattern space will have been deleted.
|
|---|
| 835 |
|
|---|
| 836 | `='
|
|---|
| 837 | As a GNU extension, this command accepts two addresses.
|
|---|
| 838 |
|
|---|
| 839 | Print out the current input line number (with a trailing newline).
|
|---|
| 840 |
|
|---|
| 841 | `l N'
|
|---|
| 842 | Print the pattern space in an unambiguous form: non-printable
|
|---|
| 843 | characters (and the `\' character) are printed in C-style escaped
|
|---|
| 844 | form; long lines are split, with a trailing `\' character to
|
|---|
| 845 | indicate the split; the end of each line is marked with a `$'.
|
|---|
| 846 |
|
|---|
| 847 | N specifies the desired line-wrap length; a length of 0 (zero)
|
|---|
| 848 | means to never wrap long lines. If omitted, the default as
|
|---|
| 849 | specified on the command line is used. The N parameter is a GNU
|
|---|
| 850 | `sed' extension.
|
|---|
| 851 |
|
|---|
| 852 | `r FILENAME'
|
|---|
| 853 | As a GNU extension, this command accepts two addresses.
|
|---|
| 854 |
|
|---|
| 855 | Queue the contents of FILENAME to be read and inserted into the
|
|---|
| 856 | output stream at the end of the current cycle, or when the next
|
|---|
| 857 | input line is read. Note that if FILENAME cannot be read, it is
|
|---|
| 858 | treated as if it were an empty file, without any error indication.
|
|---|
| 859 |
|
|---|
| 860 | As a GNU `sed' extension, the special value `/dev/stdin' is
|
|---|
| 861 | supported for the file name, which reads the contents of the
|
|---|
| 862 | standard input.
|
|---|
| 863 |
|
|---|
| 864 | `w FILENAME'
|
|---|
| 865 | Write the pattern space to FILENAME. As a GNU `sed' extension,
|
|---|
| 866 | two special values of FILENAME are supported: `/dev/stderr',
|
|---|
| 867 | which writes the result to the standard error, and `/dev/stdout',
|
|---|
| 868 | which writes to the standard output.(1)
|
|---|
| 869 |
|
|---|
| 870 | The file will be created (or truncated) before the first input
|
|---|
| 871 | line is read; all `w' commands (including instances of `w' flag on
|
|---|
| 872 | successful `s' commands) which refer to the same FILENAME are
|
|---|
| 873 | output without closing and reopening the file.
|
|---|
| 874 |
|
|---|
| 875 | `D'
|
|---|
| 876 | Delete text in the pattern space up to the first newline. If any
|
|---|
| 877 | text is left, restart cycle with the resultant pattern space
|
|---|
| 878 | (without reading a new line of input), otherwise start a normal
|
|---|
| 879 | new cycle.
|
|---|
| 880 |
|
|---|
| 881 | `N'
|
|---|
| 882 | Add a newline to the pattern space, then append the next line of
|
|---|
| 883 | input to the pattern space. If there is no more input then `sed'
|
|---|
| 884 | exits without processing any more commands.
|
|---|
| 885 |
|
|---|
| 886 | `P'
|
|---|
| 887 | Print out the portion of the pattern space up to the first newline.
|
|---|
| 888 |
|
|---|
| 889 | `h'
|
|---|
| 890 | Replace the contents of the hold space with the contents of the
|
|---|
| 891 | pattern space.
|
|---|
| 892 |
|
|---|
| 893 | `H'
|
|---|
| 894 | Append a newline to the contents of the hold space, and then
|
|---|
| 895 | append the contents of the pattern space to that of the hold space.
|
|---|
| 896 |
|
|---|
| 897 | `g'
|
|---|
| 898 | Replace the contents of the pattern space with the contents of the
|
|---|
| 899 | hold space.
|
|---|
| 900 |
|
|---|
| 901 | `G'
|
|---|
| 902 | Append a newline to the contents of the pattern space, and then
|
|---|
| 903 | append the contents of the hold space to that of the pattern space.
|
|---|
| 904 |
|
|---|
| 905 | `x'
|
|---|
| 906 | Exchange the contents of the hold and pattern spaces.
|
|---|
| 907 |
|
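A few of the commands above, shown on invented input.  This is a minimal sketch assuming GNU `sed'; it uses the one-line GNU forms of `a', `i' and `c' rather than the `a\' form followed by TEXT on the next line.

```shell
# `y': transliterate lower case to upper case.
upper=$(echo 'hello' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# `a': queue text to be output after line 1.
appended=$(printf 'one\ntwo\n' | sed '1a after one')

# `i': output text immediately, before line 2.
inserted=$(printf 'one\ntwo\n' | sed '2i before two')

# `c': replace line 2 with the given text.
changed=$(printf 'one\ntwo\n' | sed '2c TWO')

# `=': print the line number; on the last line this counts the lines.
count=$(printf 'x\ny\n' | sed -n '$=')

# `l': show the pattern space unambiguously (tab as \t, end marked by $).
visible=$(printf 'a\tb\n' | sed -n 'l')

printf '%s\n' "$upper" "$appended" "$inserted" "$changed" "$count" "$visible"
```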
|---|
| 908 |
|
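The hold-space and multi-line commands are usually combined.  The following sketch, on invented input and with hypothetical variable names, shows four small idioms built from them; it assumes GNU `sed' for the one-line `{...}' group syntax.

```shell
# `G' and `h': accumulate lines in reverse order in the hold space,
# then print the result on the last line.
reversed=$(printf '1\n2\n3\n' | sed -n '1!G;h;$p')

# `N': append the next line, then join the pair with a space.
paired=$(printf 'a\nb\nc\nd\n' | sed 'N;s/\n/ /')

# `N', `P' and `D': keep a two-line window and print the first line
# only when it differs from the second, dropping adjacent duplicates.
deduped=$(printf 'a\na\nb\n' | sed '$!N;/^\(.*\)\n\1$/!P;D')

# `h' and `x': remember the previous line in the hold space and print
# it when the current line matches.
previous=$(printf 'one\ntwo\nthree\n' | sed -n '/three/{x;p;x;};h')

printf '%s\n' "$reversed" "$paired" "$deduped" "$previous"
```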
|---|
| 909 | ---------- Footnotes ----------
|
|---|
| 910 |
|
|---|
| 911 | (1) This is equivalent to `p' unless the `-i' option is being used.
|
|---|
| 912 |
|
|---|
| 913 |
|
|---|
| 914 | File: sed.info, Node: Programming Commands, Next: Extended Commands, Prev: Other Commands, Up: sed Programs
|
|---|
| 915 |
|
|---|
| 916 | Commands for `sed' gurus
|
|---|
| 917 | ========================
|
|---|
| 918 |
|
|---|
| 919 | In most cases, use of these commands indicates that you are probably
|
|---|
| 920 | better off programming in something like `awk' or Perl. But
|
|---|
| 921 | occasionally one is committed to sticking with `sed', and these
|
|---|
| 922 | commands can enable one to write quite convoluted scripts.
|
|---|
| 923 |
|
|---|
| 924 | `: LABEL'
|
|---|
| 925 | [No addresses allowed.]
|
|---|
| 926 |
|
|---|
| 927 | Specify the location of LABEL for branch commands. In all other
|
|---|
| 928 | respects, a no-op.
|
|---|
| 929 |
|
|---|
| 930 | `b LABEL'
|
|---|
| 931 | Unconditionally branch to LABEL. The LABEL may be omitted, in
|
|---|
| 932 | which case the next cycle is started.
|
|---|
| 933 |
|
|---|
| 934 | `t LABEL'
|
|---|
| 935 | Branch to LABEL only if there has been a successful `s'ubstitution
|
|---|
| 936 | since the last input line was read or conditional branch was taken.
|
|---|
| 937 | The LABEL may be omitted, in which case the next cycle is started.
|
|---|
| 938 |
|
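Two small uses of labels and branches, as a sketch on invented input.  Each part of the script is passed in its own `-e' option so that the label name ends at the end of the option, which avoids relying on how a given `sed' treats a semicolon after a label; `\+' in the first regexp is a GNU extension.

```shell
# `:a' ... `ta': repeat the substitution until it no longer succeeds,
# inserting one thousands separator per pass.
commas=$(echo '1234567' |
  sed -e ':a' -e 's/^\([0-9]\+\)\([0-9]\{3\}\)/\1,\2/' -e 'ta')

# `b' without a label: skip the rest of the script for comment lines.
skipped=$(printf '# banana\nbanana\n' | sed -e '/^#/b' -e 's/a/A/g')

printf '%s\n' "$commas" "$skipped"
```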
|---|
| 939 |
|
|---|
| 940 |
|
|---|
| 941 | File: sed.info, Node: Extended Commands, Next: Escapes, Prev: Programming Commands, Up: sed Programs
|
|---|
| 942 |
|
|---|
| 943 | Commands Specific to GNU `sed'
|
|---|
| 944 | ==============================
|
|---|
| 945 |
|
|---|
| 946 | These commands are specific to GNU `sed', so you must use them with
|
|---|
| 947 | care and only when you are sure that hindering portability is not evil.
|
|---|
| 948 | They allow you to check for GNU `sed' extensions or to do tasks that
|
|---|
| 949 | are required quite often, yet are unsupported by standard `sed's.
|
|---|
| 950 |
|
|---|
| 951 | `e [COMMAND]'
|
|---|
| 952 | This command allows one to pipe input from a shell command into
|
|---|
| 953 | pattern space. Without parameters, the `e' command executes the
|
|---|
| 954 | command that is found in pattern space and replaces the pattern
|
|---|
| 955 | space with the output; a trailing newline is suppressed.
|
|---|
| 956 |
|
|---|
| 957 | If a parameter is specified, instead, the `e' command interprets
|
|---|
| 958 | it as a command and sends its output to the output stream (like
|
|---|
| 959 | `r' does). The command can run across multiple lines, all but the
|
|---|
| 960 | last ending with a back-slash.
|
|---|
| 961 |
|
|---|
| 962 | In both cases, the results are undefined if the command to be
|
|---|
| 963 | executed contains a NUL character.
|
|---|
| 964 |
|
|---|
| 965 | `L N'
|
|---|
| 966 | This GNU `sed' extension fills and joins lines in pattern space to
|
|---|
| 967 | produce output lines of (at most) N characters, like `fmt' does;
|
|---|
| 968 | if N is omitted, the default as specified on the command line is
|
|---|
| 969 | used. This command is considered a failed experiment and unless
|
|---|
| 970 | there is enough demand (which seems unlikely) will be removed in
|
|---|
| 971 | future versions.
|
|---|
| 972 |
|
|---|
| 973 | `Q [EXIT-CODE]'
|
|---|
| 974 | This command only accepts a single address.
|
|---|
| 975 |
|
|---|
| 976 | This command is the same as `q', but will not print the contents
|
|---|
| 977 | of pattern space. Like `q', it provides the ability to return an
|
|---|
| 978 | exit code to the caller.
|
|---|
| 979 |
|
|---|
| 980 | This command can be useful because the only alternative ways to
|
|---|
| 981 | accomplish this apparently trivial function are to use the `-n'
|
|---|
| 982 | option (which can unnecessarily complicate your script) or
|
|---|
| 983 | resorting to the following snippet, which wastes time by reading
|
|---|
| 984 | the whole file without any visible effect:
|
|---|
| 985 |
|
|---|
| 986 | :eat
|
|---|
| 987 | $d       # Quit silently on the last line
|
|---|
| 988 | N        # Read another line, silently
|
|---|
| 989 | g        # Overwrite pattern space each time to save memory
|
|---|
| 990 | b eat
|
|---|
| 991 |
|
|---|
| 992 | `R FILENAME'
|
|---|
| 993 | Queue a line of FILENAME to be read and inserted into the output
|
|---|
| 994 | stream at the end of the current cycle, or when the next input
|
|---|
| 995 | line is read. Note that if FILENAME cannot be read, or if its end
|
|---|
| 996 | is reached, no line is appended, without any error indication.
|
|---|
| 997 |
|
|---|
| 998 | As with the `r' command, the special value `/dev/stdin' is
|
|---|
| 999 | supported for the file name, which reads a line from the standard
|
|---|
| 1000 | input.
|
|---|
| 1001 |
|
|---|
| 1002 | `T LABEL'
|
|---|
| 1003 | Branch to LABEL only if there have been no successful
|
|---|
| 1004 | `s'ubstitutions since the last input line was read or conditional
|
|---|
| 1005 | branch was taken. The LABEL may be omitted, in which case the next
|
|---|
| 1006 | cycle is started.
|
|---|
| 1007 |
|
|---|
| 1008 | `v VERSION'
|
|---|
| 1009 | This command does nothing, but makes `sed' fail if GNU `sed'
|
|---|
| 1010 | extensions are not supported, simply because other versions of
|
|---|
| 1011 | `sed' do not implement it. In addition, you can specify the
|
|---|
| 1012 | version of `sed' that your script requires, such as `4.0.5'. The
|
|---|
| 1013 | default is `4.0' because that is the first version that
|
|---|
| 1014 | implemented this command.
|
|---|
| 1015 |
|
|---|
| 1016 | This command enables all GNU extensions even if `POSIXLY_CORRECT'
|
|---|
| 1017 | is set in the environment.
|
|---|
| 1018 |
|
|---|
| 1019 | `W FILENAME'
|
|---|
| 1020 | Write to the given filename the portion of the pattern space up to
|
|---|
| 1021 | the first newline. Everything said under the `w' command about
|
|---|
| 1022 | file handling holds here too.
|
|---|
| 1023 |
|
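The sketch below exercises several of these GNU-only commands on invented input.  It assumes GNU `sed' 4.0 or later and that `mktemp' is available; the temporary file and the variable names are hypothetical.

```shell
# `Q': quit at the matching line without printing it (`q' would print it).
before=$(printf 'keep\nSTOP\ndropped\n' | sed '/STOP/Q')

# `T' without a label: skip the rest of the script when the
# preceding substitution did not succeed.
marked=$(printf 'xa\nyb\n' | sed -e 's/^x/X/' -e 'T' -e 's/$/ (changed)/')

# `e' without a parameter: run the pattern space as a shell command
# and replace it with the command's output.
ran=$(echo 'echo hello from the shell' | sed 'e')

# `v': do nothing, but fail on a sed lacking GNU extensions.
checked=$(echo 'ok' | sed -e 'v 4.0' -e 's/ok/OK/')

# `R': after each input line, output one further line of another file.
tmp=$(mktemp)
printf 'A\nB\n' > "$tmp"
interleaved=$(printf '1\n2\n' | sed "R $tmp")
rm -f "$tmp"

printf '%s\n' "$before" "$marked" "$ran" "$checked" "$interleaved"
```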
|---|
| 1024 |
|
|---|
| 1025 | File: sed.info, Node: Escapes, Prev: Extended Commands, Up: sed Programs
|
|---|
| 1026 |
|
|---|
| 1027 | GNU Extensions for Escapes in Regular Expressions
|
|---|
| 1028 | =================================================
|
|---|
| 1029 |
|
|---|
| 1030 | Until this chapter, we have only encountered escapes of the form
|
|---|
| 1031 | `\^', which tell `sed' not to interpret the circumflex as a special
|
|---|
| 1032 | character, but rather to take it literally. For example, `\*' matches
|
|---|
| 1033 | a single asterisk rather than zero or more backslashes.
|
|---|
| 1034 |
|
|---|
| 1035 | This chapter introduces another kind of escape(1)--that is, escapes
|
|---|
| 1036 | that are applied to a character or sequence of characters that
|
|---|
| 1037 | ordinarily are taken literally, and that `sed' replaces with a special
|
|---|
| 1038 | character. This provides a way of encoding non-printable characters in
|
|---|
| 1039 | patterns in a visible manner. There is no restriction on the
|
|---|
| 1040 | appearance of non-printing characters in a `sed' script but when a
|
|---|
| 1041 | script is being prepared in the shell or by text editing, it is usually
|
|---|
| 1042 | easier to use one of the following escape sequences than the binary
|
|---|
| 1043 | character it represents.
|
|---|
| 1044 |
|
|---|
| 1045 | The list of these escapes is:
|
|---|
| 1046 |
|
|---|
| 1047 | `\a'
|
|---|
| 1048 | Produces or matches a BEL character, that is an "alert" (ASCII 7).
|
|---|
| 1049 |
|
|---|
| 1050 | `\f'
|
|---|
| 1051 | Produces or matches a form feed (ASCII 12).
|
|---|
| 1052 |
|
|---|
| 1053 | `\n'
|
|---|
| 1054 | Produces or matches a newline (ASCII 10).
|
|---|
| 1055 |
|
|---|
| 1056 | `\r'
|
|---|
| 1057 | Produces or matches a carriage return (ASCII 13).
|
|---|
| 1058 |
|
|---|
| 1059 | `\t'
|
|---|
| 1060 | Produces or matches a horizontal tab (ASCII 9).
|
|---|
| 1061 |
|
|---|
| 1062 | `\v'
|
|---|
| 1063 | Produces or matches a so called "vertical tab" (ASCII 11).
|
|---|
| 1064 |
|
|---|
| 1065 | `\cX'
|
|---|
| 1066 | Produces or matches `CONTROL-X', where X is any character. The
|
|---|
| 1067 | precise effect of `\cX' is as follows: if X is a lower case
|
|---|
| 1068 | letter, it is converted to upper case. Then bit 6 of the
|
|---|
| 1069 | character (hex 40) is inverted. Thus `\cz' becomes hex 1A, but
|
|---|
| 1070 | `\c{' becomes hex 3B, while `\c;' becomes hex 7B.
|
|---|
| 1071 |
|
|---|
| 1072 | `\dXXX'
|
|---|
| 1073 | Produces or matches a character whose decimal ASCII value is XXX.
|
|---|
| 1074 |
|
|---|
| 1075 | `\oXXX'
|
|---|
| 1076 | Produces or matches a character whose octal ASCII value is XXX.
|
|---|
| 1077 |
|
|---|
| 1078 | `\xXX'
|
|---|
| 1079 | Produces or matches a character whose hexadecimal ASCII value is
|
|---|
| 1080 | XX.
|
|---|
| 1081 |
|
|---|
| 1082 | `\b' (backspace) was omitted because of the conflict with the
|
|---|
| 1083 | existing "word boundary" meaning.
|
|---|
| 1084 |
|
|---|
| 1085 | Other escapes match a particular character class and are valid only
|
|---|
| 1086 | in regular expressions:
|
|---|
| 1087 |
|
|---|
| 1088 | `\w'
|
|---|
| 1089 | Matches any "word" character. A "word" character is any letter or
|
|---|
| 1090 | digit or the underscore character.
|
|---|
| 1091 |
|
|---|
| 1092 | `\W'
|
|---|
| 1093 | Matches any "non-word" character.
|
|---|
| 1094 |
|
|---|
| 1095 | `\b'
|
|---|
| 1096 | Matches a word boundary; that is it matches if the character to
|
|---|
| 1097 | the left is a "word" character and the character to the right is a
|
|---|
| 1098 | "non-word" character, or vice-versa.
|
|---|
| 1099 |
|
|---|
| 1100 | `\B'
|
|---|
| 1101 | Matches everywhere but on a word boundary; that is it matches if
|
|---|
| 1102 | the character to the left and the character to the right are
|
|---|
| 1103 | either both "word" characters or both "non-word" characters.
|
|---|
| 1104 |
|
|---|
| 1105 | `\`'
|
|---|
| 1106 | Matches only at the start of pattern space. This is different
|
|---|
| 1107 | from `^' in multi-line mode.
|
|---|
| 1108 |
|
|---|
| 1109 | `\''
|
|---|
| 1110 | Matches only at the end of pattern space. This is different from
|
|---|
| 1111 | `$' in multi-line mode.
|
|---|
| 1112 |
|
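A brief sketch of some of the escapes above, assuming GNU `sed' (all of them except `\n' are GNU extensions); the inputs and variable names are invented.

```shell
# `\t' in a regexp matches a tab.
untabbed=$(printf 'a\tb\n' | sed 's/\t/ /')

# `\xXX' in a replacement produces the character with that hex value.
hexed=$(echo 'a' | sed 's/a/\x41/')

# `\w' matches a word character; `\+' repeats it.
words=$(echo 'hello, world' | sed 's/\w\+/X/g')

# `\b' matches at a word boundary, so `cat' inside `concat' is untouched.
bounded=$(echo 'cat concat' | sed 's/\bcat\b/dog/g')

# In multi-line mode, `^' matches at the start of every line in the
# pattern space, while \` matches only at the very start of it.
each_line=$(printf 'a\nb\n' | sed 'N;s/^/> /Mg')
buffer_start=$(printf 'a\nb\n' | sed 'N;s/\`/> /M')

printf '%s\n' "$untabbed" "$hexed" "$words" "$bounded" "$each_line" "$buffer_start"
```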
|---|
| 1113 |
|
|---|
| 1114 | ---------- Footnotes ----------
|
|---|
| 1115 |
|
|---|
| 1116 | (1) All the escapes introduced here are GNU extensions, with the
|
|---|
| 1117 | exception of `\n'. In basic regular expression mode, setting
|
|---|
| 1118 | `POSIXLY_CORRECT' disables them inside bracket expressions.
|
|---|
| 1119 |
|
|---|
| 1120 |
|
|---|
| 1121 | File: sed.info, Node: Examples, Next: Limitations, Prev: sed Programs, Up: Top
|
|---|
| 1122 |
|
|---|
| 1123 | Some Sample Scripts
|
|---|
| 1124 | *******************
|
|---|
| 1125 |
|
|---|
| 1126 | Here are some `sed' scripts to guide you in the art of mastering
|
|---|
| 1127 | `sed'.
|
|---|
| 1128 |
|
|---|
| 1129 | * Menu:
|
|---|
| 1130 |
|
|---|
| 1131 | Some exotic examples:
|
|---|
| 1132 | * Centering lines::
|
|---|
| 1133 | * Increment a number::
|
|---|
| 1134 | * Rename files to lower case::
|
|---|
| 1135 | * Print bash environment::
|
|---|
| 1136 | * Reverse chars of lines::
|
|---|
| 1137 |
|
|---|
| 1138 | Emulating standard utilities:
|
|---|
| 1139 | * tac:: Reverse lines of files
|
|---|
| 1140 | * cat -n:: Numbering lines
|
|---|
| 1141 | * cat -b:: Numbering non-blank lines
|
|---|
| 1142 | * wc -c:: Counting chars
|
|---|
| 1143 | * wc -w:: Counting words
|
|---|
| 1144 | * wc -l:: Counting lines
|
|---|
| 1145 | * head:: Printing the first lines
|
|---|
| 1146 | * tail:: Printing the last lines
|
|---|
| 1147 | * uniq:: Make duplicate lines unique
|
|---|
| 1148 | * uniq -d:: Print duplicated lines of input
|
|---|
| 1149 | * uniq -u:: Remove all duplicated lines
|
|---|
| 1150 | * cat -s:: Squeezing blank lines
|
|---|
| 1151 |
|
|---|
| 1152 |
|
|---|
| 1153 | File: sed.info, Node: Centering lines, Next: Increment a number, Up: Examples
|
|---|
| 1154 |
|
|---|
| 1155 | Centering Lines
|
|---|
| 1156 | ===============
|
|---|
| 1157 |
|
|---|
| 1158 | This script centers all lines of a file within an 80-column width. To
|
|---|
| 1159 | change that width, the number in `\{...\}' must be replaced, and the
|
|---|
| 1160 | number of added spaces also must be changed.
|
|---|
| 1161 |
|
|---|
| 1162 | Note how the buffer commands are used to separate parts in the
|
|---|
| 1163 | regular expressions to be matched--this is a common technique.
|
|---|
| 1164 |
|
|---|
| 1165 | #!/usr/bin/sed -f
|
|---|
| 1166 |
|
|---|
| 1167 | # Put 80 spaces in the buffer
|
|---|
| 1168 | 1 {
|
|---|
| 1169 | x
|
|---|
| 1170 | s/^$/          /
|
|---|
| 1171 | s/^.*$/&&&&&&&&/
|
|---|
| 1172 | x
|
|---|
| 1173 | }
|
|---|
| 1174 |
|
|---|
| 1175 | # del leading and trailing spaces (`tab' below stands for a tab character)
|
|---|
| 1176 | y/tab/ /
|
|---|
| 1177 | s/^ *//
|
|---|
| 1178 | s/ *$//
|
|---|
| 1179 |
|
|---|
| 1180 | # add a newline and 80 spaces to end of line
|
|---|
| 1181 | G
|
|---|
| 1182 |
|
|---|
| 1183 | # keep first 81 chars (80 + a newline)
|
|---|
| 1184 | s/^\(.\{81\}\).*$/\1/
|
|---|
| 1185 |
|
|---|
| 1186 | # \2 matches half of the spaces, which are moved to the beginning
|
|---|
| 1187 | s/^\(.*\)\n\(.*\)\2/\2\1/
|
|---|
| 1188 |
|
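To illustrate changing the width, here is a sketch of the same technique adapted to 20 columns.  It assumes GNU `sed' (for `\t' in `y'); the variable names `center20' and `centered' are hypothetical.  The 20 spaces are built by repeated doubling with `&' rather than typed literally, and the number in `\{...\}' becomes 21 (20 plus the newline).

```shell
# Centering script for a 20-column width, held in a shell variable.
center20='1 {
  x
  s/^$/ /
  s/^.*$/&&&&&/
  s/^.*$/&&&&/
  x
}
y/\t/ /
s/^ *//
s/ *$//
G
s/^\(.\{21\}\).*$/\1/
s/^\(.*\)\n\(.*\)\2/\2\1/'

# `abcd' leaves 16 columns free, so 8 spaces are moved to the front.
centered=$(printf '  abcd  \n' | sed "$center20")
printf '[%s]\n' "$centered"
```

When the free space is odd, one space is left over at the end of the line, because the last substitution can only move an exact half to the front.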
|---|
| 1189 |
|
|---|
| 1190 | File: sed.info, Node: Increment a number, Next: Rename files to lower case, Prev: Centering lines, Up: Examples
|
|---|
| 1191 |
|
|---|
| 1192 | Increment a Number
|
|---|
| 1193 | ==================
|
|---|
| 1194 |
|
|---|
| 1195 | This script is one of a few that demonstrate how to do arithmetic in
|
|---|
| 1196 | `sed'. This is indeed possible,(1) but must be done manually.
|
|---|
| 1197 |
|
|---|
| 1198 | To increment a number, you add 1 to its last digit, replacing it
|
|---|
| 1199 | by the following digit. There is one exception: when the digit is a
|
|---|
| 1200 | nine, it becomes a zero and the preceding digit must be incremented in
|
|---|
| 1201 | turn, repeating for as long as that digit is also a nine.
|
|---|
| 1202 |
|
|---|
| 1203 | This solution by Bruno Haible is very clever and smart because it
|
|---|
| 1204 | uses a single buffer; if you don't have this limitation, the algorithm
|
|---|
| 1205 | used in *Note Numbering lines: cat -n, is faster. It works by
|
|---|
| 1206 | replacing trailing nines with an underscore, then using multiple `s'
|
|---|
| 1207 | commands to increment the last digit, and then again substituting
|
|---|
| 1208 | underscores with zeros.
|
|---|
| 1209 |
|
|---|
| 1210 | #!/usr/bin/sed -f
|
|---|
| 1211 |
|
|---|
| 1212 | /[^0-9]/ d
|
|---|
| 1213 |
|
|---|
| 1214 | # replace all trailing 9s by _ (any other character except digits, could
|
|---|
| 1215 | # be used)
|
|---|
| 1216 | :d
|
|---|
| 1217 | s/9\(_*\)$/_\1/
|
|---|
| 1218 | td
|
|---|
| 1219 |
|
|---|
| 1220 | # incr last digit only. The first line adds a most-significant
|
|---|
| 1221 | # digit of 1 if we have to add a digit.
|
|---|
| 1222 | #
|
|---|
| 1223 | # The `tn' commands are not necessary, but make the thing
|
|---|
| 1224 | # faster
|
|---|
| 1225 |
|
|---|
| 1226 | s/^\(_*\)$/1\1/; tn
|
|---|
| 1227 | s/8\(_*\)$/9\1/; tn
|
|---|
| 1228 | s/7\(_*\)$/8\1/; tn
|
|---|
| 1229 | s/6\(_*\)$/7\1/; tn
|
|---|
| 1230 | s/5\(_*\)$/6\1/; tn
|
|---|
| 1231 | s/4\(_*\)$/5\1/; tn
|
|---|
| 1232 | s/3\(_*\)$/4\1/; tn
|
|---|
| 1233 | s/2\(_*\)$/3\1/; tn
|
|---|
| 1234 | s/1\(_*\)$/2\1/; tn
|
|---|
| 1235 | s/0\(_*\)$/1\1/; tn
|
|---|
| 1236 |
|
|---|
| 1237 | :n
|
|---|
| 1238 | y/_/0/
|
|---|
| 1239 |
|
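As a usage sketch, the script can be held in a shell variable and applied to a few invented inputs.  This assumes GNU `sed' and a POSIX shell; the names `incr' and `result' are hypothetical, and the script body is the one shown above with its comments removed.

```shell
# The increment script, one command per line.
incr='/[^0-9]/ d
:d
s/9\(_*\)$/_\1/
td
s/^\(_*\)$/1\1/; tn
s/8\(_*\)$/9\1/; tn
s/7\(_*\)$/8\1/; tn
s/6\(_*\)$/7\1/; tn
s/5\(_*\)$/6\1/; tn
s/4\(_*\)$/5\1/; tn
s/3\(_*\)$/4\1/; tn
s/2\(_*\)$/3\1/; tn
s/1\(_*\)$/2\1/; tn
s/0\(_*\)$/1\1/; tn
:n
y/_/0/'

# 41 needs no carry, 199 carries twice, 999 gains a digit, and the
# non-numeric line is deleted by the first command.
result=$(printf '41\n199\n999\nabc\n' | sed "$incr")
printf '%s\n' "$result"
```

For `199', the loop at `:d' first turns the number into `1__', the `s' command for `1' then yields `2__', and the final `y' command restores the zeros.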
|---|
| 1240 | ---------- Footnotes ----------
|
|---|
| 1241 |
|
|---|
| 1242 | (1) `sed' guru Greg Ubben wrote an implementation of the `dc' RPN
|
|---|
| 1243 | calculator! It is distributed together with sed.
|
|---|
| 1244 |
|
|---|
| 1245 |
|
|---|
| 1246 | File: sed.info, Node: Rename files to lower case, Next: Print bash environment, Prev: Increment a number, Up: Examples
|
|---|
| 1247 |
|
|---|
| 1248 | Rename Files to Lower Case
|
|---|
| 1249 | ==========================
|
|---|
| 1250 |
|
|---|
| 1251 | This is a pretty strange use of `sed'. We transform text, and
|
|---|
| 1252 | transform it to be shell commands, then just feed them to shell. Don't
|
|---|
| 1253 | worry, even worse hacks are done when using `sed'; I have seen a script
|
|---|
| 1254 | converting the output of `date' into a `bc' program!
|
|---|
| 1255 |
|
|---|
| 1256 | The main body of this is the `sed' script, which remaps the name
|
|---|
| 1257 | from lower to upper (or vice-versa) and even checks whether the remapped
|
|---|
| 1258 | name is the same as the original name. Note how the script is
|
|---|
| 1259 | parameterized using shell variables and proper quoting.
|
|---|
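The quoting technique mentioned here can be shown in isolation. This is a small sketch, assuming a POSIX shell; the short FROM and TO values are made up for illustration. The single-quoted `sed' script is closed, the shell variable is expanded inside double quotes, and the single quote is opened again.

```shell
#!/bin/sh
# Illustrative values; the real script uses the full alphabets.
FROM='abc'
TO='ABC'

# The script text seen by sed is:  y/abc/ABC/
echo 'a cab' | sed 'y/'"$FROM"'/'"$TO"'/'    # prints: A CAB
```

Leaving the expansion unquoted, as in `y/'$FROM'/'$TO'/`, also works as long as the values contain no whitespace or glob characters, which holds for plain alphabets.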
| 1260 |
|
|---|
| 1261 | #! /bin/sh
|
|---|
| 1262 | # rename files to lower/upper case...
|
|---|
| 1263 | #
|
|---|
| 1264 | # usage:
|
|---|
| 1265 | # move-to-lower *
|
|---|
| 1266 | # move-to-upper *
|
|---|
| 1267 | # or
|
|---|
| 1268 | # move-to-lower -R .
|
|---|
| 1269 | # move-to-upper -R .
|
|---|
| 1270 | #
|
|---|
| 1271 |
|
|---|
| 1272 | help()
|
|---|
| 1273 | {
|
|---|
| 1274 | cat << eof
|
|---|
| 1275 | Usage: $0 [-n] [-R] [-h] files...
|
|---|
| 1276 |
|
|---|
| 1277 | -n do nothing, only see what would be done
|
|---|
| 1278 | -R recursive (use find)
|
|---|
| 1279 | -h this message
|
|---|
| 1280 | files files to remap to lower case
|
|---|
| 1281 |
|
|---|
| 1282 | Examples:
|
|---|
| 1283 | $0 -n * (see if everything is ok, then...)
|
|---|
| 1284 | $0 *
|
|---|
| 1285 |
|
|---|
| 1286 | $0 -R .
|
|---|
| 1287 |
|
|---|
| 1288 | eof
|
|---|
| 1289 | }
|
|---|
| 1290 |
|
|---|
| 1291 | apply_cmd='sh'
|
|---|
| 1292 | finder='echo "$@" | tr " " "\n"'
|
|---|
| 1293 | files_only=
|
|---|
| 1294 |
|
|---|
| 1295 | while :
|
|---|
| 1296 | do
|
|---|
| 1297 | case "$1" in
|
|---|
| 1298 | -n) apply_cmd='cat' ;;
|
|---|
| 1299 | -R) finder='find "$@" -type f';;
|
|---|
| 1300 | -h) help ; exit 1 ;;
|
|---|
| 1301 | *) break ;;
|
|---|
| 1302 | esac
|
|---|
| 1303 | shift
|
|---|
| 1304 | done
|
|---|
| 1305 |
|
|---|
| 1306 | if [ -z "$1" ]; then
|
|---|
| 1307 | echo "Usage: $0 [-h] [-n] [-R] files..."
|
|---|
| 1308 | exit 1
|
|---|
| 1309 | fi
|
|---|
| 1310 |
|
|---|
| 1311 | LOWER='abcdefghijklmnopqrstuvwxyz'
|
|---|
| 1312 | UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
|
|---|
| 1313 |
|
|---|
| 1314 | case `basename "$0"` in
|
|---|
| 1315 | *upper*) TO=$UPPER; FROM=$LOWER ;;
|
|---|
| 1316 | *) FROM=$UPPER; TO=$LOWER ;;
|
|---|
| 1317 | esac
|
|---|
| 1318 |
|
|---|
| 1319 | eval $finder | sed -n '
|
|---|
| 1320 |
|
|---|
| 1321 | # remove all trailing slashes
|
|---|
| 1322 | s/\/*$//
|
|---|
| 1323 |
|
|---|
| 1324 | # add ./ if there is no path, only a filename
|
|---|
| 1325 | /\//! s/^/.\//
|
|---|
| 1326 |
|
|---|
| 1327 | # save path+filename
|
|---|
| 1328 | h
|
|---|
| 1329 |
|
|---|
| 1330 | # remove path
|
|---|
| 1331 | s/.*\///
|
|---|
| 1332 |
|
|---|
| 1333 | # do conversion only on filename
|
|---|
| 1334 | y/'$FROM'/'$TO'/
|
|---|
| 1335 |
|
|---|
| 1336 | # now line contains original path+file, while
|
|---|
| 1337 | # hold space contains the new filename
|
|---|
| 1338 | x
|
|---|
| 1339 |
|
|---|
| 1340 | # add converted file name to line, which now contains
|
|---|
| 1341 | # path/file-name\nconverted-file-name
|
|---|
| 1342 | G
|
|---|
| 1343 |
|
|---|
| 1344 | # check if converted file name is equal to original file name,
|
|---|
| 1345 | # if it is, print nothing
|
|---|
| 1346 | /^.*\/\(.*\)\n\1/b
|
|---|
| 1347 |
|
|---|
| 1348 | # now, transform path/fromfile\ntofile into
|
|---|
| 1349 | # mv path/fromfile path/tofile and print it
|
|---|
| 1350 | s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
|
|---|
| 1351 |
|
|---|
| 1352 | ' | $apply_cmd
|
|---|
| 1353 |
|
|---|
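To see what the `sed' script emits before anything is fed to the shell, it can be run on a few file names by hand. The sketch below assumes a POSIX shell; the function name gen_mv and the sample file names are hypothetical. It reuses the commands of the script above with the comments removed, and nothing is renamed because the output is only printed.

```shell
#!/bin/sh
# Dry run: print the mv commands instead of piping them to sh.
FROM='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
TO='abcdefghijklmnopqrstuvwxyz'

# gen_mv is a made-up name wrapping the sed script from this section.
gen_mv() {
    sed -n '
s/\/*$//
/\//! s/^/.\//
h
s/.*\///
y/'"$FROM"'/'"$TO"'/
x
G
/^.*\/\(.*\)\n\1/b
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
'
}

# Sample names, one per line; notes.txt is already lower case.
printf '%s\n' README src/Main.C notes.txt | gen_mv
# prints:
#   mv "./README" "./readme"
#   mv "src/Main.C" "src/main.c"
```

Only the last path component is converted, and a name that is already in the target case produces no output at all.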