| 1 | This is ../../doc/sed.info, produced by makeinfo version 4.5 from | 
|---|
| 2 | ../../doc/sed.texi. | 
|---|
| 3 |  | 
|---|
| 4 | INFO-DIR-SECTION Text creation and manipulation | 
|---|
| 5 | START-INFO-DIR-ENTRY | 
|---|
| 6 | * sed: (sed).                   Stream EDitor. | 
|---|
| 7 |  | 
|---|
| 8 | END-INFO-DIR-ENTRY | 
|---|
| 9 |  | 
|---|
| 10 | This file documents version 4.1.5 of GNU `sed', a stream editor. | 
|---|
| 11 |  | 
|---|
| 12 | Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software | 
|---|
| 13 | Foundation, Inc. | 
|---|
| 14 |  | 
|---|
| 15 | This document is released under the terms of the GNU Free | 
|---|
| 16 | Documentation License as published by the Free Software Foundation; | 
|---|
| 17 | either version 1.1, or (at your option) any later version. | 
|---|
| 18 |  | 
|---|
| 19 | You should have received a copy of the GNU Free Documentation | 
|---|
| 20 | License along with GNU `sed'; see the file `COPYING.DOC'.  If not, | 
|---|
| 21 | write to the Free Software Foundation, 59 Temple Place - Suite 330, | 
|---|
| 22 | Boston, MA 02110-1301, USA. | 
|---|
| 23 |  | 
|---|
| 24 | There are no Cover Texts and no Invariant Sections; this text, along | 
|---|
| 25 | with its equivalent in the printed manual, constitutes the Title Page. | 
|---|
| 26 |  | 
|---|
| 27 | File: sed.info,  Node: Top,  Next: Introduction,  Up: (dir) | 
|---|
| 28 |  | 
|---|
| 29 |  | 
|---|
| 30 |  | 
|---|
| 47 | * Menu: | 
|---|
| 48 |  | 
|---|
| 49 | * Introduction::               Introduction | 
|---|
| 50 | * Invoking sed::               Invocation | 
|---|
| 51 | * sed Programs::               `sed' programs | 
|---|
| 52 | * Examples::                   Some sample scripts | 
|---|
| 53 | * Limitations::                Limitations and (non-)limitations of GNU `sed' | 
|---|
| 54 | * Other Resources::            Other resources for learning about `sed' | 
|---|
| 55 | * Reporting Bugs::             Reporting bugs | 
|---|
| 56 |  | 
|---|
| 57 | * Extended regexps::           `egrep'-style regular expressions | 
|---|
| 58 |  | 
|---|
| 59 | * Concept Index::              A menu with all the topics in this manual. | 
|---|
| 60 | * Command and Option Index::   A menu with all `sed' commands and | 
|---|
| 61 | command-line options. | 
|---|
| 62 |  | 
|---|
| 63 | --- The detailed node listing --- | 
|---|
| 64 |  | 
|---|
| 65 | sed Programs: | 
|---|
| 66 | * Execution Cycle::                 How `sed' works | 
|---|
| 67 | * Addresses::                       Selecting lines with `sed' | 
|---|
| 68 | * Regular Expressions::             Overview of regular expression syntax | 
|---|
| 69 | * Common Commands::                 Often used commands | 
|---|
| 70 | * The "s" Command::                 `sed''s Swiss Army Knife | 
|---|
| 71 | * Other Commands::                  Less frequently used commands | 
|---|
| 72 | * Programming Commands::            Commands for `sed' gurus | 
|---|
| 73 | * Extended Commands::               Commands specific to GNU `sed' | 
|---|
| 74 | * Escapes::                         Specifying special characters | 
|---|
| 75 |  | 
|---|
| 76 | Examples: | 
|---|
| 77 | * Centering lines:: | 
|---|
| 78 | * Increment a number:: | 
|---|
| 79 | * Rename files to lower case:: | 
|---|
| 80 | * Print bash environment:: | 
|---|
| 81 | * Reverse chars of lines:: | 
|---|
| 82 | * tac::                             Reverse lines of files | 
|---|
| 83 | * cat -n::                          Numbering lines | 
|---|
| 84 | * cat -b::                          Numbering non-blank lines | 
|---|
| 85 | * wc -c::                           Counting chars | 
|---|
| 86 | * wc -w::                           Counting words | 
|---|
| 87 | * wc -l::                           Counting lines | 
|---|
| 88 | * head::                            Printing the first lines | 
|---|
| 89 | * tail::                            Printing the last lines | 
|---|
| 90 | * uniq::                            Make duplicate lines unique | 
|---|
| 91 | * uniq -d::                         Print duplicated lines of input | 
|---|
| 92 | * uniq -u::                         Remove all duplicated lines | 
|---|
| 93 | * cat -s::                          Squeezing blank lines | 
|---|
| 94 |  | 
|---|
| 95 |  | 
|---|
| 96 | File: sed.info,  Node: Introduction,  Next: Invoking sed,  Prev: Top,  Up: Top | 
|---|
| 97 |  | 
|---|
| 98 | Introduction | 
|---|
| 99 | ************ | 
|---|
| 100 |  | 
|---|
| 101 | `sed' is a stream editor.  A stream editor is used to perform basic | 
|---|
| 102 | text transformations on an input stream (a file or input from a | 
|---|
| 103 | pipeline).  While in some ways similar to an editor which permits | 
|---|
| 104 | scripted edits (such as `ed'), `sed' works by making only one pass over | 
|---|
| 105 | the input(s), and is consequently more efficient.  But it is `sed''s | 
|---|
| 106 | ability to filter text in a pipeline which particularly distinguishes | 
|---|
| 107 | it from other types of editors. | 
|---|
| 108 |  | 
|---|
| 109 |  | 
|---|
| 110 | File: sed.info,  Node: Invoking sed,  Next: sed Programs,  Prev: Introduction,  Up: Top | 
|---|
| 111 |  | 
|---|
| 112 | Invocation | 
|---|
| 113 | ********** | 
|---|
| 114 |  | 
|---|
| 115 | Normally `sed' is invoked like this: | 
|---|
| 116 |  | 
|---|
| 117 | sed SCRIPT INPUTFILE... | 
|---|
| 118 |  | 
|---|
| 119 | The full format for invoking `sed' is: | 
|---|
| 120 |  | 
|---|
| 121 | sed OPTIONS... [SCRIPT] [INPUTFILE...] | 
|---|
| 122 |  | 
|---|
| 123 | If you do not specify INPUTFILE, or if INPUTFILE is `-', `sed' | 
|---|
| 124 | filters the contents of the standard input.  The SCRIPT is actually the | 
|---|
| 125 | first non-option parameter, which `sed' specially considers a script | 
|---|
| 126 | and not an input file if (and only if) none of the other OPTIONS | 
|---|
| 127 | specifies a script to be executed, that is if neither of the `-e' and | 
|---|
| 128 | `-f' options is specified. | 
|---|
| 129 |  | 
|---|
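As a minimal sketch of the rule above (the sample text is made up for
illustration): with no `-e' or `-f' option the first non-option argument
is the script; once `-e' is given, every non-option argument is an input
file, and `-' names the standard input.

```shell
# Script given as the first non-option argument; input comes from stdin.
printf 'hello world\n' | sed 's/hello/goodbye/'
# prints: goodbye world

# With -e the script comes from the option, so `-' is an input file name
# that stands for the standard input.
printf 'hello world\n' | sed -e 's/world/there/' -
# prints: hello there
```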
| 130 | `sed' may be invoked with the following command-line options: | 
|---|
| 131 |  | 
|---|
| 132 | `--version' | 
|---|
| 133 | Print out the version of `sed' that is being run and a copyright | 
|---|
| 134 | notice, then exit. | 
|---|
| 135 |  | 
|---|
| 136 | `--help' | 
|---|
| 137 | Print a usage message briefly summarizing these command-line | 
|---|
| 138 | options and the bug-reporting address, then exit. | 
|---|
| 139 |  | 
|---|
| 140 | `-n' | 
|---|
| 141 | `--quiet' | 
|---|
| 142 | `--silent' | 
|---|
| 143 | By default, `sed' prints out the pattern space at the end of each | 
|---|
| 144 | cycle through the script.  These options disable this automatic | 
|---|
| 145 | printing, and `sed' only produces output when explicitly told to | 
|---|
| 146 | via the `p' command. | 
|---|
| 147 |  | 
|---|
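The effect of `-n' is easiest to see next to the default behavior.  The
three-line input below is only an illustration.

```shell
# Without -n every line is printed automatically, and `p' prints line 2 again.
printf 'one\ntwo\nthree\n' | sed '2p'
# prints: one, two, two, three (one per line)

# With -n only the explicit `p' produces output.
printf 'one\ntwo\nthree\n' | sed -n '2p'
# prints: two
```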
| 148 | `-i[SUFFIX]' | 
|---|
| 149 | `--in-place[=SUFFIX]' | 
|---|
| 150 | This option specifies that files are to be edited in-place.  GNU | 
|---|
| 151 | `sed' does this by creating a temporary file and sending output to | 
|---|
| 152 | this file rather than to the standard output.(1) | 
|---|
| 153 |  | 
|---|
| 154 | This option implies `-s'. | 
|---|
| 155 |  | 
|---|
| 156 | When the end of the file is reached, the temporary file is renamed | 
|---|
| 157 | to the output file's original name.  The extension, if supplied, | 
|---|
| 158 | is used to modify the name of the old file before renaming the | 
|---|
| 159 | temporary file, thereby making a backup copy.(2) | 
|---|
| 160 |  | 
|---|
| 161 | This rule is followed: if the extension doesn't contain a `*', | 
|---|
| 162 | then it is appended to the end of the current filename as a | 
|---|
| 163 | suffix; if the extension does contain one or more `*' characters, | 
|---|
| 164 | then _each_ asterisk is replaced with the current filename.  This | 
|---|
| 165 | allows you to add a prefix to the backup file, instead of (or in | 
|---|
| 166 | addition to) a suffix, or even to place backup copies of the | 
|---|
| 167 | original files into another directory (provided the directory | 
|---|
| 168 | already exists). | 
|---|
| 169 |  | 
|---|
| 170 | If no extension is supplied, the original file is overwritten | 
|---|
| 171 | without making a backup. | 
|---|
| 172 |  | 
|---|
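The backup-naming rule can be tried out in a scratch directory.  This is
a sketch that assumes GNU `sed' and a `mktemp' that supports `-d'; the
file names `notes.txt' and `old_*' are hypothetical.

```shell
# Work in a scratch directory so no real files are touched.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'color\n' > notes.txt

# Suffix without `*': the original is kept as notes.txt.bak.
sed -i.bak 's/color/colour/' notes.txt
cat notes.txt       # colour
cat notes.txt.bak   # color

# Suffix containing `*': each asterisk is replaced by the file name,
# so the backup becomes old_notes.txt.
sed -i'old_*' 's/colour/COLOUR/' notes.txt
cat notes.txt       # COLOUR
cat old_notes.txt   # colour
```

Note that the suffix must be attached directly to `-i'; a separate
argument would be taken as the script or as an input file.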
| 173 | `-l N' | 
|---|
| 174 | `--line-length=N' | 
|---|
| 175 | Specify the default line-wrap length for the `l' command.  A | 
|---|
| 176 | length of 0 (zero) means to never wrap long lines.  If not | 
|---|
| 177 | specified, it is taken to be 70. | 
|---|
| 178 |  | 
|---|
| 179 | `--posix' | 
|---|
| 180 | GNU `sed' includes several extensions to POSIX sed.  In order to | 
|---|
| 181 | simplify writing portable scripts, this option disables all the | 
|---|
| 182 | extensions that this manual documents, including additional | 
|---|
| 183 | commands.  Most of the extensions accept `sed' programs that are | 
|---|
| 184 | outside the syntax mandated by POSIX, but some of them (such as | 
|---|
| 185 | the behavior of the `N' command described in *note Reporting | 
|---|
| 186 | Bugs::) actually violate the standard.  If you want to disable | 
|---|
| 187 | only the latter kind of extension, you can set the | 
|---|
| 188 | `POSIXLY_CORRECT' variable to a non-empty value. | 
|---|
| 189 |  | 
|---|
| 190 | `-r' | 
|---|
| 191 | `--regexp-extended' | 
|---|
| 192 | Use extended regular expressions rather than basic regular | 
|---|
| 193 | expressions.  Extended regexps are those that `egrep' accepts; | 
|---|
| 194 | they can be clearer because they usually have fewer backslashes, | 
|---|
| 195 | but are a GNU extension and hence scripts that use them are not | 
|---|
| 196 | portable.  *Note Extended regular expressions: Extended regexps. | 
|---|
| 197 |  | 
|---|
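For comparison, here is one substitution written both ways.  The input
string is arbitrary, and both commands assume GNU `sed', since `\|' and
`\+' in basic regular expressions are GNU extensions as well.

```shell
# Basic regular expression: grouping, alternation and `+' need backslashes.
echo 'aabbc' | sed 's/\(a\|b\)\+/X/'
# prints: Xc

# The same substitution with -r (extended regular expressions).
echo 'aabbc' | sed -r 's/(a|b)+/X/'
# prints: Xc
```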
| 198 | `-s' | 
|---|
| 199 | `--separate' | 
|---|
| 200 | By default, `sed' will consider the files specified on the command | 
|---|
| 201 | line as a single continuous long stream.  This GNU `sed' extension | 
|---|
| 202 | allows the user to consider them as separate files: range | 
|---|
| 203 | addresses (such as `/abc/,/def/') are not allowed to span several | 
|---|
| 204 | files, line numbers are relative to the start of each file, `$' | 
|---|
| 205 | refers to the last line of each file, and files invoked from the | 
|---|
| 206 | `R' commands are rewound at the start of each file. | 
|---|
| 207 |  | 
|---|
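A small sketch of the difference `-s' makes, using two hypothetical
two-line files created in a scratch directory:

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
printf 'a1\na2\n' > a.txt
printf 'b1\nb2\n' > b.txt

# Default: the files form one stream, so `$' is the last line of b.txt only.
sed -n '$p' a.txt b.txt
# prints: b2

# With -s, `$' matches the last line of each file.
sed -s -n '$p' a.txt b.txt
# prints: a2 and b2

# With -s, line numbers restart for each file.
sed -s -n '1p' a.txt b.txt
# prints: a1 and b1
```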
| 208 | `-u' | 
|---|
| 209 | `--unbuffered' | 
|---|
| 210 | Buffer both input and output as minimally as practical.  (This is | 
|---|
| 211 | particularly useful if the input is coming from the likes of `tail | 
|---|
| 212 | -f', and you wish to see the transformed output as soon as | 
|---|
| 213 | possible.) | 
|---|
| 214 |  | 
|---|
| 215 | `-e SCRIPT' | 
|---|
| 216 | `--expression=SCRIPT' | 
|---|
| 217 | Add the commands in SCRIPT to the set of commands to be run while | 
|---|
| 218 | processing the input. | 
|---|
| 219 |  | 
|---|
| 220 | `-f SCRIPT-FILE' | 
|---|
| 221 | `--file=SCRIPT-FILE' | 
|---|
| 222 | Add the commands contained in the file SCRIPT-FILE to the set of | 
|---|
| 223 | commands to be run while processing the input. | 
|---|
| 224 |  | 
|---|
| 225 |  | 
|---|
| 226 | If no `-e', `-f', `--expression', or `--file' options are given on | 
|---|
| 227 | the command-line, then the first non-option argument on the command | 
|---|
| 228 | line is taken to be the SCRIPT to be executed. | 
|---|
| 229 |  | 
|---|
| 230 | If any command-line parameters remain after processing the above, | 
|---|
| 231 | these parameters are interpreted as the names of input files to be | 
|---|
| 232 | processed.  A file name of `-' refers to the standard input stream. | 
|---|
| 233 | The standard input will be processed if no file names are specified. | 
|---|
| 234 |  | 
|---|
| 235 | ---------- Footnotes ---------- | 
|---|
| 236 |  | 
|---|
| 237 | (1) This applies to commands such as `=', `a', `c', `i', `l', `p'. | 
|---|
| 238 | You can still write to the standard output by using the `w' or `W' | 
|---|
| 239 | commands together with the `/dev/stdout' special file. | 
|---|
| 240 |  | 
|---|
| 241 | (2) Note that GNU `sed' creates the backup file whether or not | 
|---|
| 242 | any output is actually changed. | 
|---|
| 243 |  | 
|---|
| 244 |  | 
|---|
| 245 | File: sed.info,  Node: sed Programs,  Next: Examples,  Prev: Invoking sed,  Up: Top | 
|---|
| 246 |  | 
|---|
| 247 | `sed' Programs | 
|---|
| 248 | ************** | 
|---|
| 249 |  | 
|---|
| 250 | A `sed' program consists of one or more `sed' commands, passed in by | 
|---|
| 251 | one or more of the `-e', `-f', `--expression', and `--file' options, or | 
|---|
| 252 | the first non-option argument if zero of these options are used.  This | 
|---|
| 253 | document will refer to "the" `sed' script; this is understood to mean | 
|---|
| 254 | the in-order catenation of all of the SCRIPTs and SCRIPT-FILEs passed | 
|---|
| 255 | in. | 
|---|
| 256 |  | 
|---|
| 257 | Each `sed' command consists of an optional address or address range, | 
|---|
| 258 | followed by a one-character command name and any additional | 
|---|
| 259 | command-specific code. | 
|---|
| 260 |  | 
|---|
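The in-order catenation can be observed with a script whose result
depends on the order of its commands.  The script file name
`upper-b.sed' is hypothetical.

```shell
dir=$(mktemp -d) || exit 1
# A script file holding a single command.
printf 's/b/B/\n' > "$dir/upper-b.sed"

# The -e script runs first, then the commands from the -f script file.
echo 'abc' | sed -e 's/a/b/' -f "$dir/upper-b.sed"
# prints: Bbc

# Swapping the options swaps the order of the commands.
echo 'abc' | sed -f "$dir/upper-b.sed" -e 's/a/b/'
# prints: bBc
```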
| 261 | * Menu: | 
|---|
| 262 |  | 
|---|
| 263 | * Execution Cycle::          How `sed' works | 
|---|
| 264 | * Addresses::                Selecting lines with `sed' | 
|---|
| 265 | * Regular Expressions::      Overview of regular expression syntax | 
|---|
| 266 | * Common Commands::          Often used commands | 
|---|
| 267 | * The "s" Command::          `sed''s Swiss Army Knife | 
|---|
| 268 | * Other Commands::           Less frequently used commands | 
|---|
| 269 | * Programming Commands::     Commands for `sed' gurus | 
|---|
| 270 | * Extended Commands::        Commands specific to GNU `sed' | 
|---|
| 271 | * Escapes::                  Specifying special characters | 
|---|
| 272 |  | 
|---|
| 273 |  | 
|---|
| 274 | File: sed.info,  Node: Execution Cycle,  Next: Addresses,  Up: sed Programs | 
|---|
| 275 |  | 
|---|
| 276 | How `sed' Works | 
|---|
| 277 | =============== | 
|---|
| 278 |  | 
|---|
| 279 | `sed' maintains two data buffers: the active _pattern_ space, and | 
|---|
| 280 | the auxiliary _hold_ space. Both are initially empty. | 
|---|
| 281 |  | 
|---|
| 282 | `sed' operates by performing the following cycle on each line of | 
|---|
| 283 | input: first, `sed' reads one line from the input stream, removes any | 
|---|
| 284 | trailing newline, and places it in the pattern space.  Then commands | 
|---|
| 285 | are executed; each command can have an address associated with it: | 
|---|
| 286 | addresses are a kind of condition code, and a command is only executed | 
|---|
| 287 | if the condition is verified before the command is to be executed. | 
|---|
| 288 |  | 
|---|
| 289 | When the end of the script is reached, unless the `-n' option is in | 
|---|
| 290 | use, the contents of pattern space are printed out to the output | 
|---|
| 291 | stream, adding back the trailing newline if it was removed.(1) Then the | 
|---|
| 292 | next cycle starts for the next input line. | 
|---|
| 293 |  | 
|---|
| 294 | Unless special commands (like `D') are used, the pattern space is | 
|---|
| 295 | deleted between two cycles. The hold space, on the other hand, keeps | 
|---|
| 296 | its data between cycles (see commands `h', `H', `x', `g', `G' to move | 
|---|
| 297 | data between both buffers). | 
|---|
| 298 |  | 
|---|
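One way to see the two buffers at work is a script that reverses the
order of its input lines by accumulating them in the hold space.  It is
shown here only as an illustration of the cycle described above.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space to the hold space
# $p   on the last line, print the accumulated text
printf 'a\nb\nc\n' | sed -n '1!G;h;$p'
# prints: c, b, a (one per line)
```

Because the hold space survives between cycles, each new line ends up in
front of everything read so far.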
| 299 | ---------- Footnotes ---------- | 
|---|
| 300 |  | 
|---|
| 301 | (1) Actually, if `sed' prints a line without the terminating | 
|---|
| 302 | newline, it will nevertheless print the missing newline as soon as | 
|---|
| 303 | more text is sent to the same output stream, which gives the "least | 
|---|
| 304 | expected surprise" even though it does not make commands like `sed -n | 
|---|
| 305 | p' exactly identical to `cat'. | 
|---|
| 306 |  | 
|---|
| 307 |  | 
|---|
| 308 | File: sed.info,  Node: Addresses,  Next: Regular Expressions,  Prev: Execution Cycle,  Up: sed Programs | 
|---|
| 309 |  | 
|---|
| 310 | Selecting lines with `sed' | 
|---|
| 311 | ========================== | 
|---|
| 312 |  | 
|---|
| 313 | Addresses in a `sed' script can be in any of the following forms: | 
|---|
| 314 | `NUMBER' | 
|---|
| 315 | Specifying a line number will match only that line in the input. | 
|---|
| 316 | (Note that `sed' counts lines continuously across all input files | 
|---|
| 317 | unless `-i' or `-s' options are specified.) | 
|---|
| 318 |  | 
|---|
| 319 | `FIRST~STEP' | 
|---|
| 320 | This GNU extension matches every STEPth line starting with line | 
|---|
| 321 | FIRST.  In particular, lines will be selected when there exists a | 
|---|
| 322 | non-negative N such that the current line-number equals FIRST + (N | 
|---|
| 323 | * STEP).  Thus, to select the odd-numbered lines, one would use | 
|---|
| 324 | `1~2'; to pick every third line starting with the second, `2~3' | 
|---|
| 325 | would be used; to pick every fifth line starting with the tenth, | 
|---|
| 326 | use `10~5'; and `50~0' is just an obscure way of saying `50'. | 
|---|
| 327 |  | 
|---|
| 328 | `$' | 
|---|
| 329 | This address matches the last line of the last file of input, or | 
|---|
| 330 | the last line of each file when the `-i' or `-s' options are | 
|---|
| 331 | specified. | 
|---|
| 332 |  | 
|---|
| 333 | `/REGEXP/' | 
|---|
| 334 | This will select any line which matches the regular expression | 
|---|
| 335 | REGEXP.  If REGEXP itself includes any `/' characters, each must | 
|---|
| 336 | be escaped by a backslash (`\'). | 
|---|
| 337 |  | 
|---|
| 338 | The empty regular expression `//' repeats the last regular | 
|---|
| 339 | expression match (the same holds if the empty regular expression is | 
|---|
| 340 | passed to the `s' command).  Note that modifiers to regular | 
|---|
| 341 | expressions are evaluated when the regular expression is compiled, | 
|---|
| 342 | thus it is invalid to specify them together with the empty regular | 
|---|
| 343 | expression. | 
|---|
| 344 |  | 
|---|
| 345 | `\%REGEXP%' | 
|---|
| 346 | (The `%' may be replaced by any other single character.) | 
|---|
| 347 |  | 
|---|
| 348 | This also matches the regular expression REGEXP, but allows one to | 
|---|
| 349 | use a different delimiter than `/'.  This is particularly useful | 
|---|
| 350 | if the REGEXP itself contains a lot of slashes, since it avoids | 
|---|
| 351 | the tedious escaping of every `/'.  If REGEXP itself includes any | 
|---|
| 352 | delimiter characters, each must be escaped by a backslash (`\'). | 
|---|
| 353 |  | 
|---|
| 354 | `/REGEXP/I' | 
|---|
| 355 | `\%REGEXP%I' | 
|---|
| 356 | The `I' modifier to regular-expression matching is a GNU extension | 
|---|
| 357 | which causes the REGEXP to be matched in a case-insensitive manner. | 
|---|
| 358 |  | 
|---|
| 359 | `/REGEXP/M' | 
|---|
| 360 | `\%REGEXP%M' | 
|---|
| 361 | The `M' modifier to regular-expression matching is a GNU `sed' | 
|---|
| 362 | extension which causes `^' and `$' to match respectively (in | 
|---|
| 363 | addition to the normal behavior) the empty string after a newline, | 
|---|
| 364 | and the empty string before a newline.  There are special character | 
|---|
| 365 | sequences (`\`' and `\'') which always match the beginning or the | 
|---|
| 366 | end of the buffer.  `M' stands for `multi-line'. | 
|---|
| 367 |  | 
|---|
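The address forms listed above can be tried on a short numbered input.
This sketch assumes GNU `sed' for the `FIRST~STEP' and `I' forms; the
helper function `nums' and the sample lines are made up for the example.

```shell
nums() { printf '%s\n' 1 2 3 4 5 6 7 8 9 10; }

nums | sed -n '3p'      # NUMBER: prints 3
nums | sed -n '1~2p'    # FIRST~STEP: prints 1 3 5 7 9
nums | sed -n '2~3p'    # prints 2 5 8
nums | sed -n '$p'      # last line: prints 10

# A custom delimiter avoids escaping the slashes in a path.
printf '/usr/bin\n/tmp\n' | sed -n '\%^/usr/%p'    # prints /usr/bin

# The I modifier makes the match case-insensitive.
printf 'Error\nok\n' | sed -n '/error/Ip'          # prints Error
```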
| 368 |  | 
|---|
| 369 | If no addresses are given, then all lines are matched; if one | 
|---|
| 370 | address is given, then only lines matching that address are matched. | 
|---|
| 371 |  | 
|---|
| 372 | An address range can be specified by specifying two addresses | 
|---|
| 373 | separated by a comma (`,').  An address range matches lines starting | 
|---|
| 374 | from where the first address matches, and continues until the second | 
|---|
| 375 | address matches (inclusively). | 
|---|
| 376 |  | 
|---|
| 377 | If the second address is a REGEXP, then checking for the ending | 
|---|
| 378 | match will start with the line _following_ the line which matched the | 
|---|
| 379 | first address: a range will always span at least two lines (except of | 
|---|
| 380 | course if the input stream ends). | 
|---|
| 381 |  | 
|---|
| 382 | If the second address is a NUMBER less than (or equal to) the line | 
|---|
| 383 | matching the first address, then only the one line is matched. | 
|---|
| 384 |  | 
|---|
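The range rules above, sketched on a six-line input (the helper `nums'
is hypothetical):

```shell
nums() { printf '%s\n' 1 2 3 4 5 6; }

nums | sed -n '2,4p'       # prints 2 3 4
nums | sed -n '/3/,/5/p'   # prints 3 4 5
# Second address is a NUMBER not greater than the first: one line only.
nums | sed -n '4,2p'       # prints 4
# The end regexp is first checked on the line after the start,
# so the range below runs from line 2 to the end of input.
nums | sed -n '/2/,/2/p'   # prints 2 3 4 5 6
```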
| 385 | GNU `sed' also supports some special two-address forms; all these | 
|---|
| 386 | are GNU extensions: | 
|---|
| 387 | `0,/REGEXP/' | 
|---|
| 388 | A line number of `0' can be used in an address specification like | 
|---|
| 389 | `0,/REGEXP/' so that `sed' will try to match REGEXP in the first | 
|---|
| 390 | input line too.  In other words, `0,/REGEXP/' is similar to | 
|---|
| 391 | `1,/REGEXP/', except that if ADDR2 matches the very first line of | 
|---|
| 392 | input the `0,/REGEXP/' form will consider it to end the range, | 
|---|
| 393 | whereas the `1,/REGEXP/' form will match the beginning of its | 
|---|
| 394 | range and hence make the range span up to the _second_ occurrence | 
|---|
| 395 | of the regular expression. | 
|---|
| 396 |  | 
|---|
| 397 | Note that this is the only place where the `0' address makes | 
|---|
| 398 | sense; there is no 0-th line and commands which are given the `0' | 
|---|
| 399 | address in any other way will give an error. | 
|---|
| 400 |  | 
|---|
| 401 | `ADDR1,+N' | 
|---|
| 402 | Matches ADDR1 and the N lines following ADDR1. | 
|---|
| 403 |  | 
|---|
| 404 | `ADDR1,~N' | 
|---|
| 405 | Matches ADDR1 and the lines following ADDR1 until the next line | 
|---|
| 406 | whose input line number is a multiple of N. | 
|---|
| 407 |  | 
|---|
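A sketch of the three GNU two-address forms, assuming GNU `sed'.  The
helper `words' and its four lines are invented so that the regexp
matches on the very first line.

```shell
words() { printf 'foo\nbar\nfoo\nbaz\n'; }

# 0,/foo/ may end on line 1, so only that line is deleted.
words | sed '0,/foo/d'     # prints bar foo baz
# 1,/foo/ looks for the end from line 2 on, so lines 1-3 are deleted.
words | sed '1,/foo/d'     # prints baz

words | sed -n '2,+1p'     # ADDR1,+N: prints bar foo
# ADDR1,~N: from line 1 up to the next line number that is a multiple of 3.
words | sed -n '1,~3p'     # prints foo bar foo
```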
| 408 | Appending the `!' character to the end of an address specification | 
|---|
| 409 | negates the sense of the match.  That is, if the `!' character follows | 
|---|
| 410 | an address range, then only lines which do _not_ match the address range | 
|---|
| 411 | will be selected.  This also works for singleton addresses, and, | 
|---|
| 412 | perhaps perversely, for the null address. | 
|---|
| 413 |  | 
|---|
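Two short illustrations of `!', one with a range and one with a single
regexp address (the inputs are arbitrary):

```shell
nums() { printf '%s\n' 1 2 3 4 5; }

# Delete every line that is NOT in the range 2,3.
nums | sed '2,3!d'                           # prints 2 3
# Print every line that does NOT start with `#'.
printf '# note\ncode\n' | sed -n '/^#/!p'    # prints code
```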
| 414 |  | 
|---|
| 415 | File: sed.info,  Node: Regular Expressions,  Next: Common Commands,  Prev: Addresses,  Up: sed Programs | 
|---|
| 416 |  | 
|---|
| 417 | Overview of Regular Expression Syntax | 
|---|
| 418 | ===================================== | 
|---|
| 419 |  | 
|---|
| 420 | To know how to use `sed', people should understand regular | 
|---|
| 421 | expressions ("regexp" for short).  A regular expression is a pattern | 
|---|
| 422 | that is matched against a subject string from left to right.  Most | 
|---|
| 423 | characters are "ordinary": they stand for themselves in a pattern, and | 
|---|
| 424 | match the corresponding characters in the subject.  As a trivial | 
|---|
| 425 | example, the pattern | 
|---|
| 426 |  | 
|---|
| 427 | The quick brown fox | 
|---|
| 428 |  | 
|---|
| 429 | matches a portion of a subject string that is identical to itself.  The | 
|---|
| 430 | power of regular expressions comes from the ability to include | 
|---|
| 431 | alternatives and repetitions in the pattern.  These are encoded in the | 
|---|
| 432 | pattern by the use of "special characters", which do not stand for | 
|---|
| 433 | themselves but instead are interpreted in some special way.  Here is a | 
|---|
| 434 | brief description of regular expression syntax as used in `sed'. | 
|---|
| 435 |  | 
|---|
| 436 | `CHAR' | 
|---|
| 437 | A single ordinary character matches itself. | 
|---|
| 438 |  | 
|---|
| 439 | `*' | 
|---|
| 440 | Matches a sequence of zero or more instances of matches for the | 
|---|
| 441 | preceding regular expression, which must be an ordinary character, | 
|---|
| 442 | a special character preceded by `\', a `.', a grouped regexp (see | 
|---|
| 443 | below), or a bracket expression.  As a GNU extension, a postfixed | 
|---|
| 444 | regular expression can also be followed by `*'; for example, `a**' | 
|---|
| 445 | is equivalent to `a*'.  POSIX 1003.1-2001 says that `*' stands for | 
|---|
| 446 | itself when it appears at the start of a regular expression or | 
|---|
| 447 | subexpression, but many non-GNU implementations do not support this | 
|---|
| 448 | and portable scripts should instead use `\*' in these contexts. | 
|---|
| 449 |  | 
|---|
| 450 | `\+' | 
|---|
| 451 | As `*', but matches one or more.  It is a GNU extension. | 
|---|
| 452 |  | 
|---|
| 453 | `\?' | 
|---|
| 454 | As `*', but only matches zero or one.  It is a GNU extension. | 
|---|
| 455 |  | 
|---|
| 456 | `\{I\}' | 
|---|
| 457 | As `*', but matches exactly I sequences (I is a decimal integer; | 
|---|
| 458 | for portability, keep it between 0 and 255 inclusive). | 
|---|
| 459 |  | 
|---|
| 460 | `\{I,J\}' | 
|---|
| 461 | Matches between I and J, inclusive, sequences. | 
|---|
| 462 |  | 
|---|
| 463 | `\{I,\}' | 
|---|
| 464 | Matches more than or equal to I sequences. | 
|---|
| 465 |  | 
|---|
| 466 | `\(REGEXP\)' | 
|---|
| 467 | Groups the inner REGEXP as a whole; this is used to: | 
|---|
| 468 |  | 
|---|
| 469 | * Apply postfix operators, like `\(abcd\)*': this will search | 
|---|
| 470 | for zero or more whole sequences of `abcd', while `abcd*' | 
|---|
| 471 | would search for `abc' followed by zero or more occurrences | 
|---|
| 472 | of `d'.  Note that support for `\(abcd\)*' is required by | 
|---|
| 473 | POSIX 1003.1-2001, but many non-GNU implementations do not | 
|---|
| 474 | support it and hence it is not universally portable. | 
|---|
| 475 |  | 
|---|
| 476 | * Use back references (see below). | 
|---|
| 477 |  | 
|---|
| 478 | `.' | 
|---|
| 479 | Matches any character, including newline. | 
|---|
| 480 |  | 
|---|
| 481 | `^' | 
|---|
| 482 | Matches the null string at beginning of line, i.e. what appears | 
|---|
| 483 | after the circumflex must appear at the beginning of line. | 
|---|
| 484 | `^#include' will match only lines where `#include' is the first | 
|---|
| 485 | thing on line--if there are spaces before, for example, the match | 
|---|
| 486 | fails.  `^' acts as a special character only at the beginning of | 
|---|
| 487 | the regular expression or subexpression (that is, after `\(' or | 
|---|
| 488 | `\|').  Portable scripts should avoid `^' at the beginning of a | 
|---|
| 489 | subexpression, though, as POSIX allows implementations that treat | 
|---|
| 490 | `^' as an ordinary character in that context. | 
|---|
| 491 |  | 
|---|
| 492 | `$' | 
|---|
| 493 | It is the same as `^', but refers to end of line.  `$' also acts | 
|---|
| 494 | as a special character only at the end of the regular expression | 
|---|
| 495 | or subexpression (that is, before `\)' or `\|'), and its use at | 
|---|
| 496 | the end of a subexpression is not portable. | 
|---|
| 497 |  | 
|---|
| 498 | `[LIST]' | 
|---|
| 499 | `[^LIST]' | 
|---|
| 500 | Matches any single character in LIST: for example, `[aeiou]' | 
|---|
| 501 | matches all vowels.  A list may include sequences like | 
|---|
| 502 | `CHAR1-CHAR2', which matches any character between (inclusive) | 
|---|
| 503 | CHAR1 and CHAR2. | 
|---|
| 504 |  | 
|---|
| 505 | A leading `^' reverses the meaning of LIST, so that it matches any | 
|---|
| 506 | single character _not_ in LIST.  To include `]' in the list, make | 
|---|
| 507 | it the first character (after the `^' if needed); to include `-' | 
|---|
| 508 | in the list, make it the first or last; to include `^' put it | 
|---|
| 509 | after the first character. | 
|---|
| 510 |  | 
|---|
| 511 | The characters `$', `*', `.', `[', and `\' are normally not | 
|---|
| 512 | special within LIST.  For example, `[\*]' matches either `\' or | 
|---|
| 513 | `*', because the `\' is not special here.  However, strings like | 
|---|
| 514 | `[.ch.]', `[=a=]', and `[:space:]' are special within LIST and | 
|---|
| 515 | represent collating symbols, equivalence classes, and character | 
|---|
| 516 | classes, respectively, and `[' is therefore special within LIST | 
|---|
| 517 | when it is followed by `.', `=', or `:'.  Also, when not in | 
|---|
| 518 | `POSIXLY_CORRECT' mode, special escapes like `\n' and `\t' are | 
|---|
| 519 | recognized within LIST.  *Note Escapes::. | 
|---|
| 520 |  | 
|---|
| 521 | `REGEXP1\|REGEXP2' | 
|---|
| 522 | Matches either REGEXP1 or REGEXP2.  Use parentheses to use complex | 
|---|
| 523 | alternative regular expressions.  The matching process tries each | 
|---|
| 524 | alternative in turn, from left to right, and the first one that | 
|---|
| 525 | succeeds is used.  It is a GNU extension. | 
|---|
| 526 |  | 
|---|
| 527 | `REGEXP1REGEXP2' | 
|---|
| 528 | Matches the concatenation of REGEXP1 and REGEXP2.  Concatenation | 
|---|
| 529 | binds more tightly than `\|', `^', and `$', but less tightly than | 
|---|
| 530 | the other regular expression operators. | 
|---|
| 531 |  | 
|---|
| 532 | `\DIGIT' | 
|---|
| 533 | Matches the DIGIT-th `\(...\)' parenthesized subexpression in the | 
|---|
| 534 | regular expression.  This is called a "back reference". | 
|---|
| 535 | Subexpressions are implicitly numbered by counting occurrences of | 
|---|
| 536 | `\(' left-to-right. | 
|---|
| 537 |  | 
|---|
| 538 | `\n' | 
|---|
| 539 | Matches the newline character. | 
|---|
| 540 |  | 
|---|
| 541 | `\CHAR' | 
|---|
| 542 | Matches CHAR, where CHAR is one of `$', `*', `.', `[', `\', or `^'. | 
|---|
| 543 | Note that the only C-like backslash sequences that you can | 
|---|
| 544 | portably assume to be interpreted are `\n' and `\\'; in particular | 
|---|
| 545 | `\t' is not portable, and matches a `t' under most implementations | 
|---|
| 546 | of `sed', rather than a tab character. | 
|---|
| 547 |  | 
|---|
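A few of the operators above in action.  This is a sketch that assumes
GNU `sed' outside `POSIXLY_CORRECT' mode; the sample lines are invented.

```shell
# \{2\}: exactly two `a's before the `b'.
printf 'ab\naab\naaab\n' | sed -n '/^a\{2\}b$/p'     # prints aab
# \(...\) with a back reference: the same character twice in a row.
printf 'hello\nworld\n' | sed -n '/\(.\)\1/p'        # prints hello
# \| alternation (a GNU extension).
printf 'cat\ndog\nbird\n' | sed -n '/cat\|dog/p'     # prints cat dog
# Inside a bracket expression the backslash is literal, so [\*] matches
# either a backslash or a star; the first two input lines are printed.
printf 'a*b\na\\b\nab\n' | sed -n '/a[\*]b/p'
```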
| 548 |  | 
|---|
| 549 | Note that the regular expression matcher is greedy, i.e., matches | 
|---|
| 550 | are attempted from left to right and, if two or more matches are | 
|---|
| 551 | possible starting at the same character, it selects the longest. | 
|---|
| 552 |  | 
|---|
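Greediness is visible as soon as a pattern could stop early.  Both
commands below are illustrations on made-up strings; the second assumes
GNU `sed' for `\+'.

```shell
# `a.*X' could stop at the first `X', but the longest match is taken.
echo 'aXbXc' | sed 's/a.*X/-/'        # prints -c
# Leftmost comes before longest: the match starts at the first `b',
# even though a longer run of `b's appears later in the line.
echo 'abbbcbbbbb' | sed 's/b\+/-/'    # prints a-cbbbbb
```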
| 553 | Examples: | 
|---|
| 554 | `abcdef' | 
|---|
| 555 | Matches `abcdef'. | 
|---|
| 556 |  | 
|---|
| 557 | `a*b' | 
|---|
| 558 | Matches zero or more `a's followed by a single `b'.  For example, | 
|---|
| 559 | `b' or `aaaaab'. | 
|---|
| 560 |  | 
|---|
| 561 | `a\?b' | 
|---|
| 562 | Matches `b' or `ab'. | 
|---|
| 563 |  | 
|---|
| 564 | `a\+b\+' | 
|---|
| 565 | Matches one or more `a's followed by one or more `b's: `ab' is the | 
|---|
| 566 | shortest possible match, but other examples are `aaaab' or | 
|---|
| 567 | `abbbbb' or `aaaaaabbbbbbb'. | 
|---|
| 568 |  | 
|---|
| 569 | `.*' | 
|---|
| 570 | `.\+' | 
|---|
| 571 | These two both match all the characters in a string; however, the | 
|---|
| 572 | first matches every string (including the empty string), while the | 
|---|
| 573 | second matches only strings containing at least one character. | 
|---|
| 574 |  | 
|---|
| 575 | `^main.*(.*)' | 
|---|
| 576 | This matches a string starting with `main', followed by an opening | 
|---|
| 577 | and closing parenthesis.  The `n', `(' and `)' need not be | 
|---|
| 578 | adjacent. | 
|---|
| 579 |  | 
|---|
| 580 | `^#' | 
|---|
| 581 | This matches a string beginning with `#'. | 
|---|
| 582 |  | 
|---|
| 583 | `\\$' | 
|---|
| 584 | This matches a string ending with a single backslash.  The regexp | 
|---|
| 585 | contains two backslashes for escaping. | 
|---|
| 586 |  | 
|---|
| 587 | `\$' | 
|---|
| 588 | Instead, this matches a string consisting of a single dollar sign, | 
|---|
| 589 | because it is escaped. | 
|---|
| 590 |  | 
|---|
| 591 | `[a-zA-Z0-9]' | 
|---|
| 592 | In the C locale, this matches any single ASCII letter or digit. | 
|---|
| 593 |  | 
|---|
| 594 | `[^ tab]\+' | 
|---|
| 595 | (Here `tab' stands for a single tab character.)  This matches a | 
|---|
| 596 | string of one or more characters, none of which is a space or a | 
|---|
| 597 | tab.  Usually this means a word. | 
|---|
| 598 |  | 
|---|
| 599 | `^\(.*\)\n\1$' | 
|---|
| 600 | This matches a string consisting of two equal substrings separated | 
|---|
| 601 | by a newline. | 
|---|
| 602 |  | 
|---|
| 603 | `.\{9\}A$' | 
|---|
| 604 | This matches nine characters followed by an `A' at the end of the string. | 
|---|
| 605 |  | 
|---|
| 606 | `^.\{15\}A' | 
|---|
| 607 | This matches the start of a string that contains 16 characters, | 
|---|
| 608 | the last of which is an `A'. | 
|---|
| 609 |  | 
|---|
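A few of the patterns above can be tried directly from a shell.  The following is a minimal sketch, assuming GNU `sed' and a POSIX shell; the function names are made up for illustration and are not part of `sed'.

```shell
# Greedy matching: `a.*b' takes the longest match that starts at the
# first `a', so everything up to the last `b' ends up inside the brackets.
greedy_demo() {
    printf 'xaabbaabx\n' | sed 's/a.*b/[&]/'
}

# Back reference: print only lines made of two equal halves
# separated by a single space.
equal_halves() {
    sed -n '/^\(.*\) \1$/p'
}

# Interval: print only lines whose fourth character is an `A'.
fourth_is_A() {
    sed -n '/^.\{3\}A/p'
}
```

For instance, `printf 'abc abc\nabc abd\n' | equal_halves' keeps only the first of the two lines.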
| 610 |  | 
|---|
| 611 |  | 
|---|
| 612 | File: sed.info,  Node: Common Commands,  Next: The "s" Command,  Prev: Regular Expressions,  Up: sed Programs | 
|---|
| 613 |  | 
|---|
| 614 | Often-Used Commands | 
|---|
| 615 | =================== | 
|---|
| 616 |  | 
|---|
| 617 | If you use `sed' at all, you will quite likely want to know these | 
|---|
| 618 | commands. | 
|---|
| 619 |  | 
|---|
| 620 | `#' | 
|---|
| 621 | [No addresses allowed.] | 
|---|
| 622 |  | 
|---|
| 623 | The `#' character begins a comment; the comment continues until | 
|---|
| 624 | the next newline. | 
|---|
| 625 |  | 
|---|
| 626 | If you are concerned about portability, be aware that some | 
|---|
| 627 | implementations of `sed' (which are not POSIX conformant) may only | 
|---|
| 628 | support a single one-line comment, and then only when the very | 
|---|
| 629 | first character of the script is a `#'. | 
|---|
| 630 |  | 
|---|
| 631 | Warning: if the first two characters of the `sed' script are `#n', | 
|---|
| 632 | then the `-n' (no-autoprint) option is forced.  If you want to put | 
|---|
| 633 | a comment in the first line of your script and that comment begins | 
|---|
| 634 | with the letter `n' and you do not want this behavior, then be | 
|---|
| 635 | sure to either use a capital `N', or place at least one space | 
|---|
| 636 | before the `n'. | 
|---|
| 637 |  | 
|---|
| 638 | `q [EXIT-CODE]' | 
|---|
| 639 | This command only accepts a single address. | 
|---|
| 640 |  | 
|---|
| 641 | Exit `sed' without processing any more commands or input.  Note | 
|---|
| 642 | that the current pattern space is printed if auto-print is not | 
|---|
| 643 | disabled with the `-n' option.  The ability to return an exit | 
|---|
| 644 | code from the `sed' script is a GNU `sed' extension. | 
|---|
| 645 |  | 
|---|
| 646 | `d' | 
|---|
| 647 | Delete the pattern space; immediately start next cycle. | 
|---|
| 648 |  | 
|---|
| 649 | `p' | 
|---|
| 650 | Print out the pattern space (to the standard output).  This | 
|---|
| 651 | command is usually only used in conjunction with the `-n' | 
|---|
| 652 | command-line option. | 
|---|
| 653 |  | 
|---|
| 654 | `n' | 
|---|
| 655 | If auto-print is not disabled, print the pattern space, then, | 
|---|
| 656 | regardless, replace the pattern space with the next line of input. | 
|---|
| 657 | If there is no more input then `sed' exits without processing any | 
|---|
| 658 | more commands. | 
|---|
| 659 |  | 
|---|
| 660 | `{ COMMANDS }' | 
|---|
| 661 | A group of commands may be enclosed between `{' and `}' characters. | 
|---|
| 662 | This is particularly useful when you want a group of commands to | 
|---|
| 663 | be triggered by a single address (or address-range) match. | 
|---|
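The commands in this section combine naturally into one-liners.  The sketch below assumes GNU `sed' (the exit code given to `q' is a GNU extension); the wrapper function names are hypothetical.

```shell
# `n' together with -n: drop every odd line, print every even one.
even_lines() { sed -n 'n;p'; }

# A group triggered by a single address: print line 2, then quit
# without reading the rest of the input.
second_line() { sed -n '2{p;q;}'; }

# `q' with an exit code: stop at the first line containing ERROR
# (that line is still auto-printed) and return status 3.
stop_on_error() { sed '/ERROR/q3'; }

# `d': delete empty lines and start the next cycle immediately.
drop_blank() { sed '/^$/d'; }
```

Because `q' prints the pattern space before exiting, `stop_on_error' still shows the offending line; the caller can then inspect the exit status.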
| 664 |  | 
|---|
| 665 |  | 
|---|
| 666 |  | 
|---|
| 667 | File: sed.info,  Node: The "s" Command,  Next: Other Commands,  Prev: Common Commands,  Up: sed Programs | 
|---|
| 668 |  | 
|---|
| 669 | The `s' Command | 
|---|
| 670 | =============== | 
|---|
| 671 |  | 
|---|
| 672 | The syntax of the `s' (as in substitute) command is | 
|---|
| 673 | `s/REGEXP/REPLACEMENT/FLAGS'.  The `/' characters may be uniformly | 
|---|
| 674 | replaced by any other single character within any given `s' command. | 
|---|
| 675 | The `/' character (or whatever other character is used in its stead) | 
|---|
| 676 | can appear in the REGEXP or REPLACEMENT only if it is preceded by a `\' | 
|---|
| 677 | character. | 
|---|
| 678 |  | 
|---|
| 679 | The `s' command is probably the most important in `sed' and has a | 
|---|
| 680 | lot of different options.  Its basic concept is simple: the `s' command | 
|---|
| 681 | attempts to match the pattern space against the supplied REGEXP; if the | 
|---|
| 682 | match is successful, then that portion of the pattern space which was | 
|---|
| 683 | matched is replaced with REPLACEMENT. | 
|---|
| 684 |  | 
|---|
| 685 | The REPLACEMENT can contain `\N' (N being a number from 1 to 9, | 
|---|
| 686 | inclusive) references, which refer to the portion of the match which is | 
|---|
| 687 | contained between the Nth `\(' and its matching `\)'.  Also, the | 
|---|
| 688 | REPLACEMENT can contain unescaped `&' characters which reference the | 
|---|
| 689 | whole matched portion of the pattern space.  Finally, as a GNU `sed' | 
|---|
| 690 | extension, you can include a special sequence made of a backslash and | 
|---|
| 691 | one of the letters `L', `l', `U', `u', or `E'.  The meaning is as | 
|---|
| 692 | follows: | 
|---|
| 693 |  | 
|---|
| 694 | `\L' | 
|---|
| 695 | Turn the replacement to lowercase until a `\U' or `\E' is found, | 
|---|
| 696 |  | 
|---|
| 697 | `\l' | 
|---|
| 698 | Turn the next character to lowercase, | 
|---|
| 699 |  | 
|---|
| 700 | `\U' | 
|---|
| 701 | Turn the replacement to uppercase until a `\L' or `\E' is found, | 
|---|
| 702 |  | 
|---|
| 703 | `\u' | 
|---|
| 704 | Turn the next character to uppercase, | 
|---|
| 705 |  | 
|---|
| 706 | `\E' | 
|---|
| 707 | Stop case conversion started by `\L' or `\U'. | 
|---|
| 708 |  | 
|---|
| 709 | To include a literal `\', `&', or newline in the final replacement, | 
|---|
| 710 | be sure to precede the desired `\', `&', or newline in the REPLACEMENT | 
|---|
| 711 | with a `\'. | 
|---|
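The replacement features described above (`\N', `&', the case-conversion sequences and the escaped `\&') can be illustrated with a short sketch.  It assumes GNU `sed', since the case conversions are GNU extensions; the function names are invented for the example.

```shell
# `\u&': uppercase the first character of each lowercase word.
capitalize_words() { sed 's/[a-z][a-z]*/\u&/g'; }

# `\U...\E': uppercase only the key of a key=value line.
upcase_key() { sed 's/^\([^=]*\)=/\U\1\E=/'; }

# `\&' inserts a literal ampersand instead of the matched text.
literal_ampersand() { sed 's/ and / \& /'; }

# `\1' and `\2': swap the two fields around a comma.
swap_pair() { sed 's/^\(.*\),\(.*\)$/\2,\1/'; }
```

For example, `printf 'path=/usr/bin\n' | upcase_key' leaves the value untouched because `\E' ends the conversion before the `='.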
| 712 |  | 
|---|
| 713 | The `s' command can be followed by zero or more of the following | 
|---|
| 714 | FLAGS: | 
|---|
| 715 |  | 
|---|
| 716 | `g' | 
|---|
| 717 | Apply the replacement to _all_ matches to the REGEXP, not just the | 
|---|
| 718 | first. | 
|---|
| 719 |  | 
|---|
| 720 | `NUMBER' | 
|---|
| 721 | Only replace the NUMBERth match of the REGEXP. | 
|---|
| 722 |  | 
|---|
| 723 | Note: the POSIX standard does not specify what should happen when | 
|---|
| 724 | you mix the `g' and NUMBER modifiers, and currently there is no | 
|---|
| 725 | widely agreed upon meaning across `sed' implementations.  For GNU | 
|---|
| 726 | `sed', the interaction is defined to be: ignore matches before the | 
|---|
| 727 | NUMBERth, and then match and replace all matches from the NUMBERth | 
|---|
| 728 | on. | 
|---|
| 729 |  | 
|---|
| 730 | `p' | 
|---|
| 731 | If the substitution was made, then print the new pattern space. | 
|---|
| 732 |  | 
|---|
| 733 | Note: when both the `p' and `e' options are specified, the | 
|---|
| 734 | relative ordering of the two produces very different results.  In | 
|---|
| 735 | general, `ep' (evaluate then print) is what you want, but | 
|---|
| 736 | operating the other way round can be useful for debugging.  For | 
|---|
| 737 | this reason, the current version of GNU `sed' interprets specially | 
|---|
| 738 | the presence of `p' options both before and after `e', printing | 
|---|
| 739 | the pattern space before and after evaluation, while in general | 
|---|
| 740 | flags for the `s' command show their effect just once.  This | 
|---|
| 741 | behavior, although documented, might change in future versions. | 
|---|
| 742 |  | 
|---|
| 743 | `w FILE-NAME' | 
|---|
| 744 | If the substitution was made, then write out the result to the | 
|---|
| 745 | named file.  As a GNU `sed' extension, two special values of | 
|---|
| 746 | FILE-NAME are supported: `/dev/stderr', which writes the result to | 
|---|
| 747 | the standard error, and `/dev/stdout', which writes to the standard | 
|---|
| 748 | output.(1) | 
|---|
| 749 |  | 
|---|
| 750 | `e' | 
|---|
| 751 | This command allows one to pipe input from a shell command into | 
|---|
| 752 | pattern space.  If a substitution was made, the command that is | 
|---|
| 753 | found in pattern space is executed and pattern space is replaced | 
|---|
| 754 | with its output.  A trailing newline is suppressed; results are | 
|---|
| 755 | undefined if the command to be executed contains a NUL character. | 
|---|
| 756 | This is a GNU `sed' extension. | 
|---|
| 757 |  | 
|---|
| 758 | `I' | 
|---|
| 759 | `i' | 
|---|
| 760 | The `I' modifier to regular-expression matching is a GNU extension | 
|---|
| 761 | which makes `sed' match REGEXP in a case-insensitive manner. | 
|---|
| 762 |  | 
|---|
| 763 | `M' | 
|---|
| 764 | `m' | 
|---|
| 765 | The `M' modifier to regular-expression matching is a GNU `sed' | 
|---|
| 766 | extension which causes `^' and `$' to match respectively (in | 
|---|
| 767 | addition to the normal behavior) the empty string after a newline, | 
|---|
| 768 | and the empty string before a newline.  There are special character | 
|---|
| 769 | sequences (`\`' and `\'') which always match the beginning or the | 
|---|
| 770 | end of the buffer.  `M' stands for `multi-line'. | 
|---|
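The effect of the flags is easiest to see side by side.  This is a small sketch assuming GNU `sed' (the combination of NUMBER with `g', and the `I' and `M' flags, behave as described above only there); the function names are hypothetical.

```shell
# NUMBER: replace only the second match on each line.
second_only() { sed 's/a/X/2'; }

# NUMBER together with `g': replace the second match and all later ones.
from_second() { sed 's/a/X/2g'; }

# `p' with -n: print only the lines on which a substitution was made.
only_changed() { sed -n 's/foo/bar/p'; }

# `I': match REGEXP case-insensitively.
ignore_case() { sed 's/hello/bye/I'; }

# `M': after joining three lines with `N', `^' also matches after
# each embedded newline, so every line receives the prefix.
prefix_each() { sed 'N;N;s/^/> /Mg'; }
```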
| 771 |  | 
|---|
| 772 |  | 
|---|
| 773 | ---------- Footnotes ---------- | 
|---|
| 774 |  | 
|---|
| 775 | (1) This is equivalent to `p' unless the `-i' option is being used. | 
|---|
| 776 |  | 
|---|
| 777 |  | 
|---|
| 778 | File: sed.info,  Node: Other Commands,  Next: Programming Commands,  Prev: The "s" Command,  Up: sed Programs | 
|---|
| 779 |  | 
|---|
| 780 | Less Frequently-Used Commands | 
|---|
| 781 | ============================= | 
|---|
| 782 |  | 
|---|
| 783 | Though perhaps less frequently used than those in the previous | 
|---|
| 784 | section, some very small yet useful `sed' scripts can be built with | 
|---|
| 785 | these commands. | 
|---|
| 786 |  | 
|---|
| 787 | `y/SOURCE-CHARS/DEST-CHARS/' | 
|---|
| 788 | (The `/' characters may be uniformly replaced by any other single | 
|---|
| 789 | character within any given `y' command.) | 
|---|
| 790 |  | 
|---|
| 791 | Transliterate any characters in the pattern space which match any | 
|---|
| 792 | of the SOURCE-CHARS with the corresponding character in DEST-CHARS. | 
|---|
| 793 |  | 
|---|
| 794 | Instances of the `/' (or whatever other character is used in its | 
|---|
| 795 | stead), `\', or newlines can appear in the SOURCE-CHARS or | 
|---|
| 796 | DEST-CHARS lists, provided that each instance is escaped by a `\'. | 
|---|
| 797 | The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same | 
|---|
| 798 | number of characters (after de-escaping). | 
|---|
| 799 |  | 
|---|
| 800 | `a\' | 
|---|
| 801 | `TEXT' | 
|---|
| 802 | As a GNU extension, this command accepts two addresses. | 
|---|
| 803 |  | 
|---|
| 804 | Queue the lines of text which follow this command (each but the | 
|---|
| 805 | last ending with a `\', which are removed from the output) to be | 
|---|
| 806 | output at the end of the current cycle, or when the next input | 
|---|
| 807 | line is read. | 
|---|
| 808 |  | 
|---|
| 809 | Escape sequences in TEXT are processed, so you should use `\\' in | 
|---|
| 810 | TEXT to print a single backslash. | 
|---|
| 811 |  | 
|---|
| 812 | As a GNU extension, if between the `a' and the newline there is | 
|---|
| 813 | other than a whitespace-`\' sequence, then the text of this line, | 
|---|
| 814 | starting at the first non-whitespace character after the `a', is | 
|---|
| 815 | taken as the first line of the TEXT block.  (This enables a | 
|---|
| 816 | simplification in scripting a one-line add.)  This extension also | 
|---|
| 817 | works with the `i' and `c' commands. | 
|---|
| 818 |  | 
|---|
| 819 | `i\' | 
|---|
| 820 | `TEXT' | 
|---|
| 821 | As a GNU extension, this command accepts two addresses. | 
|---|
| 822 |  | 
|---|
| 823 | Immediately output the lines of text which follow this command | 
|---|
| 824 | (each but the last ending with a `\', which are removed from the | 
|---|
| 825 | output). | 
|---|
| 826 |  | 
|---|
| 827 | `c\' | 
|---|
| 828 | `TEXT' | 
|---|
| 829 | Delete the lines matching the address or address-range, and output | 
|---|
| 830 | the lines of text which follow this command (each but the last | 
|---|
| 831 | ending with a `\', which are removed from the output) in place of | 
|---|
| 832 | the last line (or in place of each line, if no addresses were | 
|---|
| 833 | specified).  A new cycle is started after this command is done, | 
|---|
| 834 | since the pattern space will have been deleted. | 
|---|
| 835 |  | 
|---|
| 836 | `=' | 
|---|
| 837 | As a GNU extension, this command accepts two addresses. | 
|---|
| 838 |  | 
|---|
| 839 | Print out the current input line number (with a trailing newline). | 
|---|
| 840 |  | 
|---|
| 841 | `l N' | 
|---|
| 842 | Print the pattern space in an unambiguous form: non-printable | 
|---|
| 843 | characters (and the `\' character) are printed in C-style escaped | 
|---|
| 844 | form; long lines are split, with a trailing `\' character to | 
|---|
| 845 | indicate the split; the end of each line is marked with a `$'. | 
|---|
| 846 |  | 
|---|
| 847 | N specifies the desired line-wrap length; a length of 0 (zero) | 
|---|
| 848 | means to never wrap long lines.  If omitted, the default as | 
|---|
| 849 | specified on the command line is used.  The N parameter is a GNU | 
|---|
| 850 | `sed' extension. | 
|---|
| 851 |  | 
|---|
| 852 | `r FILENAME' | 
|---|
| 853 | As a GNU extension, this command accepts two addresses. | 
|---|
| 854 |  | 
|---|
| 855 | Queue the contents of FILENAME to be read and inserted into the | 
|---|
| 856 | output stream at the end of the current cycle, or when the next | 
|---|
| 857 | input line is read.  Note that if FILENAME cannot be read, it is | 
|---|
| 858 | treated as if it were an empty file, without any error indication. | 
|---|
| 859 |  | 
|---|
| 860 | As a GNU `sed' extension, the special value `/dev/stdin' is | 
|---|
| 861 | supported for the file name, which reads the contents of the | 
|---|
| 862 | standard input. | 
|---|
| 863 |  | 
|---|
| 864 | `w FILENAME' | 
|---|
| 865 | Write the pattern space to FILENAME.  As a GNU `sed' extension, | 
|---|
| 866 | two special values of FILENAME are supported: `/dev/stderr', | 
|---|
| 867 | which writes the result to the standard error, and `/dev/stdout', | 
|---|
| 868 | which writes to the standard output.(1) | 
|---|
| 869 |  | 
|---|
| 870 | The file will be created (or truncated) before the first input | 
|---|
| 871 | line is read; all `w' commands (including instances of the `w' flag on | 
|---|
| 872 | successful `s' commands) which refer to the same FILENAME are | 
|---|
| 873 | output without closing and reopening the file. | 
|---|
| 874 |  | 
|---|
| 875 | `D' | 
|---|
| 876 | Delete text in the pattern space up to the first newline.  If any | 
|---|
| 877 | text is left, restart cycle with the resultant pattern space | 
|---|
| 878 | (without reading a new line of input), otherwise start a normal | 
|---|
| 879 | new cycle. | 
|---|
| 880 |  | 
|---|
| 881 | `N' | 
|---|
| 882 | Add a newline to the pattern space, then append the next line of | 
|---|
| 883 | input to the pattern space.  If there is no more input then `sed' | 
|---|
| 884 | exits without processing any more commands. | 
|---|
| 885 |  | 
|---|
| 886 | `P' | 
|---|
| 887 | Print out the portion of the pattern space up to the first newline. | 
|---|
| 888 |  | 
|---|
| 889 | `h' | 
|---|
| 890 | Replace the contents of the hold space with the contents of the | 
|---|
| 891 | pattern space. | 
|---|
| 892 |  | 
|---|
| 893 | `H' | 
|---|
| 894 | Append a newline to the contents of the hold space, and then | 
|---|
| 895 | append the contents of the pattern space to that of the hold space. | 
|---|
| 896 |  | 
|---|
| 897 | `g' | 
|---|
| 898 | Replace the contents of the pattern space with the contents of the | 
|---|
| 899 | hold space. | 
|---|
| 900 |  | 
|---|
| 901 | `G' | 
|---|
| 902 | Append a newline to the contents of the pattern space, and then | 
|---|
| 903 | append the contents of the hold space to that of the pattern space. | 
|---|
| 904 |  | 
|---|
| 905 | `x' | 
|---|
| 906 | Exchange the contents of the hold and pattern spaces. | 
|---|
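Several of the commands above are typically used together.  The following sketch, assuming GNU `sed' (the one-line form of `a' is a GNU extension), shows a few common idioms; the function names are made up for illustration.

```shell
# `y': transliterate a, b, c into x, y, z.
translit_abc() { sed 'y/abc/xyz/'; }

# `G', `h' and `p': build the input in reverse order in the hold
# space and print it once the last line has been read.
reverse_lines() { sed -n '1!G;h;$p'; }

# `N': append the next line, then turn the embedded newline into a space.
join_pairs() { sed 'N;s/\n/ /'; }

# `=': print the line number of the last line, that is, count lines.
count_lines() { sed -n '$='; }

# One-line `a': queue a line of text to be output after line 1.
add_after_first() { sed '1a inserted'; }
```

Note that `join_pairs' relies on an even number of input lines; with an odd count the behavior of `N' on the last line applies as described above.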
| 907 |  | 
|---|
| 908 |  | 
|---|
| 909 | ---------- Footnotes ---------- | 
|---|
| 910 |  | 
|---|
| 911 | (1) This is equivalent to `p' unless the `-i' option is being used. | 
|---|
| 912 |  | 
|---|
| 913 |  | 
|---|
| 914 | File: sed.info,  Node: Programming Commands,  Next: Extended Commands,  Prev: Other Commands,  Up: sed Programs | 
|---|
| 915 |  | 
|---|
| 916 | Commands for `sed' gurus | 
|---|
| 917 | ======================== | 
|---|
| 918 |  | 
|---|
| 919 | In most cases, use of these commands indicates that you are probably | 
|---|
| 920 | better off programming in something like `awk' or Perl.  But | 
|---|
| 921 | occasionally one is committed to sticking with `sed', and these | 
|---|
| 922 | commands can enable one to write quite convoluted scripts. | 
|---|
| 923 |  | 
|---|
| 924 | `: LABEL' | 
|---|
| 925 | [No addresses allowed.] | 
|---|
| 926 |  | 
|---|
| 927 | Specify the location of LABEL for branch commands.  In all other | 
|---|
| 928 | respects, a no-op. | 
|---|
| 929 |  | 
|---|
| 930 | `b LABEL' | 
|---|
| 931 | Unconditionally branch to LABEL.  The LABEL may be omitted, in | 
|---|
| 932 | which case the next cycle is started. | 
|---|
| 933 |  | 
|---|
| 934 | `t LABEL' | 
|---|
| 935 | Branch to LABEL only if there has been a successful `s'ubstitution | 
|---|
| 936 | since the last input line was read or conditional branch was taken. | 
|---|
| 937 | The LABEL may be omitted, in which case the next cycle is started. | 
|---|
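As an illustration of labels and branches, here is a minimal sketch with two hypothetical helper functions.  Each part of the script is passed with its own `-e' so that every label ends at a newline.

```shell
# `:' and `t': turn each leading space into a dot, one per iteration;
# the loop ends as soon as the substitution fails.
dot_indent() { sed -e ':a' -e 's/^\(\.*\) /\1./' -e 'ta'; }

# `b' without a label: skip the rest of the script for comment lines,
# so only the other lines are edited.
skip_comments() { sed -e '/^#/b' -e 's/foo/bar/'; }
```

In `dot_indent', each successful `s' sets the flag that `t' tests, and taking the branch resets it, which is what makes the loop terminate.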
| 938 |  | 
|---|
| 939 |  | 
|---|
| 940 |  | 
|---|
| 941 | File: sed.info,  Node: Extended Commands,  Next: Escapes,  Prev: Programming Commands,  Up: sed Programs | 
|---|
| 942 |  | 
|---|
| 943 | Commands Specific to GNU `sed' | 
|---|
| 944 | ============================== | 
|---|
| 945 |  | 
|---|
| 946 | These commands are specific to GNU `sed', so you must use them with | 
|---|
| 947 | care and only when you are sure that hindering portability is not evil. | 
|---|
| 948 | They allow you to check for GNU `sed' extensions or to do tasks that | 
|---|
| 949 | are required quite often, yet are unsupported by standard `sed's. | 
|---|
| 950 |  | 
|---|
| 951 | `e [COMMAND]' | 
|---|
| 952 | This command allows one to pipe input from a shell command into | 
|---|
| 953 | pattern space.  Without parameters, the `e' command executes the | 
|---|
| 954 | command that is found in pattern space and replaces the pattern | 
|---|
| 955 | space with the output; a trailing newline is suppressed. | 
|---|
| 956 |  | 
|---|
| 957 | If a parameter is specified, instead, the `e' command interprets | 
|---|
| 958 | it as a command and sends its output to the output stream (like | 
|---|
| 959 | `r' does).  The command can run across multiple lines, all but the | 
|---|
| 960 | last ending with a back-slash. | 
|---|
| 961 |  | 
|---|
| 962 | In both cases, the results are undefined if the command to be | 
|---|
| 963 | executed contains a NUL character. | 
|---|
| 964 |  | 
|---|
| 965 | `L N' | 
|---|
| 966 | This GNU `sed' extension fills and joins lines in pattern space to | 
|---|
| 967 | produce output lines of (at most) N characters, like `fmt' does; | 
|---|
| 968 | if N is omitted, the default as specified on the command line is | 
|---|
| 969 | used.  This command is considered a failed experiment and, unless | 
|---|
| 970 | there is enough demand (which seems unlikely), will be removed in | 
|---|
| 971 | future versions. | 
|---|
| 972 |  | 
|---|
| 973 | `Q [EXIT-CODE]' | 
|---|
| 974 | This command only accepts a single address. | 
|---|
| 975 |  | 
|---|
| 976 | This command is the same as `q', but will not print the contents | 
|---|
| 977 | of pattern space.  Like `q', it provides the ability to return an | 
|---|
| 978 | exit code to the caller. | 
|---|
| 979 |  | 
|---|
| 980 | This command can be useful because the only alternative ways to | 
|---|
| 981 | accomplish this apparently trivial function are to use the `-n' | 
|---|
| 982 | option (which can unnecessarily complicate your script) or | 
|---|
| 983 | resorting to the following snippet, which wastes time by reading | 
|---|
| 984 | the whole file without any visible effect: | 
|---|
| 985 |  | 
|---|
| 986 | :eat | 
|---|
| 987 | $d       Quit silently on the last line | 
|---|
| 988 | N        Read another line, silently | 
|---|
| 989 | g        Overwrite pattern space each time to save memory | 
|---|
| 990 | b eat | 
|---|
| 991 |  | 
|---|
| 992 | `R FILENAME' | 
|---|
| 993 | Queue a line of FILENAME to be read and inserted into the output | 
|---|
| 994 | stream at the end of the current cycle, or when the next input | 
|---|
| 995 | line is read.  Note that if FILENAME cannot be read, or if its end | 
|---|
| 996 | is reached, no line is appended, without any error indication. | 
|---|
| 997 |  | 
|---|
| 998 | As with the `r' command, the special value `/dev/stdin' is | 
|---|
| 999 | supported for the file name, which reads a line from the standard | 
|---|
| 1000 | input. | 
|---|
| 1001 |  | 
|---|
| 1002 | `T LABEL' | 
|---|
| 1003 | Branch to LABEL only if there have been no successful | 
|---|
| 1004 | `s'ubstitutions since the last input line was read or conditional | 
|---|
| 1005 | branch was taken. The LABEL may be omitted, in which case the next | 
|---|
| 1006 | cycle is started. | 
|---|
| 1007 |  | 
|---|
| 1008 | `v VERSION' | 
|---|
| 1009 | This command does nothing, but makes `sed' fail if GNU `sed' | 
|---|
| 1010 | extensions are not supported, simply because other versions of | 
|---|
| 1011 | `sed' do not implement it.  In addition, you can specify the | 
|---|
| 1012 | version of `sed' that your script requires, such as `4.0.5'.  The | 
|---|
| 1013 | default is `4.0' because that is the first version that | 
|---|
| 1014 | implemented this command. | 
|---|
| 1015 |  | 
|---|
| 1016 | This command enables all GNU extensions even if `POSIXLY_CORRECT' | 
|---|
| 1017 | is set in the environment. | 
|---|
| 1018 |  | 
|---|
| 1019 | `W FILENAME' | 
|---|
| 1020 | Write to the given filename the portion of the pattern space up to | 
|---|
| 1021 | the first newline.  Everything said under the `w' command about | 
|---|
| 1022 | file handling holds here too. | 
|---|
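The difference between `q' and `Q', and the use of `T', can be sketched as follows.  These commands require GNU `sed'; the function names and the `END' marker are hypothetical.

```shell
# `Q': stop at the marker line without printing it.
before_marker() { sed '/^END$/Q'; }

# `q': stop at the marker line, but still print it.
through_marker() { sed '/^END$/q'; }

# `T' without a label: when the substitution did not happen, skip the
# rest of the script, so only changed lines receive the suffix.
mark_changed() { sed -e 's/foo/bar/' -e 'T' -e 's/$/ (changed)/'; }
```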
| 1023 |  | 
|---|
| 1024 |  | 
|---|
| 1025 | File: sed.info,  Node: Escapes,  Prev: Extended Commands,  Up: sed Programs | 
|---|
| 1026 |  | 
|---|
| 1027 | GNU Extensions for Escapes in Regular Expressions | 
|---|
| 1028 | ================================================= | 
|---|
| 1029 |  | 
|---|
| 1030 | Until this chapter, we have only encountered escapes of the form | 
|---|
| 1031 | `\^', which tell `sed' not to interpret the circumflex as a special | 
|---|
| 1032 | character, but rather to take it literally.  For example, `\*' matches | 
|---|
| 1033 | a single asterisk rather than zero or more backslashes. | 
|---|
| 1034 |  | 
|---|
| 1035 | This chapter introduces another kind of escape(1)--that is, escapes | 
|---|
| 1036 | that are applied to a character or sequence of characters that | 
|---|
| 1037 | ordinarily are taken literally, and that `sed' replaces with a special | 
|---|
| 1038 | character.  This provides a way of encoding non-printable characters in | 
|---|
| 1039 | patterns in a visible manner.  There is no restriction on the | 
|---|
| 1040 | appearance of non-printing characters in a `sed' script but when a | 
|---|
| 1041 | script is being prepared in the shell or by text editing, it is usually | 
|---|
| 1042 | easier to use one of the following escape sequences than the binary | 
|---|
| 1043 | character it represents. | 
|---|
| 1044 |  | 
|---|
| 1045 | The list of these escapes is: | 
|---|
| 1046 |  | 
|---|
| 1047 | `\a' | 
|---|
| 1048 | Produces or matches a BEL character, that is an "alert" (ASCII 7). | 
|---|
| 1049 |  | 
|---|
| 1050 | `\f' | 
|---|
| 1051 | Produces or matches a form feed (ASCII 12). | 
|---|
| 1052 |  | 
|---|
| 1053 | `\n' | 
|---|
| 1054 | Produces or matches a newline (ASCII 10). | 
|---|
| 1055 |  | 
|---|
| 1056 | `\r' | 
|---|
| 1057 | Produces or matches a carriage return (ASCII 13). | 
|---|
| 1058 |  | 
|---|
| 1059 | `\t' | 
|---|
| 1060 | Produces or matches a horizontal tab (ASCII 9). | 
|---|
| 1061 |  | 
|---|
| 1062 | `\v' | 
|---|
| 1063 | Produces or matches a so-called "vertical tab" (ASCII 11). | 
|---|
| 1064 |  | 
|---|
| 1065 | `\cX' | 
|---|
| 1066 | Produces or matches `CONTROL-X', where X is any character.  The | 
|---|
| 1067 | precise effect of `\cX' is as follows: if X is a lower case | 
|---|
| 1068 | letter, it is converted to upper case.  Then bit 6 of the | 
|---|
| 1069 | character (hex 40) is inverted.  Thus `\cz' becomes hex 1A, but | 
|---|
| 1070 | `\c{' becomes hex 3B, while `\c;' becomes hex 7B. | 
|---|
| 1071 |  | 
|---|
| 1072 | `\dXXX' | 
|---|
| 1073 | Produces or matches a character whose decimal ASCII value is XXX. | 
|---|
| 1074 |  | 
|---|
| 1075 | `\oXXX' | 
|---|
| 1076 | Produces or matches a character whose octal ASCII value is XXX. | 
|---|
| 1077 |  | 
|---|
| 1078 | `\xXX' | 
|---|
| 1079 | Produces or matches a character whose hexadecimal ASCII value is | 
|---|
| 1080 | XX. | 
|---|
| 1081 |  | 
|---|
| 1082 | `\b' (backspace) was omitted because of the conflict with the | 
|---|
| 1083 | existing "word boundary" meaning. | 
|---|
| 1084 |  | 
|---|
| 1085 | Other escapes match a particular character class and are valid only | 
|---|
| 1086 | in regular expressions: | 
|---|
| 1087 |  | 
|---|
| 1088 | `\w' | 
|---|
| 1089 | Matches any "word" character.  A "word" character is any letter or | 
|---|
| 1090 | digit or the underscore character. | 
|---|
| 1091 |  | 
|---|
| 1092 | `\W' | 
|---|
| 1093 | Matches any "non-word" character. | 
|---|
| 1094 |  | 
|---|
| 1095 | `\b' | 
|---|
| 1096 | Matches a word boundary; that is, it matches if the character to | 
|---|
| 1097 | the left is a "word" character and the character to the right is a | 
|---|
| 1098 | "non-word" character, or vice-versa. | 
|---|
| 1099 |  | 
|---|
| 1100 | `\B' | 
|---|
| 1101 | Matches everywhere but on a word boundary; that is, it matches if | 
|---|
| 1102 | the character to the left and the character to the right are | 
|---|
| 1103 | either both "word" characters or both "non-word" characters. | 
|---|
| 1104 |  | 
|---|
| 1105 | `\`' | 
|---|
| 1106 | Matches only at the start of pattern space.  This is different | 
|---|
| 1107 | from `^' in multi-line mode. | 
|---|
| 1108 |  | 
|---|
| 1109 | `\'' | 
|---|
| 1110 | Matches only at the end of pattern space.  This is different from | 
|---|
| 1111 | `$' in multi-line mode. | 
|---|
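A few of these escapes in use, as a minimal sketch.  All of them except `\n' are GNU extensions, so the examples assume GNU `sed'; the function names are invented for illustration.

```shell
# `\t': replace every tab with a single space.
tabs_to_spaces() { sed 's/\t/ /g'; }

# `\b': replace `cat' only where it stands as a whole word.
whole_word() { sed 's/\bcat\b/dog/g'; }

# `\w': wrap every run of word characters in angle brackets.
bracket_words() { sed 's/\w\+/<&>/g'; }

# `\xXX': hex 41 is the letter `A'.
hex_match() { sed 's/\x41/a/g'; }
```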
| 1112 |  | 
|---|
| 1113 |  | 
|---|
| 1114 | ---------- Footnotes ---------- | 
|---|
| 1115 |  | 
|---|
| 1116 | (1) All the escapes introduced here are GNU extensions, with the | 
|---|
| 1117 | exception of `\n'.  In basic regular expression mode, setting | 
|---|
| 1118 | `POSIXLY_CORRECT' disables them inside bracket expressions. | 
|---|
| 1119 |  | 
|---|
| 1120 |  | 
|---|
| 1121 | File: sed.info,  Node: Examples,  Next: Limitations,  Prev: sed Programs,  Up: Top | 
|---|
| 1122 |  | 
|---|
| 1123 | Some Sample Scripts | 
|---|
| 1124 | ******************* | 
|---|
| 1125 |  | 
|---|
| 1126 | Here are some `sed' scripts to guide you in the art of mastering | 
|---|
| 1127 | `sed'. | 
|---|
| 1128 |  | 
|---|
| 1129 | * Menu: | 
|---|
| 1130 |  | 
|---|
| 1131 | Some exotic examples: | 
|---|
| 1132 | * Centering lines:: | 
|---|
| 1133 | * Increment a number:: | 
|---|
| 1134 | * Rename files to lower case:: | 
|---|
| 1135 | * Print bash environment:: | 
|---|
| 1136 | * Reverse chars of lines:: | 
|---|
| 1137 |  | 
|---|
| 1138 | Emulating standard utilities: | 
|---|
| 1139 | * tac::                             Reverse lines of files | 
|---|
| 1140 | * cat -n::                          Numbering lines | 
|---|
| 1141 | * cat -b::                          Numbering non-blank lines | 
|---|
| 1142 | * wc -c::                           Counting chars | 
|---|
| 1143 | * wc -w::                           Counting words | 
|---|
| 1144 | * wc -l::                           Counting lines | 
|---|
| 1145 | * head::                            Printing the first lines | 
|---|
| 1146 | * tail::                            Printing the last lines | 
|---|
| 1147 | * uniq::                            Make duplicate lines unique | 
|---|
| 1148 | * uniq -d::                         Print duplicated lines of input | 
|---|
| 1149 | * uniq -u::                         Remove all duplicated lines | 
|---|
| 1150 | * cat -s::                          Squeezing blank lines | 
|---|
| 1151 |  | 
|---|
| 1152 |  | 
|---|
| 1153 | File: sed.info,  Node: Centering lines,  Next: Increment a number,  Up: Examples | 
|---|
| 1154 |  | 
|---|
| 1155 | Centering Lines | 
|---|
| 1156 | =============== | 
|---|
| 1157 |  | 
|---|
| 1158 | This script centers all lines of a file in an 80-column width.  To | 
|---|
| 1159 | change that width, the number in `\{...\}' must be replaced, and the | 
|---|
| 1160 | number of added spaces also must be changed. | 
|---|
| 1161 |  | 
|---|
| 1162 | Note how the buffer commands are used to separate parts in the | 
|---|
| 1163 | regular expressions to be matched--this is a common technique. | 
|---|
| 1164 |  | 
|---|
| 1165 | #!/usr/bin/sed -f | 
|---|
| 1166 |  | 
|---|
| 1167 | # Put 80 spaces in the buffer | 
|---|
| 1168 | 1 { | 
|---|
| 1169 | x | 
|---|
| 1170 | s/^$/          / | 
|---|
| 1171 | s/^.*$/&&&&&&&&/ | 
|---|
| 1172 | x | 
|---|
| 1173 | } | 
|---|
| 1174 |  | 
|---|
| 1175 | # del leading and trailing spaces | 
|---|
| 1176 | y/tab/ / | 
|---|
| 1177 | s/^ *// | 
|---|
| 1178 | s/ *$// | 
|---|
| 1179 |  | 
|---|
| 1180 | # add a newline and 80 spaces to end of line | 
|---|
| 1181 | G | 
|---|
| 1182 |  | 
|---|
| 1183 | # keep first 81 chars (80 + a newline) | 
|---|
| 1184 | s/^\(.\{81\}\).*$/\1/ | 
|---|
| 1185 |  | 
|---|
| 1186 | # \2 matches half of the spaces, which are moved to the beginning | 
|---|
| 1187 | s/^\(.*\)\n\(.*\)\2/\2\1/ | 
|---|
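To show what changing the width involves, here is a sketch of a 40-column variant wrapped in a shell function.  It assumes GNU `sed'; the name `center40' is hypothetical, and the 40 spaces are built as 1 x 10 x 4 rather than typed literally, which is a small departure from the script above.

```shell
# Center each input line within 40 columns.
#  - on line 1, fill the hold space with 40 spaces (1 space, x10, x4)
#  - turn tabs into spaces, then strip leading and trailing spaces
#  - append a newline plus the 40 spaces, keep 41 characters
#  - move half of the remaining spaces in front of the text
center40() {
    sed -e '1{' -e 'x' \
        -e 's/^$/ /' \
        -e 's/^.*$/&&&&&&&&&&/' \
        -e 's/^.*$/&&&&/' \
        -e 'x' -e '}' \
        -e 'y/\t/ /' \
        -e 's/^ *//' -e 's/ *$//' \
        -e 'G' \
        -e 's/^\(.\{41\}\).*$/\1/' \
        -e 's/^\(.*\)\n\(.*\)\2/\2\1/'
}
```

A two-character line such as `hi' ends up preceded by 19 spaces, which is half of the 38 columns left over.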
| 1188 |  | 
|---|
| 1189 |  | 
|---|
| 1190 | File: sed.info,  Node: Increment a number,  Next: Rename files to lower case,  Prev: Centering lines,  Up: Examples | 
|---|
| 1191 |  | 
|---|
| 1192 | Increment a Number | 
|---|
| 1193 | ================== | 
|---|
| 1194 |  | 
|---|
| 1195 | This script is one of a few that demonstrate how to do arithmetic in | 
|---|
| 1196 | `sed'.  This is indeed possible,(1) but must be done manually. | 
|---|
| 1197 |  | 
|---|
| 1198 | To increment a number, add 1 to its last digit, replacing it with | 
|---|
| 1199 | the next digit.  There is one exception: when that digit is a nine, | 
|---|
| 1200 | it becomes a zero and the preceding digit must be incremented as well, | 
|---|
| 1201 | repeating until a digit other than nine is reached. | 
|---|
| 1202 |  | 
|---|
| 1203 | This solution by Bruno Haible is clever because it | 
|---|
| 1204 | uses a single buffer; if you don't have this limitation, the algorithm | 
|---|
| 1205 | used in *Note Numbering lines: cat -n, is faster.  It works by | 
|---|
| 1206 | replacing trailing nines with underscores, then using multiple `s' | 
|---|
| 1207 | commands to increment the last digit, and finally replacing the | 
|---|
| 1208 | underscores with zeros. | 
|---|
| 1209 |  | 
|---|
| 1210 | #!/usr/bin/sed -f | 
|---|
| 1211 |  | 
|---|
| 1212 | /[^0-9]/ d | 
|---|
| 1213 |  | 
|---|
| 1214 | # replace all trailing 9s by _ (any other character except digits could | 
|---|
| 1215 | # be used) | 
|---|
| 1216 | :d | 
|---|
| 1217 | s/9\(_*\)$/_\1/ | 
|---|
| 1218 | td | 
|---|
| 1219 |  | 
|---|
| 1220 | # increment the last digit only.  The first line adds a most-significant | 
|---|
| 1221 | # digit of 1 if we have to add a digit. | 
|---|
| 1222 | # | 
|---|
| 1223 | # The `tn' commands are not necessary, but they make the script | 
|---|
| 1224 | # faster | 
|---|
| 1225 |  | 
|---|
| 1226 | s/^\(_*\)$/1\1/; tn | 
|---|
| 1227 | s/8\(_*\)$/9\1/; tn | 
|---|
| 1228 | s/7\(_*\)$/8\1/; tn | 
|---|
| 1229 | s/6\(_*\)$/7\1/; tn | 
|---|
| 1230 | s/5\(_*\)$/6\1/; tn | 
|---|
| 1231 | s/4\(_*\)$/5\1/; tn | 
|---|
| 1232 | s/3\(_*\)$/4\1/; tn | 
|---|
| 1233 | s/2\(_*\)$/3\1/; tn | 
|---|
| 1234 | s/1\(_*\)$/2\1/; tn | 
|---|
| 1235 | s/0\(_*\)$/1\1/; tn | 
|---|
| 1236 |  | 
|---|
| 1237 | :n | 
|---|
| 1238 | y/_/0/ | 
|---|
| 1239 |  | 
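To make the three phases of the script visible, the following sketch runs each one as a separate `sed' call on the sample number 1299.  The variable names `stage1', `stage2' and `stage3' are illustrative only, and the second phase uses just the one rule that this input needs.

```shell
# Stage 1: turn the trailing run of 9s into underscores (the :d / td loop).
stage1=$(echo 1299 | sed -e ':d' -e 's/9\(_*\)$/_\1/' -e 'td')

# Stage 2: increment the last remaining digit; for this input only the
# rule that turns 2 into 3 applies.
stage2=$(echo "$stage1" | sed 's/2\(_*\)$/3\1/')

# Stage 3: turn the underscores into zeros.
stage3=$(echo "$stage2" | sed 'y/_/0/')

echo "$stage1 -> $stage2 -> $stage3"
```

This should print `12__ -> 13__ -> 1300'.  The full script does the same work in one pass, using the `tn' branches to skip the rules that no longer apply.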
|---|
| 1240 | ---------- Footnotes ---------- | 
|---|
| 1241 |  | 
|---|
| 1242 | (1) `sed' guru Greg Ubben wrote an implementation of the `dc' RPN | 
|---|
| 1243 | calculator!  It is distributed together with sed. | 
|---|
| 1244 |  | 
|---|
| 1245 |  | 
|---|
| 1246 | File: sed.info,  Node: Rename files to lower case,  Next: Print bash environment,  Prev: Increment a number,  Up: Examples | 
|---|
| 1247 |  | 
|---|
| 1248 | Rename Files to Lower Case | 
|---|
| 1249 | ========================== | 
|---|
| 1250 |  | 
|---|
| 1251 | This is a pretty strange use of `sed'.  We transform text into | 
|---|
| 1252 | shell commands, then feed them to the shell.  Don't | 
|---|
| 1253 | worry, even worse hacks are done when using `sed'; I have seen a script | 
|---|
| 1254 | converting the output of `date' into a `bc' program! | 
|---|
| 1255 |  | 
|---|
| 1256 | The main body of this is the `sed' script, which remaps the name | 
|---|
| 1257 | from lower to upper case (or vice versa) and checks whether the remapped | 
|---|
| 1258 | name is the same as the original name.  Note how the script is | 
|---|
| 1259 | parameterized using shell variables and proper quoting. | 
|---|
| 1260 |  | 
|---|
| 1261 | #! /bin/sh | 
|---|
| 1262 | # rename files to lower/upper case... | 
|---|
| 1263 | # | 
|---|
| 1264 | # usage: | 
|---|
| 1265 | #    move-to-lower * | 
|---|
| 1266 | #    move-to-upper * | 
|---|
| 1267 | # or | 
|---|
| 1268 | #    move-to-lower -R . | 
|---|
| 1269 | #    move-to-upper -R . | 
|---|
| 1270 | # | 
|---|
| 1271 |  | 
|---|
| 1272 | help() | 
|---|
| 1273 | { | 
|---|
| 1274 | cat << eof | 
|---|
| 1275 | Usage: $0 [-n] [-R] [-h] files... | 
|---|
| 1276 |  | 
|---|
| 1277 | -n      do nothing, only see what would be done | 
|---|
| 1278 | -R      recursive (use find) | 
|---|
| 1279 | -h      this message | 
|---|
| 1280 | files   files to remap to lower case | 
|---|
| 1281 |  | 
|---|
| 1282 | Examples: | 
|---|
| 1283 | $0 -n *        (see if everything is ok, then...) | 
|---|
| 1284 | $0 * | 
|---|
| 1285 |  | 
|---|
| 1286 | $0 -R . | 
|---|
| 1287 |  | 
|---|
| 1288 | eof | 
|---|
| 1289 | } | 
|---|
| 1290 |  | 
|---|
| 1291 | apply_cmd='sh' | 
|---|
| 1292 | finder='echo "$@" | tr " " "\n"' | 
|---|
| 1293 | files_only= | 
|---|
| 1294 |  | 
|---|
| 1295 | while : | 
|---|
| 1296 | do | 
|---|
| 1297 | case "$1" in | 
|---|
| 1298 | -n) apply_cmd='cat' ;; | 
|---|
| 1299 | -R) finder='find "$@" -type f';; | 
|---|
| 1300 | -h) help ; exit 1 ;; | 
|---|
| 1301 | *) break ;; | 
|---|
| 1302 | esac | 
|---|
| 1303 | shift | 
|---|
| 1304 | done | 
|---|
| 1305 |  | 
|---|
| 1306 | if [ -z "$1" ]; then | 
|---|
| 1307 | echo "Usage: $0 [-h] [-n] [-R] files..." | 
|---|
| 1308 | exit 1 | 
|---|
| 1309 | fi | 
|---|
| 1310 |  | 
|---|
| 1311 | LOWER='abcdefghijklmnopqrstuvwxyz' | 
|---|
| 1312 | UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ' | 
|---|
| 1313 |  | 
|---|
| 1314 | case `basename "$0"` in | 
|---|
| 1315 | *upper*) TO=$UPPER; FROM=$LOWER ;; | 
|---|
| 1316 | *)       FROM=$UPPER; TO=$LOWER ;; | 
|---|
| 1317 | esac | 
|---|
| 1318 |  | 
|---|
| 1319 | eval $finder | sed -n ' | 
|---|
| 1320 |  | 
|---|
| 1321 | # remove all trailing slashes | 
|---|
| 1322 | s/\/*$// | 
|---|
| 1323 |  | 
|---|
| 1324 | # add ./ if there is no path, only a filename | 
|---|
| 1325 | /\//! s/^/.\// | 
|---|
| 1326 |  | 
|---|
| 1327 | # save path+filename | 
|---|
| 1328 | h | 
|---|
| 1329 |  | 
|---|
| 1330 | # remove path | 
|---|
| 1331 | s/.*\/// | 
|---|
| 1332 |  | 
|---|
| 1333 | # do conversion only on filename | 
|---|
| 1334 | y/'$FROM'/'$TO'/ | 
|---|
| 1335 |  | 
|---|
| 1336 | # now line contains original path+file, while | 
|---|
| 1337 | # hold space contains the new filename | 
|---|
| 1338 | x | 
|---|
| 1339 |  | 
|---|
| 1340 | # add converted file name to line, which now contains | 
|---|
| 1341 | # path/file-name\nconverted-file-name | 
|---|
| 1342 | G | 
|---|
| 1343 |  | 
|---|
| 1344 | # check if converted file name is equal to original file name, | 
|---|
| 1345 | # if it is, do not print anything | 
|---|
| 1346 | /^.*\/\(.*\)\n\1/b | 
|---|
| 1347 |  | 
|---|
| 1348 | # now, transform path/fromfile\ntofile into | 
|---|
| 1349 | # mv path/fromfile path/tofile and print it | 
|---|
| 1350 | s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p | 
|---|
| 1351 |  | 
|---|
| 1352 | ' | $apply_cmd | 
|---|
| 1353 |  | 
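To see what the `sed' part generates before anything is handed to the shell, the core of the script can be run on its own.  This is a minimal sketch: the function name `to_lower_cmd' and the sample file names are hypothetical, and the direction is fixed to lower case.

```shell
# to_lower_cmd is a hypothetical helper: it reads file names on stdin and
# prints the mv commands that would rename them to lower case.
to_lower_cmd() {
    FROM='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
    TO='abcdefghijklmnopqrstuvwxyz'
    sed -n '
# remove trailing slashes, and add ./ if there is no path
s/\/*$//
/\//! s/^/.\//
# save path+filename, then convert the filename only
h
s/.*\///
y/'$FROM'/'$TO'/
# pattern space becomes path/file-name\nconverted-file-name
x
G
# nothing to do if the name is unchanged
/^.*\/\(.*\)\n\1/b
s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
'
}

printf '%s\n' 'docs/README.TXT' 'notes.txt' | to_lower_cmd
```

For this input, only one command should be printed, `mv "docs/README.TXT" "docs/readme.txt"'; `notes.txt' is already lower case, so it is skipped.  This matches what the `-n' option of the full script shows by sending the commands to `cat' instead of `sh'.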
|---|