Changeset 3613 for trunk/src/sed/doc

Timestamp:

Sep 19, 2024, 2:34:43 AM

Author:

bird

Message:

src/sed: Merged in changes between 4.1.5 and 4.9 from the vendor branch. (svn merge ^/vendor/sed/4.1.5 ^/vendor/sed/current .)

Location:

trunk/src/sed

Files:

6 deleted
8 edited
4 copied

. (modified) (1 prop)
doc/Makefile.am (deleted)
doc/Makefile.in (deleted)
doc/config.texi (modified) (1 diff)
doc/dummy-man (copied) (copied from vendor/sed/current/doc/dummy-man )
doc/fdl.texi (copied) (copied from vendor/sed/current/doc/fdl.texi )
doc/groupify.sed (deleted)
doc/local.mk (copied) (copied from vendor/sed/current/doc/local.mk )
doc/sed-dummy.1 (copied) (copied from vendor/sed/current/doc/sed-dummy.1 )
doc/sed-in.texi (deleted)
doc/sed.1 (modified) (18 diffs)
doc/sed.info (modified) (2 diffs)
doc/sed.info-1 (deleted)
doc/sed.info-2 (deleted)
doc/sed.texi (modified) (73 diffs)
doc/sed.x (modified) (15 diffs)
doc/stamp-vti (modified) (1 diff)
doc/version.texi (modified) (1 diff)

trunk/src/sed
- Property svn:mergeinfo set to
  /vendor/sed/current merged eligible

trunk/src/sed/doc/config.texi

-              r599
+              r3613
 @clear PERL
+@set SSEDEXT @acronym{GNU} extensions
+@set SSED @acronym{GNU} @command{sed}
+@set SSEDEXT GNU extensions
+@set SSED GNU @command{sed}
+@c Ugly hack to enable using new texinfo commands '@codequotebacktick'
+@c and '@codequoteundirected' or define empty fallbacks if they are
+@c not available.
+@ifclear txicommandconditionals
+@c If we got here, this is a REALLY old texinfo (pre 5.0),
+@c and '@ifcommandnotdefined' is not defined.
+@c Assume these commands are not defined as well.
+@macro codequotebacktick
+@end macro
+@macro codequoteundirected
+@end macro
+@end ifclear
+@ifset txicommandconditionals
+@c if we got here, this texinfo supports checking for defined
+@c commands. If these commands aren't available - define empty
+@c fallbacks.
+@ifcommandnotdefined codequotebacktick
+@macro codequotebacktick
+@end macro
+@macro codequoteundirected
+@end macro
+@end ifcommandnotdefined
+@end ifset
+@c define variables that will render as characters
+@c on both HTML (with @U{}) and PDF (with greek symbols).
+@c Use with: @value{ucsigma}
+@c
+@c Based on:
+@c https://lists.gnu.org/archive/html/help-texinfo/2012-06/msg00004.html
+@iftex
+@set ucsigma @math{@Sigma{}}
+@end iftex
+@ifnottex
+@set ucsigma @U{03A3}
+@end ifnottex
+@iftex
+@set lcsigma @math{@sigma{}}
+@end iftex
+@ifnottex
+@set lcsigma @U{03C3}
+@end ifnottex
+@c Unicode Replacement Character (U+FFFD):
+@c no easy/portable tex equivalent, so use another
+@c distinct symbol (which will be rendered very differently
+@c than ascii characters in @examples.
+@iftex
+@set unicodeFFFD @math{@otimes{}}
+@end iftex
+@ifnottex
+@set unicodeFFFD @U{FFFD}
+@end ifnottex

trunk/src/sed/doc/sed.1

-              r599
+              r3613
 .\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.28.
 .TH SED "1" "February 2006" "sed version 4.1.4" "User Commands"
+.\" DO NOT MODIFY THIS FILE!  It was generated by help2man 1.48.5.
+.TH SED "1" "November 2022" "GNU sed 4.9" "User Commands"
 .SH NAME
 sed \- stream editor for filtering and transforming text
 .SH SYNOPSIS
+.B sed
+[\fIOPTION\fR]... \fI{script-only-if-no-other-script} \fR[\fIinput-file\fR]...
+.nf
+sed [-V] [--version] [--help] [-n] [--quiet] [--silent]
+    [-l N] [--line-length=N] [-u] [--unbuffered]
+    [-E] [-r] [--regexp-extended]
+    [-e script] [--expression=script]
+    [-f script-file] [--file=script-file]
+    [script-if-no-other-script]
+    [file...]
+.fi
 .SH DESCRIPTION
 .ds sd \fIsed\fP
 …
 suppress automatic printing of pattern space
 .HP
+\fB\-e\fR script, \fB\-\-expression\fR=\fIscript\fR
+\fB\-\-debug\fR
+.IP
+annotate program execution
+.HP
+\fB\-e\fR script, \fB\-\-expression\fR=\fI\,script\/\fR
 .IP
 add the script to the commands to be executed
 .HP
+\fB\-f\fR script-file, \fB\-\-file\fR=\fIscript\-file\fR
+.IP
+add the contents of script-file to the commands to be executed
+.HP
+\fB\-i[SUFFIX]\fR, \fB\-\-in\-place\fR[=\fISUFFIX\fR]
+.IP
+edit files in place (makes backup if extension supplied)
+.HP
+\fB\-l\fR N, \fB\-\-line\-length\fR=\fIN\fR
+.IP
+specify the desired line-wrap length for the `l' command
+\fB\-f\fR script\-file, \fB\-\-file\fR=\fI\,script\-file\/\fR
+.IP
+add the contents of script\-file to the commands to be executed
+.HP
+\fB\-\-follow\-symlinks\fR
+.IP
+follow symlinks when processing in place
+.HP
+\fB\-i[SUFFIX]\fR, \fB\-\-in\-place\fR[=\fI\,SUFFIX\/\fR]
+.IP
+edit files in place (makes backup if SUFFIX supplied)
+.HP
+\fB\-l\fR N, \fB\-\-line\-length\fR=\fI\,N\/\fR
+.IP
+specify the desired line\-wrap length for the `l' command
 .HP
 \fB\-\-posix\fR
 …
 disable all GNU extensions.
 .HP
+\fB\-r\fR, \fB\-\-regexp\-extended\fR
+.IP
+use extended regular expressions in the script.
+\fB\-E\fR, \fB\-r\fR, \fB\-\-regexp\-extended\fR
+.IP
+use extended regular expressions in the script
+(for portability use POSIX \fB\-E\fR).
 .HP
 \fB\-s\fR, \fB\-\-separate\fR
 .IP
+consider files as separate rather than as a single continuous
+long stream.
+consider files as separate rather than as a single,
+continuous long stream.
+.HP
+\fB\-\-sandbox\fR
+.IP
+operate in sandbox mode (disable e/r/w commands).
 .HP
 \fB\-u\fR, \fB\-\-unbuffered\fR
 …
 load minimal amounts of data from the input files and flush
 the output buffers more often
+.HP
+\fB\-z\fR, \fB\-\-null\-data\fR
+.IP
+separate lines by NUL characters
 .TP
 \fB\-\-help\fR
 …
 .PP
 If no \fB\-e\fR, \fB\-\-expression\fR, \fB\-f\fR, or \fB\-\-file\fR option is given, then the first
 non-option argument is taken as the sed script to interpret.  All
+non\-option argument is taken as the sed script to interpret.  All
 remaining arguments are names of input files; if no input files are
 specified, then the standard input is read.
 .PP
+E-mail bug reports to: bonzini@gnu.org .
+Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
+GNU sed home page: <https://www.gnu.org/software/sed/>.
+General help using GNU software: <https://www.gnu.org/gethelp/>.
+E\-mail bug reports to: <bug\-sed@gnu.org>.
 .SH "COMMAND SYNOPSIS"
 This is just a brief synopsis of \*(sd commands to serve as
 …
 .RI # comment
 The comment extends until the next newline (or the end of a
 .B -e
+.B \-e
 script fragment).
 .TP
 …
 which has each embedded newline preceded by a backslash.
 .TP
+q
+q [\fIexit-code\fR]
 Immediately quit the \*(sd script without processing
 any more input,
+except that if auto-print is not disabled
 the current pattern space will be printed.
 .TP
+Q
+any more input, except that if auto-print is not disabled
+the current pattern space will be printed.  The exit code
+argument is a GNU extension.
+.TP
+Q [\fIexit-code\fR]
 Immediately quit the \*(sd script without processing
 any more input.
+any more input.  This is a GNU extension.
 .TP
 .RI r\  filename
 …
 Append a line read from
 .IR filename .
+Each invocation of the command reads a line from the file.
+This is a GNU extension.
 .SS
 Commands which accept address ranges
 …
 is omitted, branch to end of script.
 .TP
-.RI t\  label
-If a s/// has done a successful substitution since the
-last input line was read and since the last t or T
-command, then branch to
-.IR label ;
-if
-.I label
-is omitted, branch to end of script.
-.TP
-.RI T\  label
-If no s/// has done a successful substitution since the
-last input line was read and since the last t or T
-command, then branch to
-.IR label ;
-if
-.I label
-is omitted, branch to end of script.
-.TP
 c \e
 .TP
 …
 .TP
+D
+Delete up to the first embedded newline in the pattern space.
+Start next cycle, but skip reading from the input
+if there is still data in the pattern space.
+If pattern space contains no newline, start a normal new cycle as if
+the d command was issued.  Otherwise, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
 .TP
 h H
 …
 Copy/append hold space to pattern space.
 .TP
+x
-Exchange the contents of the hold and pattern spaces.
-.TP
+l
 List out the current line in a ``visually unambiguous'' form.
+.TP
+.RI l\  width
+List out the current line in a ``visually unambiguous'' form,
+breaking it at
+.I width
+characters.  This is a GNU extension.
 .TP
 n N
 …
 .IR regexp .
 .TP
+.RI t\  label
+If a s/// has done a successful substitution since the
+last input line was read and since the last t or T
+command, then branch to
+.IR label ;
+if
+.I label
+is omitted, branch to end of script.
+.TP
+.RI T\  label
+If no s/// has done a successful substitution since the
+last input line was read and since the last t or T
+command, then branch to
+.IR label ;
+if
+.I label
+is omitted, branch to end of script.  This is a GNU
+extension.
+.TP
 .RI w\  filename
 Write the current pattern space to
 …
 Write the first line of the current pattern space to
 .IR filename .
+This is a GNU extension.
+.TP
+x
+Exchange the contents of the hold and pattern spaces.
 .TP
 .RI y/ source / dest /
 …
 .I number
 Match only the specified line
+.IR number .
+.IR number
+(which increments cumulatively across files, unless the
+.B \-s
+option is specified on the command line).
 .TP
 .IR first ~ step
 …
 line starting with line
 .IR first .
 For example, ``sed -n 1~2p'' will print all the odd-numbered lines in
+For example, ``sed \-n 1~2p'' will print all the odd-numbered lines in
 the input stream, and the address 2~5 will match every fifth line,
+starting with the second. (This is an extension.)
+starting with the second.
+.I first
+can be zero; in this case, \*(sd operates as if it were equal to
+.IR step .
+(This is an extension.)
 .TP
+$
 …
 Match lines matching the regular expression
 .IR regexp .
+Matching is performed on the current pattern space, which
+can be modified with commands such as ``s///''.
 .TP
 .BI \fR\e\fPc regexp c
 …
 .RI 1, addr2
 form will still be at the beginning of its range.
+This works only when
+.I addr2
+is a regular expression.
 .TP
 .IR addr1 ,+ N
 …
 .BR \et ,
 and other sequences.
+The \fI-E\fP option switches to using extended regular expressions instead;
+it has been supported for years by GNU sed, and is now
+included in POSIX.
 .SH BUGS
 .PP
 E-mail bug reports to
+.BR bonzini@gnu.org .
+Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
+Also, please include the output of ``sed --version'' in the body
+.BR bug-sed@gnu.org .
+Also, please include the output of ``sed \-\-version'' in the body
 of your report if at all possible.
+.SH AUTHOR
+Written by Jay Fenlason, Tom Lord, Ken Pizzini,
+Paolo Bonzini, Jim Meyering, and Assaf Gordon.
+.PP
+This sed program was built with SELinux support.
+SELinux is enabled on this system.
+.PP
+GNU sed home page: <https://www.gnu.org/software/sed/>.
+General help using GNU software: <https://www.gnu.org/gethelp/>.
+E\-mail bug reports to: <bug\-sed@gnu.org>.
 .SH COPYRIGHT
+Copyright \(co 2003 Free Software Foundation, Inc.
+Copyright \(co 2022 Free Software Foundation, Inc.
+License GPLv3+: GNU GPL version 3 or later <https://gnu.org/licenses/gpl.html>.
 .br
+This is free software; see the source for copying conditions.  There is NO
+warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE,
+to the extent permitted by law.
+This is free software: you are free to change and redistribute it.
+There is NO WARRANTY, to the extent permitted by law.
 .SH "SEE ALSO"
 .BR awk (1),

trunk/src/sed/doc/sed.info

-              r599
+              r3613
+This is ../../doc/sed.info, produced by makeinfo version 4.5 from
+../../doc/sed.texi.
+This is sed.info, produced by makeinfo version 6.8dev from sed.texi.
+This file documents version 4.9 of GNU ‘sed’, a stream editor.
+   Copyright © 1998–2022 Free Software Foundation, Inc.
+     Permission is granted to copy, distribute and/or modify this
+     document under the terms of the GNU Free Documentation License,
+     Version 1.3 or any later version published by the Free Software
+     Foundation; with no Invariant Sections, no Front-Cover Texts, and
+     no Back-Cover Texts.  A copy of the license is included in the
+     section entitled “GNU Free Documentation License”.
 INFO-DIR-SECTION Text creation and manipulation
 START-INFO-DIR-ENTRY
 …
 END-INFO-DIR-ENTRY
+This file documents version 4.1.5 of GNU `sed', a stream editor.
+   Copyright (C) 1998, 1999, 2001, 2002, 2003, 2004 Free Software
+Foundation, Inc.
+   This document is released under the terms of the GNU Free
+Documentation License as published by the Free Software Foundation;
+either version 1.1, or (at your option) any later version.
+   You should have received a copy of the GNU Free Documentation
+License along with GNU `sed'; see the file `COPYING.DOC'.  If not,
+write to the Free Software Foundation, 59 Temple Place - Suite 330,
+Boston, MA 02110-1301, USA.
+   There are no Cover Texts and no Invariant Sections; this text, along
+with its equivalent in the printed manual, constitutes the Title Page.
+Indirect:
+sed.info-1: 935
+sed.info-2: 50405
+File: sed.info,  Node: Top,  Next: Introduction,  Up: (dir)
+GNU ‘sed’
+*********
+This file documents version 4.9 of GNU ‘sed’, a stream editor.
+   Copyright © 1998–2022 Free Software Foundation, Inc.
+     Permission is granted to copy, distribute and/or modify this
+     document under the terms of the GNU Free Documentation License,
+     Version 1.3 or any later version published by the Free Software
+     Foundation; with no Invariant Sections, no Front-Cover Texts, and
+     no Back-Cover Texts.  A copy of the license is included in the
+     section entitled “GNU Free Documentation License”.
+* Menu:
+* Introduction::               Introduction
+* Invoking sed::               Invocation
+* sed scripts::                ‘sed’ scripts
+* sed addresses::              Addresses: selecting lines
+* sed regular expressions::    Regular expressions: selecting text
+* advanced sed::               Advanced ‘sed’: cycles and buffers
+* Examples::                   Some sample scripts
+* Limitations::                Limitations and (non-)limitations of GNU ‘sed’
+* Other Resources::            Other resources for learning about ‘sed’
+* Reporting Bugs::             Reporting bugs
+* GNU Free Documentation License:: Copying and sharing this manual
+* Concept Index::              A menu with all the topics in this manual.
+* Command and Option Index::   A menu with all ‘sed’ commands and
+                               command-line options.
+File: sed.info,  Node: Introduction,  Next: Invoking sed,  Prev: Top,  Up: Top
+1 Introduction
+**************
+‘sed’ is a stream editor.  A stream editor is used to perform basic text
+transformations on an input stream (a file or input from a pipeline).
+While in some ways similar to an editor which permits scripted edits
+(such as ‘ed’), ‘sed’ works by making only one pass over the input(s),
+and is consequently more efficient.  But it is ‘sed’’s ability to filter
+text in a pipeline which particularly distinguishes it from other types
+of editors.
+File: sed.info,  Node: Invoking sed,  Next: sed scripts,  Prev: Introduction,  Up: Top
+2 Running sed
+*************
+This chapter covers how to run ‘sed’.  Details of ‘sed’ scripts and
+individual ‘sed’ commands are discussed in the next chapter.
+* Menu:
+* Overview::
+* Command-Line Options::
+* Exit status::
+File: sed.info,  Node: Overview,  Next: Command-Line Options,  Up: Invoking sed
+2.1 Overview
+============
+Normally ‘sed’ is invoked like this:
+     sed SCRIPT INPUTFILE...
+   For example, to change every ‘hello’ to ‘world’ in the file
+‘input.txt’:
+     sed 's/hello/world/g' input.txt > output.txt
+   Without the ‘g’ (global) modifier, ‘sed’ affects only the first
+instance per line.
+   If you do not specify INPUTFILE, or if INPUTFILE is ‘-’, ‘sed’
+filters the contents of the standard input.  The following commands are
+equivalent:
+     sed 's/hello/world/g' input.txt > output.txt
+     sed 's/hello/world/g' < input.txt > output.txt
+     cat input.txt | sed 's/hello/world/g' - > output.txt
+   ‘sed’ writes output to standard output.  Use ‘-i’ to edit files
+in-place instead of printing to standard output.  See also the ‘W’ and
+‘s///w’ commands for writing output to other files.  The following
+command modifies ‘file.txt’ and does not produce any output:
+     sed -i 's/hello/world/' file.txt
+   By default ‘sed’ prints all processed input (except input that has
+been modified/deleted by commands such as ‘d’).  Use ‘-n’ to suppress
+output, and the ‘p’ command to print specific lines.  The following
+command prints only line 45 of the input file:
+     sed -n '45p' file.txt
+   ‘sed’ treats multiple input files as one long stream.  The following
+example prints the first line of the first file (‘one.txt’) and the last
+line of the last file (‘three.txt’).  Use ‘-s’ to reverse this behavior.
+     sed -n  '1p ; $p' one.txt two.txt three.txt
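Editorial sketch (not part of the changeset): the one-stream behavior described above can be compared with -s as follows. This assumes GNU sed, since -s is a GNU extension, and the three scratch files and their contents are invented for the demonstration.

```shell
# Hypothetical scratch files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a1\na2\n' > "$tmp/one.txt"
printf 'b1\nb2\n' > "$tmp/two.txt"
printf 'c1\nc2\n' > "$tmp/three.txt"

# Default: the three files form one long stream, so the addresses
# '1' and '$' each match exactly once across all input.
joined=$(sed -n '1p ; $p' "$tmp/one.txt" "$tmp/two.txt" "$tmp/three.txt")

# With -s (GNU extension), '1' and '$' are evaluated separately per file.
separate=$(sed -s -n '1p ; $p' "$tmp/one.txt" "$tmp/two.txt" "$tmp/three.txt")

printf 'one stream:\n%s\nseparate files:\n%s\n' "$joined" "$separate"
rm -rf "$tmp"
```

Without -s only the first line of one.txt and the last line of three.txt are printed; with -s the first and last line of every file are printed.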
+   Without ‘-e’ or ‘-f’ options, ‘sed’ uses the first non-option
+parameter as the SCRIPT, and the following non-option parameters as
+input files.  If ‘-e’ or ‘-f’ options are used to specify a SCRIPT, all
+non-option parameters are taken as input files.  Options ‘-e’ and ‘-f’
+can be combined, and can appear multiple times (in which case the final
+effective SCRIPT will be concatenation of all the individual SCRIPTs).
+   The following examples are equivalent:
+     sed 's/hello/world/' input.txt > output.txt
+     sed -e 's/hello/world/' input.txt > output.txt
+     sed --expression='s/hello/world/' input.txt > output.txt
+     echo 's/hello/world/' > myscript.sed
+     sed -f myscript.sed input.txt > output.txt
+     sed --file=myscript.sed input.txt > output.txt
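Editorial sketch (not part of the changeset): a small check that the invocation forms listed above really produce the same output, and that -f and -e fragments are concatenated in command-line order. The file names and sample input are assumptions made for the demonstration.

```shell
# Hypothetical scratch directory and sample data for the demonstration.
tmp=$(mktemp -d)
printf 'hello hello\nfoo\n' > "$tmp/input.txt"
echo 's/hello/world/' > "$tmp/myscript.sed"

# Three ways to pass the same one-command script.
plain=$(sed 's/hello/world/' "$tmp/input.txt")
via_e=$(sed -e 's/hello/world/' "$tmp/input.txt")
via_f=$(sed -f "$tmp/myscript.sed" "$tmp/input.txt")

# -f and -e combined: the effective script is the concatenation of both
# fragments, in the order given on the command line.
combined=$(sed -f "$tmp/myscript.sed" -e 's/foo/bar/' "$tmp/input.txt")

printf '%s\n---\n%s\n' "$plain" "$combined"
rm -rf "$tmp"
```

Because the script has no g flag, only the first hello on the line is replaced, which matches the note about the g modifier earlier in the section.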
+File: sed.info,  Node: Command-Line Options,  Next: Exit status,  Prev: Overview,  Up: Invoking sed
+2.2 Command-Line Options
+========================
+The full format for invoking ‘sed’ is:
+     sed OPTIONS... [SCRIPT] [INPUTFILE...]
+   ‘sed’ may be invoked with the following command-line options:
+‘--version’
+     Print out the version of ‘sed’ that is being run and a copyright
+     notice, then exit.
+‘--help’
+     Print a usage message briefly summarizing these command-line
+     options and the bug-reporting address, then exit.
+‘-n’
+‘--quiet’
+‘--silent’
+     By default, ‘sed’ prints out the pattern space at the end of each
+     cycle through the script (*note How ‘sed’ works: Execution Cycle.).
+     These options disable this automatic printing, and ‘sed’ only
+     produces output when explicitly told to via the ‘p’ command.
+‘--debug’
+     Print the input sed program in canonical form, and annotate program
+     execution.
+          $ echo 1 | sed '\%1%s21232'
+          3
+          $ echo 1 | sed --debug '\%1%s21232'
+          SED PROGRAM:
+            /1/ s/1/3/
+          INPUT:   'STDIN' line 1
+          PATTERN: 1
+          COMMAND: /1/ s/1/3/
+          PATTERN: 3
+          END-OF-CYCLE:
+          3
+‘-e SCRIPT’
+‘--expression=SCRIPT’
+     Add the commands in SCRIPT to the set of commands to be run while
+     processing the input.
+‘-f SCRIPT-FILE’
+‘--file=SCRIPT-FILE’
+     Add the commands contained in the file SCRIPT-FILE to the set of
+     commands to be run while processing the input.
+‘-i[SUFFIX]’
+‘--in-place[=SUFFIX]’
+     This option specifies that files are to be edited in-place.  GNU
+     ‘sed’ does this by creating a temporary file and sending output to
+     this file rather than to the standard output.(1).
+     This option implies ‘-s’.
+     When the end of the file is reached, the temporary file is renamed
+     to the output file’s original name.  The extension, if supplied, is
+     used to modify the name of the old file before renaming the
+     temporary file, thereby making a backup copy(2)).
+     This rule is followed: if the extension doesn’t contain a ‘*’, then
+     it is appended to the end of the current filename as a suffix; if
+     the extension does contain one or more ‘*’ characters, then _each_
+     asterisk is replaced with the current filename.  This allows you to
+     add a prefix to the backup file, instead of (or in addition to) a
+     suffix, or even to place backup copies of the original files into
+     another directory (provided the directory already exists).
+     If no extension is supplied, the original file is overwritten
+     without making a backup.
+     Because ‘-i’ takes an optional argument, it should not be followed
+     by other short options:
+     ‘sed -Ei '...' FILE’
+          Same as ‘-E -i’ with no backup suffix - ‘FILE’ will be edited
+          in-place without creating a backup.
+     ‘sed -iE '...' FILE’
+          This is equivalent to ‘--in-place=E’, creating ‘FILEE’ as
+          backup of ‘FILE’
+     Be cautious of using ‘-n’ with ‘-i’: the former disables automatic
+     printing of lines and the latter changes the file in-place without
+     a backup.  Used carelessly (and without an explicit ‘p’ command),
+     the output file will be empty:
+          # WRONG USAGE: 'FILE' will be truncated.
+          sed -ni 's/foo/bar/' FILE
+‘-l N’
+‘--line-length=N’
+     Specify the default line-wrap length for the ‘l’ command.  A length
+     of 0 (zero) means to never wrap long lines.  If not specified, it
+     is taken to be 70.
+‘--posix’
+     GNU ‘sed’ includes several extensions to POSIX sed.  In order to
+     simplify writing portable scripts, this option disables all the
+     extensions that this manual documents, including additional
+     commands.  Most of the extensions accept ‘sed’ programs that are
+     outside the syntax mandated by POSIX, but some of them (such as the
+     behavior of the ‘N’ command described in *note Reporting Bugs::)
+     actually violate the standard.  If you want to disable only the
+     latter kind of extension, you can set the ‘POSIXLY_CORRECT’
+     variable to a non-empty value.
+‘-b’
+‘--binary’
+     This option is available on every platform, but is only effective
+     where the operating system makes a distinction between text files
+     and binary files.  When such a distinction is made--as is the case
+     for MS-DOS, Windows, Cygwin--text files are composed of lines
+     separated by a carriage return _and_ a line feed character, and
+     ‘sed’ does not see the ending CR. When this option is specified,
+     ‘sed’ will open input files in binary mode, thus not requesting
+     this special processing and considering lines to end at a line
+     feed.
+‘--follow-symlinks’
+     This option is available only on platforms that support symbolic
+     links and has an effect only if option ‘-i’ is specified.  In this
+     case, if the file that is specified on the command line is a
+     symbolic link, ‘sed’ will follow the link and edit the ultimate
+     destination of the link.  The default behavior is to break the
+     symbolic link, so that the link destination will not be modified.
+‘-E’
+‘-r’
+‘--regexp-extended’
+     Use extended regular expressions rather than basic regular
+     expressions.  Extended regexps are those that ‘egrep’ accepts; they
+     can be clearer because they usually have fewer backslashes.
+     Historically this was a GNU extension, but the ‘-E’ extension has
+     since been added to the POSIX standard
+     (http://austingroupbugs.net/view.php?id=528), so use ‘-E’ for
+     portability.  GNU sed has accepted ‘-E’ as an undocumented option
+     for years, and *BSD seds have accepted ‘-E’ for years as well, but
+     scripts that use ‘-E’ might not port to other older systems.  *Note
+     Extended regular expressions: ERE syntax.
+‘-s’
+‘--separate’
+     By default, ‘sed’ will consider the files specified on the command
+     line as a single continuous long stream.  This GNU ‘sed’ extension
+     allows the user to consider them as separate files: range addresses
+     (such as ‘/abc/,/def/’) are not allowed to span several files, line
+     numbers are relative to the start of each file, ‘$’ refers to the
+     last line of each file, and files invoked from the ‘R’ commands are
+     rewound at the start of each file.
+‘--sandbox’
+     In sandbox mode, ‘e/w/r’ commands are rejected - programs
+     containing them will be aborted without being run.  Sandbox mode
+     ensures ‘sed’ operates only on the input files designated on the
+     command line, and cannot run external programs.
+‘-u’
+‘--unbuffered’
+     Buffer both input and output as minimally as practical.  (This is
+     particularly useful if the input is coming from the likes of ‘tail
+     -f’, and you wish to see the transformed output as soon as
+     possible.)
+‘-z’
+‘--null-data’
+‘--zero-terminated’
+     Treat the input as a set of lines, each terminated by a zero byte
+     (the ASCII ‘NUL’ character) instead of a newline.  This option can
+     be used with commands like ‘sort -z’ and ‘find -print0’ to process
+     arbitrary file names.
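Editorial sketch (not part of the changeset): the -z option from the list above can be observed with a NUL-separated input in which one record contains a newline. This assumes GNU sed; the sample records are invented, and tr is used only to make the NUL terminators visible.

```shell
# Two NUL-terminated records; the second one contains an embedded newline.
# With -z each record is one "line", so ^ and $ anchor at record boundaries
# and the newline inside the second record is ordinary data.
nul_out=$(printf 'a b\0c\nd\0' | sed -z 's/^/[/; s/$/]/' | tr '\0' '|')

printf '%s\n' "$nul_out"
```

Each record is bracketed as a whole, which is the property that makes -z suitable for file names produced by commands such as find -print0.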
+   If no ‘-e’, ‘-f’, ‘--expression’, or ‘--file’ options are given on
+the command-line, then the first non-option argument on the command line
+is taken to be the SCRIPT to be executed.
+   If any command-line parameters remain after processing the above,
+these parameters are interpreted as the names of input files to be
+processed.  A file name of ‘-’ refers to the standard input stream.  The
+standard input will be processed if no file names are specified.
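Editorial sketch (not part of the changeset): the -E option described in the list above changes only the regular expression syntax, not what can be matched. The sample string is an assumption; the basic-syntax variant is written with POSIX constructs only.

```shell
# Basic regular expressions: groups need backslashes, and "one or more"
# is spelled out as aa* to stay within POSIX BRE.
bre=$(echo 'aaa bbb' | sed 's/\(aa*\) \(bb*\)/\2 \1/')

# Extended regular expressions (-E): bare parentheses and + are operators.
ere=$(echo 'aaa bbb' | sed -E 's/(a+) (b+)/\2 \1/')

printf '%s\n%s\n' "$bre" "$ere"
```

Both commands swap the two words; the -E form simply carries fewer backslashes.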
+   ---------- Footnotes ----------
+   (1) This applies to commands such as ‘=’, ‘a’, ‘c’, ‘i’, ‘l’, ‘p’.
+You can still write to the standard output by using the ‘w’ or ‘W’
+commands together with the ‘/dev/stdout’ special file
+   (2) Note that GNU ‘sed’ creates the backup file whether or not any
+output is actually changed.
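Editorial sketch (not part of the changeset): the -i behavior described in this section, including the backup suffix and the interaction with -n, can be tried on a scratch file. This assumes GNU sed (other implementations parse -i differently); the file name FILE and its contents are invented for the demonstration.

```shell
# Hypothetical scratch file for the demonstration.
tmp=$(mktemp -d)
printf 'foo\nbaz\n' > "$tmp/FILE"

# -i.bak: edit in place and keep the original contents as FILE.bak.
sed -i.bak 's/foo/bar/' "$tmp/FILE"
edited=$(cat "$tmp/FILE")
backup=$(cat "$tmp/FILE.bak")

# -n together with -i: auto-print is off, so only lines printed
# explicitly (here via the p flag of s) end up in the file.
sed -n -i 's/bar/qux/p' "$tmp/FILE"
kept=$(cat "$tmp/FILE")

printf 'edited:\n%s\nbackup:\n%s\nafter -n -i:\n%s\n' "$edited" "$backup" "$kept"
rm -rf "$tmp"
```

The last step drops the unmatched line baz from the file, which is the data-loss scenario the warning about combining -n and -i refers to. Writing -n and -i as separate options avoids the optional-argument pitfall noted for -iE.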
+File: sed.info,  Node: Exit status,  Prev: Command-Line Options,  Up: Invoking sed
+2.3 Exit status
+===============
+An exit status of zero indicates success, and a nonzero value indicates
+failure.  GNU ‘sed’ returns the following exit status error values:
+0
+     Successful completion.
+1
+     Invalid command, invalid syntax, invalid regular expression or a
+     GNU ‘sed’ extension command used with ‘--posix’.
+2
+     One or more of the input file specified on the command line could
+     not be opened (e.g.  if a file is not found, or read permission is
+     denied).  Processing continued with other files.
+4
+     An I/O error, or a serious processing error during runtime, GNU
+     ‘sed’ aborted immediately.
+   Additionally, the commands ‘q’ and ‘Q’ can be used to terminate ‘sed’
+with a custom exit code value (this is a GNU ‘sed’ extension):
+     $ echo | sed 'Q42' ; echo $?
+     42
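Editorial sketch (not part of the changeset): the exit statuses described in this section can be collected in a script as below. This assumes GNU sed; the path /nonexistent/input.txt is hypothetical and is assumed not to exist, and k is used as an example of a letter that is not a sed command.

```shell
q_status=0; Q_status=0; syntax_status=0; missing_status=0

# q prints the pattern space (auto-print is on) and exits with the code.
q_out=$(echo line | sed 'q5') || q_status=$?

# Q exits with the code without printing the pattern space.
Q_out=$(echo line | sed 'Q42') || Q_status=$?

# An unknown command is rejected when the script is parsed.
sed 'k' </dev/null 2>/dev/null || syntax_status=$?

# An input file that cannot be opened (hypothetical, assumed-missing path).
sed -n 'p' /nonexistent/input.txt 2>/dev/null || missing_status=$?

echo "q: out='$q_out' status=$q_status"
echo "Q: out='$Q_out' status=$Q_status"
echo "syntax error status=$syntax_status, missing file status=$missing_status"
```

The `|| var=$?` form records each status without aborting a caller that runs under `set -e`.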
+File: sed.info,  Node: sed scripts,  Next: sed addresses,  Prev: Invoking sed,  Up: Top
+3 ‘sed’ scripts
+***************
+* Menu:
+* sed script overview::      ‘sed’ script overview
+* sed commands list::        ‘sed’ commands summary
+* The "s" Command::          ‘sed’’s Swiss Army Knife
+* Common Commands::          Often used commands
+* Other Commands::           Less frequently used commands
+* Programming Commands::     Commands for ‘sed’ gurus
+* Extended Commands::        Commands specific of GNU ‘sed’
+* Multiple commands syntax:: Extension for easier scripting
+File: sed.info,  Node: sed script overview,  Next: sed commands list,  Up: sed scripts
+3.1 ‘sed’ script overview
+=========================
+A ‘sed’ program consists of one or more ‘sed’ commands, passed in by one
+or more of the ‘-e’, ‘-f’, ‘--expression’, and ‘--file’ options, or the
+first non-option argument if zero of these options are used.  This
+document will refer to “the” ‘sed’ script; this is understood to mean
+the in-order concatenation of all of the SCRIPTs and SCRIPT-FILEs passed
+in.  *Note Overview::.
+   ‘sed’ commands follow this syntax:
+     [addr]X[options]
+   X is a single-letter ‘sed’ command.  ‘[addr]’ is an optional line
+address.  If ‘[addr]’ is specified, the command X will be executed only
+on the matched lines.  ‘[addr]’ can be a single line number, a regular
+expression, or a range of lines (*note sed addresses::).  Additional
+‘[options]’ are used for some ‘sed’ commands.
+   The following example deletes lines 30 to 35 in the input.  ‘30,35’
+is an address range.  ‘d’ is the delete command:
+     sed '30,35d' input.txt > output.txt
+   The following example prints all input until a line starting with the
+string ‘foo’ is found.  If such line is found, ‘sed’ will terminate with
+exit status 42.  If such line was not found (and no other error
+occurred), ‘sed’ will exit with status 0.  ‘/^foo/’ is a
+regular-expression address.  ‘q’ is the quit command.  ‘42’ is the
+command option.
+     sed '/^foo/q42' input.txt > output.txt
+   Commands within a SCRIPT or SCRIPT-FILE can be separated by
+semicolons (‘;’) or newlines (ASCII 10).  Multiple scripts can be
+specified with ‘-e’ or ‘-f’ options.
+   The following examples are all equivalent.  They perform two ‘sed’
+operations: deleting any lines matching the regular expression ‘/^foo/’,
+and replacing all occurrences of the string ‘hello’ with ‘world’:
+     sed '/^foo/d ; s/hello/world/g' input.txt > output.txt
+     sed -e '/^foo/d' -e 's/hello/world/g' input.txt > output.txt
+     echo '/^foo/d' > script.sed
+     echo 's/hello/world/g' >> script.sed
+     sed -f script.sed input.txt > output.txt
+     echo 's/hello/world/g' > script2.sed
+     sed -e '/^foo/d' -f script2.sed input.txt > output.txt
+   Commands ‘a’, ‘c’, ‘i’, due to their syntax, cannot be followed by
+semicolons working as command separators and thus should be terminated
+with newlines or be placed at the end of a SCRIPT or SCRIPT-FILE.
+Commands can also be preceded with optional non-significant whitespace
+characters.  *Note Multiple commands syntax::.
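Editorial sketch (not part of the changeset): the [addr]X[options] structure and the command-separator rules from this section, applied to a small invented input. This assumes GNU sed, because the one-line form of a is a GNU extension.

```shell
# Invented sample input for the demonstration.
input=$(printf 'one\nfoo two\nhello hello\nfoo three\n')

# Two commands separated by ';': a regex address with d, then s with g.
semi=$(printf '%s\n' "$input" | sed '/^foo/d ; s/hello/world/g')

# The same script split across two -e options.
multi_e=$(printf '%s\n' "$input" | sed -e '/^foo/d' -e 's/hello/world/g')

# A numeric address range: delete lines 2 through 3.
range=$(printf '%s\n' "$input" | sed '2,3d')

# 'a' cannot be terminated by ';', so it gets its own -e fragment
# (fragments are joined with newlines).
appended=$(printf 'x\n' | sed -e 'a added' -e 's/x/y/')

printf '%s\n---\n%s\n---\n%s\n' "$semi" "$range" "$appended"
```

Writing `a added; s/x/y/` in a single fragment would instead append the literal text `added; s/x/y/`, which is why the manual asks for a newline after a, c and i.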
+File: sed.info,  Node: sed commands list,  Next: The "s" Command,  Prev: sed script overview,  Up: sed scripts
+3.2 ‘sed’ commands summary
+==========================
+The following commands are supported in GNU ‘sed’.  Some are standard
+POSIX commands, while other are GNU extensions.  Details and examples
+for each command are in the following sections.  (Mnemonics) are shown
+in parentheses.
+‘a\’
+‘TEXT’
+     Append TEXT after a line.
+‘a TEXT’
+     Append TEXT after a line (alternative syntax).
+‘b LABEL’
+     Branch unconditionally to LABEL.  The LABEL may be omitted, in
+     which case the next cycle is started.
+‘c\’
+‘TEXT’
+     Replace (change) lines with TEXT.
+‘c TEXT’
+     Replace (change) lines with TEXT (alternative syntax).
+‘d’
+     Delete the pattern space; immediately start next cycle.
+‘D’
+     If pattern space contains newlines, delete text in the pattern
+     space up to the first newline, and restart cycle with the resultant
+     pattern space, without reading a new line of input.
+     If pattern space contains no newline, start a normal new cycle as
+     if the ‘d’ command was issued.
+‘e’
+     Executes the command that is found in pattern space and replaces
+     the pattern space with the output; a trailing newline is
+     suppressed.
+‘e COMMAND’
+     Executes COMMAND and sends its output to the output stream.  The
+     command can run across multiple lines, all but the last ending with
+     a back-slash.
+‘F’
+     (filename) Print the file name of the current input file (with a
+     trailing newline).
+‘g’
+     Replace the contents of the pattern space with the contents of the
+     hold space.
+‘G’
+     Append a newline to the contents of the pattern space, and then
+     append the contents of the hold space to that of the pattern space.
+‘h’
+     (hold) Replace the contents of the hold space with the contents of
+     the pattern space.
+‘H’
+     Append a newline to the contents of the hold space, and then append
+     the contents of the pattern space to that of the hold space.
+âi\â
+âTEXTâ
+     insert TEXT before a line.
+âi TEXTâ
+     insert TEXT before a line (alternative syntax).
+âlâ
+     Print the pattern space in an unambiguous form.
+ânâ
+     (next) If auto-print is not disabled, print the pattern space,
+     then, regardless, replace the pattern space with the next line of
+     input.  If there is no more input then âsedâ exits without
+     processing any more commands.
+âNâ
+     Add a newline to the pattern space, then append the next line of
+     input to the pattern space.  If there is no more input then âsedâ
+     exits without processing any more commands.
+âpâ
+     Print the pattern space.
+âPâ
+     Print the pattern space, up to the first <newline>.
+‘q[EXIT-CODE]’
+     (quit) Exit ‘sed’ without processing any more commands or input.
+‘Q[EXIT-CODE]’
+     (quit) This command is the same as ‘q’, but will not print the
+     contents of pattern space.  Like ‘q’, it provides the ability to
+     return an exit code to the caller.
+‘r filename’
+     Reads file FILENAME.
+‘R filename’
+     Queue a line of FILENAME to be read and inserted into the output
+     stream at the end of the current cycle, or when the next input line
+     is read.
+‘s/REGEXP/REPLACEMENT/[FLAGS]’
+     (substitute) Match the regular-expression against the content of
+     the pattern space.  If found, replace matched string with
+     REPLACEMENT.
+‘t LABEL’
+     (test) Branch to LABEL only if there has been a successful
+     ‘s’ubstitution since the last input line was read or conditional
+     branch was taken.  The LABEL may be omitted, in which case the next
+     cycle is started.
+‘T LABEL’
+     (test) Branch to LABEL only if there have been no successful
+     ‘s’ubstitutions since the last input line was read or conditional
+     branch was taken.  The LABEL may be omitted, in which case the next
+     cycle is started.
+‘v [VERSION]’
+     (version) This command does nothing, but makes ‘sed’ fail if GNU
+     ‘sed’ extensions are not supported, or if the requested version is
+     not available.
+‘w filename’
+     Write the pattern space to FILENAME.
+‘W filename’
+     Write to the given filename the portion of the pattern space up to
+     the first newline.
+‘x’
+     Exchange the contents of the hold and pattern spaces.
+‘y/SOURCE-CHARS/DEST-CHARS/’
+     Transliterate any characters in the pattern space which match any
+     of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
+‘z’
+     (zap) This command empties the content of pattern space.
+‘#’
+     A comment, until the next newline.
+‘{ CMD ; CMD ... }’
+     Group several commands together.
+‘=’
+     Print the current input line number (with a trailing newline).
+‘: LABEL’
+     Specify the location of LABEL for branch commands (‘b’, ‘t’, ‘T’).
+File: sed.info,  Node: The "s" Command,  Next: Common Commands,  Prev: sed commands list,  Up: sed scripts
+3.3 The ‘s’ Command
+===================
+The ‘s’ command (as in substitute) is probably the most important in
+‘sed’ and has a lot of different options.  The syntax of the ‘s’ command
+is ‘s/REGEXP/REPLACEMENT/FLAGS’.
+   Its basic concept is simple: the ‘s’ command attempts to match the
+pattern space against the supplied regular expression REGEXP; if the
+match is successful, then that portion of the pattern space which was
+matched is replaced with REPLACEMENT.
+   For details about REGEXP syntax *note Regular Expression Addresses:
+Regexp Addresses.
+   The REPLACEMENT can contain ‘\N’ (N being a number from 1 to 9,
+inclusive) references, which refer to the portion of the match which is
+contained between the Nth ‘\(’ and its matching ‘\)’.  Also, the
+REPLACEMENT can contain unescaped ‘&’ characters which reference the
+whole matched portion of the pattern space.
+   The ‘/’ characters may be uniformly replaced by any other single
+character within any given ‘s’ command.  The ‘/’ character (or whatever
+other character is used in its stead) can appear in the REGEXP or
+REPLACEMENT only if it is preceded by a ‘\’ character.
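The back-references, the ‘&’ character and the alternative delimiter described above can be tried with a short sketch like the following. The sample strings and variable names are made up for illustration and are not part of the manual.

```shell
# \1 and \2 refer to the first and second \( ... \) groups of the REGEXP.
swapped=$(echo 'hello world' | sed 's/\(hello\) \(world\)/\2 \1/')

# An unescaped & stands for the whole matched portion.
wrapped=$(echo 'hello world' | sed 's/world/[&]/')

# Using "," as the delimiter avoids escaping the "/" characters of a path.
path=$(echo '/usr/bin/sed' | sed 's,/usr/bin,/opt,')

printf '%s\n' "$swapped" "$wrapped" "$path"
```

With these inputs the three lines printed should be "world hello", "hello [world]" and "/opt/sed".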
+   Finally, as a GNU ‘sed’ extension, you can include a special sequence
+made of a backslash and one of the letters ‘L’, ‘l’, ‘U’, ‘u’, or ‘E’.
+The meaning is as follows:
+‘\L’
+     Turn the replacement to lowercase until a ‘\U’ or ‘\E’ is found,
+‘\l’
+     Turn the next character to lowercase,
+‘\U’
+     Turn the replacement to uppercase until a ‘\L’ or ‘\E’ is found,
+‘\u’
+     Turn the next character to uppercase,
+‘\E’
+     Stop case conversion started by ‘\L’ or ‘\U’.
+   When the ‘g’ flag is being used, case conversion does not propagate
+from one occurrence of the regular expression to another.  For example,
+when the following command is executed with ‘a-b-’ in pattern space:
+     s/\(b\?\)-/x\u\1/g
+the output is ‘axxB’.  When replacing the first ‘-’, the ‘\u’ sequence
+only affects the empty replacement of ‘\1’.  It does not affect the ‘x’
+character that is added to pattern space when replacing ‘b-’ with ‘xB’.
+   On the other hand, ‘\l’ and ‘\u’ do affect the remainder of the
+replacement text if they are followed by an empty substitution.  With
+‘a-b-’ in pattern space, the following command:
+     s/\(b\?\)-/\u\1x/g
+will replace ‘-’ with ‘X’ (uppercase) and ‘b-’ with ‘Bx’.  If this
+behavior is undesirable, you can prevent it by adding a ‘\E’ sequence
+(after ‘\1’ in this case).
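As a small illustration of the case-conversion sequences (a sketch that assumes GNU ‘sed’, since these sequences are GNU extensions; the input strings are invented):

```shell
# \U uppercases everything that follows, here the whole match.
upper=$(echo 'hello world' | sed 's/.*/\U&/')

# \L lowercases everything that follows.
lower=$(echo 'HELLO' | sed 's/.*/\L&/')

# \E ends the \U conversion; \u then affects only the next character.
mixed=$(echo 'hello world' | sed 's/\(.*\) \(.*\)/\U\1\E \u\2/')

# With the g flag, \u is applied afresh to each matched word.
title=$(echo 'hello world' | sed 's/[^ ][^ ]*/\u&/g')

printf '%s\n' "$upper" "$lower" "$mixed" "$title"
```

This should print "HELLO WORLD", "hello", "HELLO World" and "Hello World".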
+   To include a literal ‘\’, ‘&’, or newline in the final replacement,
+be sure to precede the desired ‘\’, ‘&’, or newline in the REPLACEMENT
+with a ‘\’.
+   The ‘s’ command can be followed by zero or more of the following
+FLAGS:
+‘g’
+     Apply the replacement to _all_ matches to the REGEXP, not just the
+     first.
+‘NUMBER’
+     Only replace the NUMBERth match of the REGEXP.
+     Note: the POSIX standard does not specify what should happen when
+     you mix the ‘g’ and NUMBER modifiers, and currently there is no
+     widely agreed upon meaning across ‘sed’ implementations.  For GNU
+     ‘sed’, the interaction is defined to be: ignore matches before the
+     NUMBERth, and then match and replace all matches from the NUMBERth
+     on.
+‘p’
+     If the substitution was made, then print the new pattern space.
+     Note: when both the ‘p’ and ‘e’ options are specified, the relative
+     ordering of the two produces very different results.  In general,
+     ‘ep’ (evaluate then print) is what you want, but operating the
+     other way round can be useful for debugging.  For this reason, the
+     current version of GNU ‘sed’ interprets specially the presence of
+     ‘p’ options both before and after ‘e’, printing the pattern space
+     before and after evaluation, while in general flags for the ‘s’
+     command show their effect just once.  This behavior, although
+     documented, might change in future versions.
+‘w FILENAME’
+     If the substitution was made, then write out the result to the
+     named file.  As a GNU ‘sed’ extension, two special values of
+     FILENAME are supported: ‘/dev/stderr’, which writes the result to
+     the standard error, and ‘/dev/stdout’, which writes to the standard
+     output.(1)
+‘e’
+     This command allows one to pipe input from a shell command into
+     pattern space.  If a substitution was made, the command that is
+     found in pattern space is executed and pattern space is replaced
+     with its output.  A trailing newline is suppressed; results are
+     undefined if the command to be executed contains a NUL character.
+     This is a GNU ‘sed’ extension.
+‘I’
+‘i’
+     The ‘I’ modifier to regular-expression matching is a GNU extension
+     which makes ‘sed’ match REGEXP in a case-insensitive manner.
+‘M’
+‘m’
+     The ‘M’ modifier to regular-expression matching is a GNU ‘sed’
+     extension which directs GNU ‘sed’ to match the regular expression
+     in “multi-line” mode.  The modifier causes ‘^’ and ‘$’ to match
+     respectively (in addition to the normal behavior) the empty string
+     after a newline, and the empty string before a newline.  There are
+     special character sequences (‘\`’ and ‘\'’) which always match the
+     beginning or the end of the buffer.  In addition, the period
+     character does not match a newline character in multi-line mode.
+   ---------- Footnotes ----------
+   (1) This is equivalent to ‘p’ unless the ‘-i’ option is being used.
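The flags listed above can be compared side by side. The following is only a sketch with made-up inputs; the ‘2g’, ‘I’ and ‘M’ cases rely on the GNU ‘sed’ behavior described in this section.

```shell
first=$(echo 'aaaa' | sed 's/a/b/')      # no flag: first match only
all=$(echo 'aaaa' | sed 's/a/b/g')       # g: every match
second=$(echo 'aaaa' | sed 's/a/b/2')    # NUMBER: only the 2nd match
from2=$(echo 'aaaa' | sed 's/a/b/2g')    # GNU: the 2nd match and all later ones

# p prints the pattern space only where a substitution was made (-n
# suppresses the automatic printing of every line).
printed=$(printf 'x\ny\n' | sed -n 's/y/Y/p')

# I makes the match case-insensitive.
nocase=$(echo 'Hello' | sed 's/hello/bye/I')

# M lets ^ match after the embedded newline that N put into pattern space.
multi=$(printf 'ab\ncd\n' | sed 'N;s/^./X/Mg')

printf '%s\n' "$first" "$all" "$second" "$from2" "$printed" "$nocase" "$multi"
```

The first six results should be "baaa", "bbbb", "abaa", "abbb", "Y" and "bye"; the last one spans two lines, "Xb" and "Xd".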
+File: sed.info,  Node: Common Commands,  Next: Other Commands,  Prev: The "s" Command,  Up: sed scripts
+3.4 Often-Used Commands
+=======================
+If you use ‘sed’ at all, you will quite likely want to know these
+commands.
+‘#’
+     [No addresses allowed.]
+     The ‘#’ character begins a comment; the comment continues until the
+     next newline.
+     If you are concerned about portability, be aware that some
+     implementations of ‘sed’ (which are not POSIX conforming) may only
+     support a single one-line comment, and then only when the very
+     first character of the script is a ‘#’.
+     Warning: if the first two characters of the ‘sed’ script are ‘#n’,
+     then the ‘-n’ (no-autoprint) option is forced.  If you want to put
+     a comment in the first line of your script and that comment begins
+     with the letter ‘n’ and you do not want this behavior, then be sure
+     to either use a capital ‘N’, or place at least one space before the
+     ‘n’.
+‘q [EXIT-CODE]’
+     Exit ‘sed’ without processing any more commands or input.
+     Example: stop after printing the second line:
+          $ seq 3 | sed 2q
+          1
+          2
+     This command accepts only one address.  Note that the current
+     pattern space is printed if auto-print is not disabled with the
+     ‘-n’ option.  The ability to return an exit code from the ‘sed’
+     script is a GNU ‘sed’ extension.
+     See also the GNU ‘sed’ extension ‘Q’ command which quits silently
+     without printing the current pattern space.
+‘d’
+     Delete the pattern space; immediately start next cycle.
+     Example: delete the second input line:
+          $ seq 3 | sed 2d
+          1
+          3
+‘p’
+     Print out the pattern space (to the standard output).  This command
+     is usually only used in conjunction with the ‘-n’ command-line
+     option.
+     Example: print only the second input line:
+          $ seq 3 | sed -n 2p
+          2
+‘n’
+     If auto-print is not disabled, print the pattern space, then,
+     regardless, replace the pattern space with the next line of input.
+     If there is no more input then ‘sed’ exits without processing any
+     more commands.
+     This command is useful to skip lines (e.g.  process every Nth
+     line).
+     Example: perform substitution on every 3rd line (i.e.  two ‘n’
+     commands skip two lines):
+          $ seq 6 | sed 'n;n;s/./x/'
+          1
+          2
+          x
+          4
+          5
+          x
+     GNU ‘sed’ provides an extension address syntax of FIRST~STEP to
+     achieve the same result:
+          $ seq 6 | sed '0~3s/./x/'
+          1
+          2
+          x
+          4
+          5
+          x
+‘{ COMMANDS }’
+     A group of commands may be enclosed between ‘{’ and ‘}’ characters.
+     This is particularly useful when you want a group of commands to be
+     triggered by a single address (or address-range) match.
+     Example: perform substitution then print the second input line:
+          $ seq 3 | sed -n '2{s/2/X/ ; p}'
+          X
+File: sed.info,  Node: Other Commands,  Next: Programming Commands,  Prev: Common Commands,  Up: sed scripts
+3.5 Less Frequently-Used Commands
+=================================
+Though perhaps less frequently used than those in the previous section,
+some very small yet useful ‘sed’ scripts can be built with these
+commands.
+‘y/SOURCE-CHARS/DEST-CHARS/’
+     Transliterate any characters in the pattern space which match any
+     of the SOURCE-CHARS with the corresponding character in DEST-CHARS.
+     Example: transliterate ‘a-j’ into ‘0-9’:
+          $ echo hello world | sed 'y/abcdefghij/0123456789/'
+          74llo worl3
+     (The ‘/’ characters may be uniformly replaced by any other single
+     character within any given ‘y’ command.)
+     Instances of the ‘/’ (or whatever other character is used in its
+     stead), ‘\’, or newlines can appear in the SOURCE-CHARS or
+     DEST-CHARS lists, provided that each instance is escaped by a ‘\’.
+     The SOURCE-CHARS and DEST-CHARS lists _must_ contain the same
+     number of characters (after de-escaping).
+     See the ‘tr’ command from GNU coreutils for similar functionality.
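A brief sketch of the ‘y’ command, including the two ways of handling a ‘/’ in the character lists mentioned above (the inputs are invented for illustration):

```shell
# Map each lowercase letter to the uppercase letter at the same position.
upper=$(echo 'hello' | sed 'y/abcdefghijklmnopqrstuvwxyz/ABCDEFGHIJKLMNOPQRSTUVWXYZ/')

# Either pick another delimiter ...
comma_delim=$(echo 'a/b/c' | sed 'y,/,|,')

# ... or escape the "/" with a backslash inside the list.
escaped=$(echo 'a/b/c' | sed 'y/\//|/')

printf '%s\n' "$upper" "$comma_delim" "$escaped"
```

This should print "HELLO" followed by "a|b|c" twice.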
+‘a TEXT’
+     Append TEXT after a line.  This is a GNU extension to the
+     standard ‘a’ command - see below for details.
+     Example: Add ‘hello’ after the second line:
+          $ seq 3 | sed '2a hello'
+          1
+          2
+          hello
+          3
+     Leading whitespace after the ‘a’ command is ignored.  The text to
+     add is read until the end of the line.
+‘a\’
+‘TEXT’
+     Append TEXT after a line.
+     Example: Add ‘hello’ after the second line (⊣ indicates printed
+     output lines):
+          $ seq 3 | sed '2a\
+          hello'
+          ⊣1
+          ⊣2
+          ⊣hello
+          ⊣3
+     The ‘a’ command queues the lines of text which follow this command
+     (each but the last ending with a ‘\’, which are removed from the
+     output) to be output at the end of the current cycle, or when the
+     next input line is read.
+     As a GNU extension, this command accepts two addresses.
+     Escape sequences in TEXT are processed, so you should use ‘\\’ in
+     TEXT to print a single backslash.
+     The commands resume after the last line without a backslash (‘\’) -
+     ‘world’ in the following example:
+          $ seq 3 | sed '2a\
+          hello\
+          world
+          3s/./X/'
+          ⊣1
+          ⊣2
+          ⊣hello
+          ⊣world
+          ⊣X
+     As a GNU extension, the ‘a’ command and TEXT can be separated into
+     two ‘-e’ parameters, enabling easier scripting:
+          $ seq 3 | sed -e '2a\' -e hello
+          1
+          2
+          hello
+          3
+          $ sed -e '2a\' -e "$VAR"
+‘i TEXT’
+     Insert TEXT before a line.  This is a GNU extension to the standard
+     ‘i’ command - see below for details.
+     Example: Insert ‘hello’ before the second line:
+          $ seq 3 | sed '2i hello'
+          1
+          hello
+          2
+          3
+     Leading whitespace after the ‘i’ command is ignored.  The text to
+     add is read until the end of the line.
+‘i\’
+‘TEXT’
+     Immediately output the lines of text which follow this command.
+     Example: Insert ‘hello’ before the second line (⊣ indicates printed
+     output lines):
+          $ seq 3 | sed '2i\
+          hello'
+          ⊣1
+          ⊣hello
+          ⊣2
+          ⊣3
+     As a GNU extension, this command accepts two addresses.
+     Escape sequences in TEXT are processed, so you should use ‘\\’ in
+     TEXT to print a single backslash.
+     The commands resume after the last line without a backslash (‘\’) -
+     ‘world’ in the following example:
+          $ seq 3 | sed '2i\
+          hello\
+          world
+          s/./X/'
+          ⊣X
+          ⊣hello
+          ⊣world
+          ⊣X
+          ⊣X
+     As a GNU extension, the ‘i’ command and TEXT can be separated into
+     two ‘-e’ parameters, enabling easier scripting:
+          $ seq 3 | sed -e '2i\' -e hello
+          1
+          hello
+          2
+          3
+          $ sed -e '2i\' -e "$VAR"
+‘c TEXT’
+     Replaces the line(s) with TEXT.  This is a GNU extension to the
+     standard ‘c’ command - see below for details.
+     Example: Replace the 2nd to 9th lines with the word ‘hello’:
+          $ seq 10 | sed '2,9c hello'
+          1
+          hello
+          10
+     Leading whitespace after the ‘c’ command is ignored.  The text to
+     add is read until the end of the line.
+‘c\’
+‘TEXT’
+     Delete the lines matching the address or address-range, and output
+     the lines of text which follow this command.
+     Example: Replace 2nd to 4th lines with the words ‘hello’ and
+     ‘world’ (⊣ indicates printed output lines):
+          $ seq 5 | sed '2,4c\
+          hello\
+          world'
+          ⊣1
+          ⊣hello
+          ⊣world
+          ⊣5
+     If no addresses are given, each line is replaced.
+     A new cycle is started after this command is done, since the
+     pattern space will have been deleted.  In the following example,
+     the ‘c’ starts a new cycle and the substitution command is not
+     performed on the replaced text:
+          $ seq 3 | sed '2c\
+          hello
+          s/./X/'
+          ⊣X
+          ⊣hello
+          ⊣X
+     As a GNU extension, the ‘c’ command and TEXT can be separated into
+     two ‘-e’ parameters, enabling easier scripting:
+          $ seq 3 | sed -e '2c\' -e hello
+          1
+          hello
+          3
+          $ sed -e '2c\' -e "$VAR"
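The two-parameter ‘-e’ form makes it easy to take TEXT from a shell variable. The sketch below runs the ‘a’, ‘i’ and ‘c’ variants side by side; it assumes GNU ‘sed’ (the split form is a GNU extension), and the value given to ‘VAR’ is only an example.

```shell
VAR='hello'

# Append after, insert before, and replace line 2 with the text in $VAR.
appended=$(seq 3 | sed -e '2a\' -e "$VAR")
inserted=$(seq 3 | sed -e '2i\' -e "$VAR")
changed=$(seq 3 | sed -e '2c\' -e "$VAR")

# Show each result on one line for easier comparison.
for result in "$appended" "$inserted" "$changed"; do
  echo "$result" | tr '\n' ' '
  echo
done
```

The three results should read "1 2 hello 3", "1 hello 2 3" and "1 hello 3" respectively.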
+‘=’
+     Print out the current input line number (with a trailing newline).
+          $ printf '%s\n' aaa bbb ccc | sed =
+          1
+          aaa
+          2
+          bbb
+          3
+          ccc
+     As a GNU extension, this command accepts two addresses.
+‘l N’
+     Print the pattern space in an unambiguous form: non-printable
+     characters (and the ‘\’ character) are printed in C-style escaped
+     form; long lines are split, with a trailing ‘\’ character to
+     indicate the split; the end of each line is marked with a ‘$’.
+     N specifies the desired line-wrap length; a length of 0 (zero)
+     means to never wrap long lines.  If omitted, the default as
+     specified on the command line is used.  The N parameter is a GNU
+     ‘sed’ extension.
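To see what the unambiguous form looks like, the following sketch feeds ‘l’ a line containing a tab and a backslash (an invented input):

```shell
# The input is: a, TAB, b, backslash, c.  -n suppresses the normal output
# so that only the output of l is shown.
visible=$(printf 'a\tb\\c\n' | sed -n 'l')
printf '%s\n' "$visible"
```

The tab should appear as ‘\t’, the backslash as ‘\\’, and the end of the line should be marked with ‘$’.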
+‘r FILENAME’
+     Reads file FILENAME.  Example:
+          $ seq 3 | sed '2r/etc/hostname'
+          1
+          2
+          fencepost.gnu.org
+          3
+     Queue the contents of FILENAME to be read and inserted into the
+     output stream at the end of the current cycle, or when the next
+     input line is read.  Note that if FILENAME cannot be read, it is
+     treated as if it were an empty file, without any error indication.
+     As a GNU ‘sed’ extension, the special value ‘/dev/stdin’ is
+     supported for the file name, which reads the contents of the
+     standard input.
+     As a GNU extension, this command accepts two addresses.  The file
+     will then be reread and inserted on each of the addressed lines.
+     As a GNU ‘sed’ extension, the ‘r’ command accepts a zero address,
+     inserting a file _before_ the first line of the input *note Adding
+     a header to multiple files::.
+‘w FILENAME’
+     Write the pattern space to FILENAME.  As a GNU ‘sed’ extension, two
+     special values of FILENAME are supported: ‘/dev/stderr’, which
+     writes the result to the standard error, and ‘/dev/stdout’, which
+     writes to the standard output.(1)
+     The file will be created (or truncated) before the first input line
+     is read; all ‘w’ commands (including instances of the ‘w’ flag on
+     successful ‘s’ commands) which refer to the same FILENAME are
+     output without closing and reopening the file.
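A minimal sketch of the ‘w’ command, writing selected lines to a temporary file. The use of ‘mktemp’ and the variable names are choices made for this illustration; the ‘/dev/stdout’ case relies on the GNU ‘sed’ extension described above.

```shell
tmp=$(mktemp)

# Write only the lines containing 2 or 4 to the file.  The file name runs
# to the end of the -e argument, so nothing may follow it there.
seq 5 | sed -n -e "/[24]/w $tmp"
saved=$(cat "$tmp")
rm -f "$tmp"

# GNU sed treats /dev/stdout specially and writes to standard output.
echoed=$(seq 3 | sed -n '2w /dev/stdout')

printf '%s\n' "$saved" "$echoed"
```

The file should end up holding the lines "2" and "4", and the second command should emit "2".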
+‘D’
+     If pattern space contains no newline, start a normal new cycle as
+     if the ‘d’ command was issued.  Otherwise, delete text in the
+     pattern space up to the first newline, and restart cycle with the
+     resultant pattern space, without reading a new line of input.
+‘N’
+     Add a newline to the pattern space, then append the next line of
+     input to the pattern space.  If there is no more input then ‘sed’
+     exits without processing any more commands.
+     When ‘-z’ is used, a zero byte (the ascii ‘NUL’ character) is added
+     between the lines (instead of a new line).
+     By default ‘sed’ does not terminate if there is no “next” input
+     line.  This is a GNU extension which can be disabled with
+     ‘--posix’.  *Note N command on the last line: N_command_last_line.
+‘P’
+     Print out the portion of the pattern space up to the first newline.
+‘h’
+     Replace the contents of the hold space with the contents of the
+     pattern space.
+‘H’
+     Append a newline to the contents of the hold space, and then append
+     the contents of the pattern space to that of the hold space.
+‘g’
+     Replace the contents of the pattern space with the contents of the
+     hold space.
+‘G’
+     Append a newline to the contents of the pattern space, and then
+     append the contents of the hold space to that of the pattern space.
+‘x’
+     Exchange the contents of the hold and pattern spaces.
+   ---------- Footnotes ----------
+   (1) This is equivalent to ‘p’ unless the ‘-i’ option is being used.
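The hold-space commands are easier to follow with a concrete run. The sketch below shows two common idioms, chosen for illustration rather than taken from this section:

```shell
# Reverse the order of lines:
#   1!G  on every line but the first, append the hold space (the lines
#        seen so far, already reversed) after the current line
#   h    save the result back to the hold space
#   $p   on the last line, print the accumulated text
reversed=$(seq 3 | sed -n '1!G;h;$p')

# Double-space: the hold space is empty, so G only appends a newline.
doubled=$(seq 2 | sed G)

printf '%s\n' "$reversed"
printf '%s\n' "$doubled"
```

The first command should yield 3, 2, 1 on separate lines; the second should yield "1", an empty line, then "2" (the command substitution drops the final empty line).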
+File: sed.info,  Node: Programming Commands,  Next: Extended Commands,  Prev: Other Commands,  Up: sed scripts
+3.6 Commands for ‘sed’ gurus
+============================
+In most cases, use of these commands indicates that you are probably
+better off programming in something like ‘awk’ or Perl.  But
+occasionally one is committed to sticking with ‘sed’, and these commands
+can enable one to write quite convoluted scripts.
+‘: LABEL’
+     [No addresses allowed.]
+     Specify the location of LABEL for branch commands.  In all other
+     respects, a no-op.
+‘b LABEL’
+     Unconditionally branch to LABEL.  The LABEL may be omitted, in
+     which case the next cycle is started.
+‘t LABEL’
+     Branch to LABEL only if there has been a successful ‘s’ubstitution
+     since the last input line was read or conditional branch was taken.
+     The LABEL may be omitted, in which case the next cycle is started.
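As a sketch of how labels and branches combine into loops (the label names ‘again’ and ‘more’ and the inputs are arbitrary choices for this example):

```shell
# t loop: keep collapsing double spaces until a pass makes no substitution.
squeezed=$(echo 'a    b   c' | sed -e ':again' -e 's/  / /' -e 't again')

# b loop: append lines with N until the last line is reached, then replace
# the embedded newlines with commas.
joined=$(seq 3 | sed -e ':more' -e 'N' -e '$!b more' -e 's/\n/,/g')

printf '%s\n' "$squeezed" "$joined"
```

This should print "a b c" and "1,2,3".  Each command is passed in its own ‘-e’ so that the label ends at the end of the argument.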
+File: sed.info,  Node: Extended Commands,  Next: Multiple commands syntax,  Prev: Programming Commands,  Up: sed scripts
+3.7 Commands Specific to GNU ‘sed’
+==================================
+These commands are specific to GNU ‘sed’, so you must use them with care
+and only when you are sure that hindering portability is not evil.  They
+allow you to check for GNU ‘sed’ extensions or to do tasks that are
+required quite often, yet are unsupported by standard ‘sed’s.
+‘e [COMMAND]’
+     This command allows one to pipe input from a shell command into
+     pattern space.  Without parameters, the ‘e’ command executes the
+     command that is found in pattern space and replaces the pattern
+     space with the output; a trailing newline is suppressed.
+     If a parameter is specified, instead, the ‘e’ command interprets it
+     as a command and sends its output to the output stream.  The
+     command can run across multiple lines, all but the last ending with
+     a backslash.
+     In both cases, the results are undefined if the command to be
+     executed contains a NUL character.
+     Note that, unlike the ‘r’ command, the output of the command will
+     be printed immediately; the ‘r’ command instead delays the output
+     to the end of the current cycle.
+‘F’
+     Print out the file name of the current input file (with a trailing
+     newline).
+‘Q [EXIT-CODE]’
+     This command accepts only one address.
+     This command is the same as ‘q’, but will not print the contents of
+     pattern space.  Like ‘q’, it provides the ability to return an exit
+     code to the caller.
+     This command can be useful because the only alternative ways to
+     accomplish this apparently trivial function are to use the ‘-n’
+     option (which can unnecessarily complicate your script) or
+     resorting to the following snippet, which wastes time by reading
+     the whole file without any visible effect:
+          :eat
+          $d       # Quit silently on the last line
+          N        # Read another line, silently
+          g        # Overwrite pattern space each time to save memory
+          b eat
+‘R FILENAME’
+     Queue a line of FILENAME to be read and inserted into the output
+     stream at the end of the current cycle, or when the next input line
+     is read.  Note that if FILENAME cannot be read, or if its end is
+     reached, no line is appended, without any error indication.
+     As with the ‘r’ command, the special value ‘/dev/stdin’ is
+     supported for the file name, which reads a line from the standard
+     input.
+‘T LABEL’
+     Branch to LABEL only if there have been no successful
+     ‘s’ubstitutions since the last input line was read or conditional
+     branch was taken.  The LABEL may be omitted, in which case the next
+     cycle is started.
+‘v VERSION’
+     This command does nothing, but makes ‘sed’ fail if GNU ‘sed’
+     extensions are not supported, simply because other versions of
+     ‘sed’ do not implement it.  In addition, you can specify the
+     version of ‘sed’ that your script requires, such as ‘4.0.5’.  The
+     default is ‘4.0’ because that is the first version that implemented
+     this command.
+     This command enables all GNU extensions even if ‘POSIXLY_CORRECT’
+     is set in the environment.
+‘W FILENAME’
+     Write to the given filename the portion of the pattern space up to
+     the first newline.  Everything said under the ‘w’ command about
+     file handling holds here too.
+‘z’
+     This command empties the content of pattern space.  It is usually
+     the same as ‘s/.*//’, but is more efficient and works in the
+     presence of invalid multibyte sequences in the input stream.  POSIX
+     mandates that such sequences are _not_ matched by ‘.’, so that
+     there is no portable way to clear ‘sed’'s buffers in the middle of
+     the script in most multibyte locales (including UTF-8 locales).
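The difference between ‘q’ and ‘Q’, the exit code, and the ‘z’ command can be checked with a sketch like this one. It assumes GNU ‘sed’, and the exit code 7 is an arbitrary value picked for the example.

```shell
with_q=$(seq 5 | sed 3q)   # q prints line 3 before quitting
with_Q=$(seq 5 | sed 3Q)   # Q quits without printing line 3

# Q (like q) can hand an exit code back to the caller.
status=0
seq 5 | sed 2Q7 >/dev/null || status=$?

# z empties the pattern space, so line 2 is printed as an empty line.
zapped=$(seq 3 | sed 2z)

echo "$with_q" | tr '\n' ' '; echo
echo "$with_Q" | tr '\n' ' '; echo
echo "exit status: $status"
printf '%s\n' "$zapped"
```

The first two results should be "1 2 3" and "1 2", the exit status should be 7, and the last result should be "1", an empty line, and "3".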
+File: sed.info,  Node: Multiple commands syntax,  Prev: Extended Commands,  Up: sed scripts
+3.8 Multiple commands syntax
+============================
+There are several methods to specify multiple commands in a ‘sed’
+program.
+   Using newlines is most natural when running a sed script from a file
+(using the ‘-f’ option).
+   On the command line, all ‘sed’ commands may be separated by newlines.
+Alternatively, you may specify each command as an argument to an ‘-e’
+option:
+     $ seq 6 | sed '1d
+     3d
+     5d'
+     2
+     4
+     6
+     $ seq 6 | sed -e 1d -e 3d -e 5d
+     2
+     4
+     6
+   A semicolon (‘;’) may be used to separate most simple commands:
+     $ seq 6 | sed '1d;3d;5d'
+     2
+     4
+     6
+   The ‘{’,‘}’,‘b’,‘t’,‘T’,‘:’ commands can be separated with a
+semicolon (this is a non-portable GNU ‘sed’ extension).
+     $ seq 4 | sed '{1d;3d}'
+     2
+     4
+     $ seq 6 | sed '{1d;3d};5d'
+     2
+     4
+     6
+   Labels used in ‘b’,‘t’,‘T’,‘:’ commands are read until a semicolon.
+Leading and trailing whitespace is ignored.  In the examples below the
+label is ‘x’.  The first example works with GNU ‘sed’.  The second is a
+portable equivalent.  For more information about branching and labels
+*note Branching and flow control::.
+     $ seq 3 | sed '/1/b x ; s/^/=/ ; :x ; 3d'
+     1
+     =2
+     $ seq 3 | sed -e '/1/bx' -e 's/^/=/' -e ':x' -e '3d'
+     1
+     =2
+3.8.1 Commands Requiring a newline
+----------------------------------
+The following commands cannot be separated by a semicolon and require a
+newline:
+‘a’,‘c’,‘i’ (append/change/insert)
+     All characters following ‘a’,‘c’,‘i’ commands are taken as the text
+     to append/change/insert.  Using a semicolon leads to undesirable
+     results:
+          $ seq 2 | sed '1aHello ; 2d'
+          1
+          Hello ; 2d
+          2
+     Separate the commands using ‘-e’ or a newline:
+          $ seq 2 | sed -e 1aHello -e 2d
+          1
+          Hello
+          $ seq 2 | sed '1aHello
+          2d'
+          1
+          Hello
+     Note that specifying the text to add (‘Hello’) immediately after
+     ‘a’,‘c’,‘i’ is itself a GNU ‘sed’ extension.  A portable,
+     POSIX-compliant alternative is:
+          $ seq 2 | sed '1a\
+          Hello
+          2d'
+          1
+          Hello
+‘#’ (comment)
+     All characters following ‘#’ until the next newline are ignored.
+          $ seq 3 | sed '# this is a comment ; 2d'
+          1
+          2
+          3
+          $ seq 3 | sed '# this is a comment
+          2d'
+          1
+          3
+‘r’,‘R’,‘w’,‘W’ (reading and writing files)
+     The ‘r’,‘R’,‘w’,‘W’ commands parse the filename until end of the
+     line.  If whitespace, comments or semicolons are found, they will
+     be included in the filename, leading to unexpected results:
+          $ seq 2 | sed '1w hello.txt ; 2d'
+          1
+          2
+          $ ls -log
+          total 4
+          -rw-rw-r-- 1 2 Jan 23 23:03 hello.txt ; 2d
+          $ cat 'hello.txt ; 2d'
+          1
+     Note that ‘sed’ silently ignores read/write errors in
+     ‘r’,‘R’,‘w’,‘W’ commands (such as missing files).  In the following
+     example, ‘sed’ tries to read a file named ‘hello.txt ; N’.  The
+     file is missing, and the error is silently ignored:
+          $ echo x | sed '1rhello.txt ; N'
+          x
+‘e’ (command execution)
+     Any characters following the ‘e’ command until the end of the line
+     will be sent to the shell.  If whitespace, comments or semicolons
+     are found, they will be included in the shell command, leading to
+     unexpected results:
+          $ echo a | sed '1e touch foo#bar'
+          a
+          $ ls -1
+          foo#bar
+          $ echo a | sed '1e touch foo ; s/a/b/'
+          sh: 1: s/a/b/: not found
+          a
+‘s///[we]’ (substitute with ‘e’ or ‘w’ flags)
+     In a substitution command, the ‘w’ flag writes the substitution
+     result to a file, and the ‘e’ flag executes the substitution result
+     as a shell command.  As with the ‘r/R/w/W/e’ commands, these must
+     be terminated with a newline.  If whitespace, comments or
+     semicolons are found, they will be included in the shell command or
+     filename, leading to unexpected results:
+          $ echo a | sed 's/a/b/w1.txt#foo'
+          b
+          $ ls -1
+          1.txt#foo
+File: sed.info,  Node: sed addresses,  Next: sed regular expressions,  Prev: sed scripts,  Up: Top
+4 Addresses: selecting lines
+****************************
+* Menu:
+* Addresses overview::                Addresses overview
+* Numeric Addresses::                 selecting lines by numbers
+* Regexp Addresses::                  selecting lines by text matching
+* Range Addresses::                   selecting a range of lines
+* Zero Address::                      Using address ‘0’
+File: sed.info,  Node: Addresses overview,  Next: Numeric Addresses,  Up: sed addresses
+4.1 Addresses overview
+======================
+Addresses determine on which line(s) the ‘sed’ command will be executed.
+The following command replaces the first occurrence of ‘hello’ with
+‘world’ only on line 144:
+     sed '144s/hello/world/' input.txt > output.txt
+   If no address is specified, the command is performed on all lines.
+The following command replaces ‘hello’ with ‘world’, targeting every
+line of the input file.  However, note that it modifies only the first
+instance of ‘hello’ on each line.  Use the ‘g’ modifier to affect every
+instance on each affected line.
+     sed 's/hello/world/' input.txt > output.txt
+   Addresses can contain regular expressions to match lines based on
+content instead of line numbers.  The following command replaces ‘hello’
+with ‘world’ only on lines containing the string ‘apple’:
+     sed '/apple/s/hello/world/' input.txt > output.txt
+   An address range is specified with two addresses separated by a comma
+(‘,’).  Addresses can be numeric, regular expressions, or a mix of both.
+The following command replaces ‘hello’ with ‘world’ only on lines 4 to
+17 (inclusive):
+     sed '4,17s/hello/world/' input.txt > output.txt
+   Appending the ‘!’ character to the end of an address specification
+(before the command letter) negates the sense of the match.  That is, if
+the ‘!’ character follows an address or an address range, then only
+lines which do _not_ match the addresses will be selected.  The
+following command replaces ‘hello’ with ‘world’ only on lines _not_
+containing the string ‘apple’:
+     sed '/apple/!s/hello/world/' input.txt > output.txt
+   The following command replaces ‘hello’ with ‘world’ only on lines 1
+to 3 and from line 18 to the last line of the input file (i.e.
+excluding lines 4 to 17):
+     sed '4,17!s/hello/world/' input.txt > output.txt
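The address forms above can be tried without any input file by generating lines with ‘seq’ or ‘printf’.  This is only a sketch; the inputs are invented.

```shell
one=$(seq 6 | sed -n '3p')          # a single line number
range=$(seq 6 | sed -n '2,4p')      # an inclusive range
negated=$(seq 6 | sed -n '2,4!p')   # every line outside the range
mixed=$(seq 6 | sed -n '/2/,4p')    # a regexp start with a numeric end

# A regexp address limits the substitution to the matching lines.
regex=$(printf 'apple hello\npear hello\n' | sed '/apple/s/hello/world/')

for result in "$one" "$range" "$negated" "$mixed"; do
  echo "$result" | tr '\n' ' '
  echo
done
printf '%s\n' "$regex"
```

The numeric cases should give "3", "2 3 4", "1 5 6" and "2 3 4"; in the last case only the ‘apple’ line has ‘hello’ replaced.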
+File: sed.info,  Node: Numeric Addresses,  Next: Regexp Addresses,  Prev: Addresses overview,  Up: sed addresses
+.2 Selecting lines by numbers
+==============================
+Addresses in a ‘sed’ script can be in any of the following forms:
+‘NUMBER’
+     Specifying a line number will match only that line in the input.
+     (Note that ‘sed’ counts lines continuously across all input files
+     unless ‘-i’ or ‘-s’ options are specified.)
+‘$’
+     This address matches the last line of the last file of input, or
+     the last line of each file when the ‘-i’ or ‘-s’ options are
+     specified.
+‘FIRST~STEP’
+     This GNU extension matches every STEPth line starting with line
+     FIRST.  In particular, lines will be selected when there exists a
+     non-negative N such that the current line-number equals FIRST + (N
+     * STEP).  Thus, one would use ‘1~2’ to select the odd-numbered
+     lines and ‘0~2’ for even-numbered lines; to pick every third line
+     starting with the second, ‘2~3’ would be used; to pick every fifth
+     line starting with the tenth, use ‘10~5’; and ‘50~0’ is just an
+     obscure way of saying ‘50’.
+     The following commands demonstrate the step address usage:
+          $ seq 10 | sed -n '0~4p'
+          4
+          8
+          $ seq 10 | sed -n '1~3p'
+          1
+          4
+          7
+          10
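+   The numeric forms can also be combined.  The sketch below is a
+minimal example, assuming GNU ‘sed’ (for ‘FIRST~STEP’) and the ‘seq’
+utility; the variable names are made up for illustration:

```shell
# '$' selects the last line of the input.
last=$(seq 10 | sed -n '$p')

# FIRST~STEP (GNU extension): every third line starting with line 2.
step=$(seq 10 | sed -n '2~3p')

# A numeric address and '$' together form a range: line 8 to the end.
tail3=$(seq 10 | sed -n '8,$p')

printf '%s\n' "$last" "$step" "$tail3"
```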
+File: sed.info,  Node: Regexp Addresses,  Next: Range Addresses,  Prev: Numeric Addresses,  Up: sed addresses
+4.3 selecting lines by text matching
+====================================
+GNU ‘sed’ supports the following regular expression addresses.  The
+default regular expression is *note Basic Regular Expression (BRE): BRE
+syntax.  If ‘-E’ or ‘-r’ options are used, the regular expression should
+be in *note Extended Regular Expression (ERE): ERE syntax.
+*Note BRE vs ERE::.
+‘/REGEXP/’
+     This will select any line which matches the regular expression
+     REGEXP.  If REGEXP itself includes any ‘/’ characters, each must be
+     escaped by a backslash (‘\’).
+     The following command prints lines in ‘/etc/passwd’ which end with
+     ‘bash’(1):
+          sed -n '/bash$/p' /etc/passwd
+     The empty regular expression ‘//’ repeats the last regular
+     expression match (the same holds if the empty regular expression is
+     passed to the ‘s’ command).  Note that modifiers to regular
+     expressions are evaluated when the regular expression is compiled,
+     thus it is invalid to specify them together with the empty regular
+     expression.
+‘\%REGEXP%’
+     (The ‘%’ may be replaced by any other single character.)
+     This also matches the regular expression REGEXP, but allows one to
+     use a different delimiter than ‘/’.  This is particularly useful if
+     the REGEXP itself contains a lot of slashes, since it avoids the
+     tedious escaping of every ‘/’.  If REGEXP itself includes any
+     delimiter characters, each must be escaped by a backslash (‘\’).
+     The following commands are equivalent.  They print lines which
+     start with ‘/home/alice/documents/’:
+          sed -n '/^\/home\/alice\/documents\//p'
+          sed -n '\%^/home/alice/documents/%p'
+          sed -n '\;^/home/alice/documents/;p'
+‘/REGEXP/I’
+‘\%REGEXP%I’
+     The ‘I’ modifier to regular-expression matching is a GNU extension
+     which causes the REGEXP to be matched in a case-insensitive manner.
+     In many other programming languages, a lower case ‘i’ is used for
+     case-insensitive regular expression matching.  However, in ‘sed’
+     the ‘i’ is used for the insert command (*note insert command::).
+     Observe the difference between the following examples.
+     In this example, ‘/b/I’ is the address: regular expression with ‘I’
+     modifier.  ‘d’ is the delete command:
+          $ printf "%s\n" a b c | sed '/b/Id'
+          a
+          c
+     Here, ‘/b/’ is the address: a regular expression.  ‘i’ is the
+     insert command.  ‘d’ is the value to insert.  A line with ‘d’ is
+     then inserted above the matched line:
+          $ printf "%s\n" a b c | sed '/b/id'
+          a
+          d
+          b
+          c
+‘/REGEXP/M’
+‘\%REGEXP%M’
+     The ‘M’ modifier to regular-expression matching is a GNU ‘sed’
+     extension which directs GNU ‘sed’ to match the regular expression
+     in ‘multi-line’ mode.  The modifier causes ‘^’ and ‘$’ to match
+     respectively (in addition to the normal behavior) the empty string
+     after a newline, and the empty string before a newline.  There are
+     special character sequences (‘\`’ and ‘\'’) which always match the
+     beginning or the end of the buffer.  In addition, the period
+     character does not match a new-line character in multi-line mode.
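+   The effect of the ‘I’ and ‘M’ modifiers on an address can be seen in
+a short sketch.  It assumes GNU ‘sed’, since both modifiers are GNU
+extensions; the sample input and variable names are invented for the
+example:

```shell
# 'I' makes the address match regardless of case.
ci=$(printf '%s\n' Apple pear APPLE | sed -n '/apple/Ip')

# 'N' joins two lines, so the pattern space is "a<newline>b".
# Without 'M', '^' and '$' only match at the buffer edges: no match.
plain=$(printf 'a\nb\n' | sed -n 'N;/^b$/p')

# With 'M', '^' also matches right after the embedded newline.
multi=$(printf 'a\nb\n' | sed -n 'N;/^b$/Mp')

printf 'ci=[%s]\nplain=[%s]\nmulti=[%s]\n' "$ci" "$plain" "$multi"
```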
+   Regex addresses operate on the content of the current pattern space.
+If the pattern space is changed (for example with ‘s///’ command) the
+regular expression matching will operate on the changed text.
+   In the following example, automatic printing is disabled with ‘-n’.
+The ‘s/2/X/’ command changes lines containing ‘2’ to ‘X’.  The command
+‘/[0-9]/p’ matches lines with digits and prints them.  Because the
+second line is changed before the ‘/[0-9]/’ regex, it will not match and
+will not be printed:
+     $ seq 3 | sed -n 's/2/X/ ; /[0-9]/p'
+     1
+     3
+   ---------- Footnotes ----------
+   (1) There are of course many other ways to do the same, e.g.
+     grep 'bash$' /etc/passwd
+     awk -F: '$7 == "/bin/bash"' /etc/passwd
+File: sed.info,  Node: Range Addresses,  Next: Zero Address,  Prev: Regexp Addresses,  Up: sed addresses
+4.4 Range Addresses
+===================
+An address range can be specified by specifying two addresses separated
+by a comma (‘,’).  An address range matches lines starting from where
+the first address matches, and continues until the second address
+matches (inclusively):
+     $ seq 10 | sed -n '4,6p'
+     4
+     5
+     6
+   If the second address is a REGEXP, then checking for the ending match
+will start with the line _following_ the line which matched the first
+address: a range will always span at least two lines (except of course
+if the input stream ends).
+     $ seq 10 | sed -n '4,/[0-9]/p'
+     4
+     5
+   If the second address is a NUMBER less than (or equal to) the line
+matching the first address, then only the one line is matched:
+     $ seq 10 | sed -n '4,1p'
+     4
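+   Both ends of a range may also be regular expressions.  The following
+sketch uses made-up ‘BEGIN’ and ‘END’ marker lines (an assumption for
+illustration) to print or delete a delimited block:

```shell
# Hypothetical input with a block delimited by marker lines.
input=$(printf '%s\n' a BEGIN b END c)

# Print the block, including both marker lines.
block=$(printf '%s\n' "$input" | sed -n '/BEGIN/,/END/p')

# Delete the block and keep everything else.
rest=$(printf '%s\n' "$input" | sed '/BEGIN/,/END/d')

printf '%s\n---\n' "$block" "$rest"
```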
+   GNU ‘sed’ also supports some special two-address forms; all these are
+GNU extensions:
+‘0,/REGEXP/’
+     A line number of ‘0’ can be used in an address specification like
+     ‘0,/REGEXP/’ so that ‘sed’ will try to match REGEXP in the first
+     input line too.  In other words, ‘0,/REGEXP/’ is similar to
+     ‘1,/REGEXP/’, except that if ADDR2 matches the very first line of
+     input the ‘0,/REGEXP/’ form will consider it to end the range,
+     whereas the ‘1,/REGEXP/’ form will match the beginning of its range
+     and hence make the range span up to the _second_ occurrence of the
+     regular expression.
+     The following examples demonstrate the difference between starting
+     with address 1 and 0:
+          $ seq 10 | sed -n '1,/[0-9]/p'
+          1
+          2
+          $ seq 10 | sed -n '0,/[0-9]/p'
+          1
+‘ADDR1,+N’
+     Matches ADDR1 and the N lines following ADDR1.
+          $ seq 10 | sed -n '6,+2p'
+          6
+          7
+          8
+     ADDR1 can be a line number or a regular expression.
+‘ADDR1,~N’
+     Matches ADDR1 and the lines following ADDR1 until the next line
+     whose input line number is a multiple of N.  The following command
+     prints starting at line 6, until the next line which is a multiple
+     of 4 (i.e.  line 8):
+          $ seq 10 | sed -n '6,~4p'
+          6
+          7
+          8
+     ADDR1 can be a line number or a regular expression.
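+   One common use of the ‘0,/REGEXP/’ form is to limit a substitution
+to the first matching line, even when that is line 1.  The sketch below
+assumes GNU ‘sed’; the input and variable names are illustrative only:

```shell
# Three identical lines, so the first line already matches.
input=$(printf '%s\n' foo foo foo)

# 0,/foo/ ends the range on line 1: only that line is changed.
zero=$(printf '%s\n' "$input" | sed '0,/foo/s/foo/bar/')

# 1,/foo/ looks for the end from line 2 on: lines 1 and 2 are changed.
one=$(printf '%s\n' "$input" | sed '1,/foo/s/foo/bar/')

# ADDR1,+N with a regexp as ADDR1: the matching line and one more.
plus=$(seq 10 | sed -n '/^3$/,+1p')

printf '%s\n---\n' "$zero" "$one" "$plus"
```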
+File: sed.info,  Node: Zero Address,  Prev: Range Addresses,  Up: sed addresses
+4.5 Zero Address
+================
+As a GNU ‘sed’ extension, the ‘0’ address can be used in two cases:
+  1. In a regex range address such as ‘0,/REGEXP/’ (*note Zero Address
+     Regex Range::).
+  2. With the ‘r’ command, inserting a file before the first line (*note
+     Adding a header to multiple files::).
+   Note that these are the only places where the ‘0’ address makes
+sense; commands which are given the ‘0’ address in any other way will
+give an error.
+File: sed.info,  Node: sed regular expressions,  Next: advanced sed,  Prev: sed addresses,  Up: Top
+5 Regular Expressions: selecting text
+*************************************
+* Menu:
+* Regular Expressions Overview:: Overview of Regular expression in ‘sed’
+* BRE vs ERE::               Basic (BRE) and extended (ERE) regular expression
+                             syntax
+* BRE syntax::               Overview of basic regular expression syntax
+* ERE syntax::               Overview of extended regular expression syntax
+* Character Classes and Bracket Expressions::
+* regexp extensions::        Additional regular expression commands
+* Back-references and Subexpressions:: Back-references and Subexpressions
+* Escapes::                  Specifying special characters
+* Locale Considerations::    Multibyte characters and locale considerations
+File: sed.info,  Node: Regular Expressions Overview,  Next: BRE vs ERE,  Up: sed regular expressions
+5.1 Overview of regular expression in ‘sed’
+===========================================
+To know how to use ‘sed’, people should understand regular expressions
+(‘regexp’ for short).  A regular expression is a pattern that is matched
+against a subject string from left to right.  Most characters are
+‘ordinary’: they stand for themselves in a pattern, and match the
+corresponding characters.  Regular expressions in ‘sed’ are specified
+between two slashes.
+   The following command prints lines containing the string ‘hello’:
+     sed -n '/hello/p'
+   The above example is equivalent to this ‘grep’ command:
+     grep 'hello'
+   The power of regular expressions comes from the ability to include
+alternatives and repetitions in the pattern.  These are encoded in the
+pattern by the use of ‘special characters’, which do not stand for
+themselves but instead are interpreted in some special way.
+   The character ‘^’ (caret) in a regular expression matches the
+beginning of the line.  The character ‘.’ (dot) matches any single
+character.  The following ‘sed’ command matches and prints lines which
+start with the letter ‘b’, followed by any single character, followed by
+the letter ‘d’:
+     $ printf "%s\n" abode bad bed bit bid byte body | sed -n '/^b.d/p'
+     bad
+     bed
+     bid
+     body
+   The following sections explain the meaning and usage of special
+characters in regular expressions.
+File: sed.info,  Node: BRE vs ERE,  Next: BRE syntax,  Prev: Regular Expressions Overview,  Up: sed regular expressions
+5.2 Basic (BRE) and extended (ERE) regular expression
+=====================================================
+Basic and extended regular expressions are two variations on the syntax
+of the specified pattern.  Basic Regular Expression (BRE) syntax is the
+default in ‘sed’ (and similarly in ‘grep’).  Use the POSIX-specified
+‘-E’ option (‘-r’, ‘--regexp-extended’) to enable Extended Regular
+Expression (ERE) syntax.
+   In GNU ‘sed’, the only difference between basic and extended regular
+expressions is in the behavior of a few special characters: ‘?’, ‘+’,
+parentheses, braces (‘{}’), and ‘|’.
+   With basic (BRE) syntax, these characters do not have special meaning
+unless prefixed with a backslash (‘\’), while with extended (ERE) syntax
+it is reversed: these characters are special unless they are prefixed
+with a backslash (‘\’).
+Desired pattern           Basic (BRE) Syntax         Extended (ERE) Syntax
+--------------------------------------------------------------------------
+literal ‘+’ (plus         $ echo 'a+b=c' > foo       $ echo 'a+b=c' > foo
+sign)                     $ sed -n '/a+b/p' foo      $ sed -E -n '/a\+b/p' foo
+                          a+b=c                      a+b=c
+One or more ‘a’           $ echo aab > foo           $ echo aab > foo
+characters                $ sed -n '/a\+b/p' foo     $ sed -E -n '/a+b/p' foo
+followed by ‘b’           aab                        aab
+(plus sign as
+special
+meta-character)
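+   The same reversal applies to braces.  The following sketch mirrors
+the table above for an interval expression; the sample strings and
+variable names are assumptions made for the example:

```shell
# Interval "exactly three a's": braces escaped in BRE, bare in ERE.
bre_special=$(echo 'aaa' | sed -n '/^a\{3\}$/p')
ere_special=$(echo 'aaa' | sed -E -n '/^a{3}$/p')

# Literal braces: bare in BRE, escaped in ERE.
bre_literal=$(echo 'a{3}' | sed -n '/^a{3}$/p')
ere_literal=$(echo 'a{3}' | sed -E -n '/^a\{3\}$/p')

printf '%s\n' "$bre_special" "$ere_special" "$bre_literal" "$ere_literal"
```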
+File: sed.info,  Node: BRE syntax,  Next: ERE syntax,  Prev: BRE vs ERE,  Up: sed regular expressions
+5.3 Overview of basic regular expression syntax
+===============================================
+Here is a brief description of regular expression syntax as used in
+‘sed’.
+‘CHAR’
+     A single ordinary character matches itself.
+‘*’
+     Matches a sequence of zero or more instances of matches for the
+     preceding regular expression, which must be an ordinary character,
+     a special character preceded by ‘\’, a ‘.’, a grouped regexp (see
+     below), or a bracket expression.  As a GNU extension, a postfixed
+     regular expression can also be followed by ‘*’; for example, ‘a**’
+     is equivalent to ‘a*’.  POSIX 1003.1-2001 says that ‘*’ stands for
+     itself when it appears at the start of a regular expression or
+     subexpression, but many non-GNU implementations do not support this
+     and portable scripts should instead use ‘\*’ in these contexts.
+‘.’
+     Matches any character, including newline.
+‘^’
+     Matches the null string at beginning of the pattern space, i.e.
+     what appears after the circumflex must appear at the beginning of
+     the pattern space.
+     In most scripts, pattern space is initialized to the content of
+     each line (*note How ‘sed’ works: Execution Cycle.).  So, it is a
+     useful simplification to think of ‘^#include’ as matching only
+     lines where ‘#include’ is the first thing on the line; if there is
+     any preceding space, for example, the match fails.  This
+     simplification is valid as long as the original content of pattern
+     space is not modified, for example with an ‘s’ command.
+     ‘^’ acts as a special character only at the beginning of the
+     regular expression or subexpression (that is, after ‘\(’ or ‘\|’).
+     Portable scripts should avoid ‘^’ at the beginning of a
+     subexpression, though, as POSIX allows implementations that treat
+     ‘^’ as an ordinary character in that context.
+‘$’
+     It is the same as ‘^’, but refers to end of pattern space.  ‘$’
+     also acts as a special character only at the end of the regular
+     expression or subexpression (that is, before ‘\)’ or ‘\|’), and its
+     use at the end of a subexpression is not portable.
+‘[LIST]’
+‘[^LIST]’
+     Matches any single character in LIST: for example, ‘[aeiou]’
+     matches all vowels.  With a leading ‘^’, it matches any single
+     character not in LIST.  A list may include sequences like
+     ‘CHAR1-CHAR2’, which matches any character between (inclusive)
+     CHAR1 and CHAR2.  *Note Character Classes and Bracket
+     Expressions::.
+‘\+’
+     As ‘*’, but matches one or more.  It is a GNU extension.
+‘\?’
+     As ‘*’, but only matches zero or one.  It is a GNU extension.
+‘\{I\}’
+     As ‘*’, but matches exactly I sequences (I is a decimal integer;
+     for portability, keep it between 0 and 255 inclusive).
+‘\{I,J\}’
+     Matches between I and J, inclusive, sequences.
+‘\{I,\}’
+     Matches more than or equal to I sequences.
+‘\(REGEXP\)’
+     Groups the inner REGEXP as a whole; this is used to:
+        • Apply postfix operators, like ‘\(abcd\)*’: this will search
+          for zero or more whole sequences of ‘abcd’, while ‘abcd*’
+          would search for ‘abc’ followed by zero or more occurrences of
+          ‘d’.  Note that support for ‘\(abcd\)*’ is required by POSIX
+          1003.1-2001, but many non-GNU implementations do not support
+          it and hence it is not universally portable.
+        • Use back references (see below).
+‘REGEXP1\|REGEXP2’
+     Matches either REGEXP1 or REGEXP2.  Use parentheses to use complex
+     alternative regular expressions.  The matching process tries each
+     alternative in turn, from left to right, and the first one that
+     succeeds is used.  It is a GNU extension.
+‘REGEXP1REGEXP2’
+     Matches the concatenation of REGEXP1 and REGEXP2.  Concatenation
+     binds more tightly than ‘\|’, ‘^’, and ‘$’, but less tightly than
+     the other regular expression operators.
+‘\DIGIT’
+     Matches the DIGIT-th ‘\(...\)’ parenthesized subexpression in the
+     regular expression.  This is called a ‘back reference’.
+     Subexpressions are implicitly numbered by counting occurrences of
+     ‘\(’ left-to-right.
+‘\n’
+     Matches the newline character.
+‘\CHAR’
+     Matches CHAR, where CHAR is one of ‘$’, ‘*’, ‘.’, ‘[’, ‘\’, or ‘^’.
+     Note that the only C-like backslash sequences that you can portably
+     assume to be interpreted are ‘\n’ and ‘\\’; in particular ‘\t’ is
+     not portable, and matches a ‘t’ under most implementations of
+     ‘sed’, rather than a tab character.
+   Note that the regular expression matcher is greedy, i.e., matches are
+attempted from left to right and, if two or more matches are possible
+starting at the same character, it selects the longest.
+Examples:
+‘abcdef’
+     Matches ‘abcdef’.
+‘a*b’
+     Matches zero or more ‘a’s followed by a single ‘b’.  For example,
+     ‘b’ or ‘aaaaab’.
+‘a\?b’
+     Matches ‘b’ or ‘ab’.
+‘a\+b\+’
+     Matches one or more ‘a’s followed by one or more ‘b’s: ‘ab’ is the
+     shortest possible match, but other examples are ‘aaaab’ or ‘abbbbb’
+     or ‘aaaaaabbbbbbb’.
+‘.*’
+‘.\+’
+     These two both match all the characters in a string; however, the
+     first matches every string (including the empty string), while the
+     second matches only strings containing at least one character.
+‘^main.*(.*)’
+     This matches a string starting with ‘main’, followed by an opening
+     and closing parenthesis.  The ‘n’, ‘(’ and ‘)’ need not be
+     adjacent.
+‘^#’
+     This matches a string beginning with ‘#’.
+‘\\$’
+     This matches a string ending with a single backslash.  The regexp
+     contains two backslashes for escaping.
+‘\$’
+     Instead, this matches a string consisting of a single dollar sign,
+     because it is escaped.
+‘[a-zA-Z0-9]’
+     In the C locale, this matches any ASCII letters or digits.
+‘[^ ‘<TAB>’]\+’
+     (Here ‘<TAB>’ stands for a single tab character.)  This matches a
+     string of one or more characters, none of which is a space or a
+     tab.  Usually this means a word.
+‘^\(.*\)\n\1$’
+     This matches a string consisting of two equal substrings separated
+     by a newline.
+‘.\{9\}A$’
+     This matches nine characters followed by an ‘A’ at the end of a
+     line.
+‘^.\{15\}A’
+     This matches the start of a string that contains 16 characters, the
+     last of which is an ‘A’.
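+   The note on greedy matching before these examples can be checked
+with a short sketch.  The input string and variable names are made up
+for illustration:

```shell
# Greedy: '.*' extends to the last 'c', so the whole string is replaced.
greedy=$(echo 'abcabc' | sed 's/a.*c/X/')

# A negated bracket expression stops at the first 'c' instead.
bounded=$(echo 'abcabc' | sed 's/a[^c]*c/X/')

printf '%s\n' "$greedy" "$bounded"
```

+   Using a bracket expression that excludes the terminating character
+is the usual way to get a shortest match, since ‘sed’ regular
+expressions have no non-greedy operators.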
+File: sed.info,  Node: ERE syntax,  Next: Character Classes and Bracket Expressions,  Prev: BRE syntax,  Up: sed regular expressions
+5.4 Overview of extended regular expression syntax
+==================================================
+The only difference between basic and extended regular expressions is in
+the behavior of a few characters: ‘?’, ‘+’, parentheses, braces (‘{}’),
+and ‘|’.  While basic regular expressions require these to be escaped if
+you want them to behave as special characters, when using extended
+regular expressions you must escape them if you want them _to match a
+literal character_.  ‘|’ is special here because ‘\|’ is a GNU extension;
+standard basic regular expressions do not provide its functionality.
+Examples:
+‘abc?’
+     becomes ‘abc\?’ when using extended regular expressions.  It
+     matches the literal string ‘abc?’.
+‘c\+’
+     becomes ‘c+’ when using extended regular expressions.  It matches
+     one or more ‘c’s.
+‘a\{3,\}’
+     becomes ‘a{3,}’ when using extended regular expressions.  It
+     matches three or more ‘a’s.
+‘\(abc\)\{2,3\}’
+     becomes ‘(abc){2,3}’ when using extended regular expressions.  It
+     matches either ‘abcabc’ or ‘abcabcabc’.
+‘\(abc*\)\1’
+     becomes ‘(abc*)\1’ when using extended regular expressions.
+     Backreferences must still be escaped when using extended regular
+     expressions.
+‘a\|b’
+     becomes ‘a|b’ when using extended regular expressions.  It matches
+     ‘a’ or ‘b’.
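+   A few of the pairs above can be run side by side to confirm that
+both spellings behave the same.  This is a minimal sketch assuming GNU
+‘sed’ (the BRE ‘\|’ form is a GNU extension); inputs and variable names
+are illustrative:

```shell
# Grouping with an interval: the match is greedy and takes three "abc".
bre_group=$(echo 'abcabcabcabc' | sed 's/\(abc\)\{2,3\}/X/')
ere_group=$(echo 'abcabcabcabc' | sed -E 's/(abc){2,3}/X/')

# Alternation: '\|' in BRE (GNU extension), '|' in ERE.
bre_alt=$(echo 'cab' | sed 's/a\|b/_/g')
ere_alt=$(echo 'cab' | sed -E 's/a|b/_/g')

printf '%s\n' "$bre_group" "$ere_group" "$bre_alt" "$ere_alt"
```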
+File: sed.info,  Node: Character Classes and Bracket Expressions,  Next: regexp extensions,  Prev: ERE syntax,  Up: sed regular expressions
+5.5 Character Classes and Bracket Expressions
+=============================================
+A ‘bracket expression’ is a list of characters enclosed by ‘[’ and ‘]’.
+It matches any single character in that list; if the first character of
+the list is the caret ‘^’, then it matches any character *not* in the
+list.  For example, the following command replaces the strings ‘gray’ or
+‘grey’ with ‘blue’:
+     sed  's/gr[ae]y/blue/'
+   Bracket expressions can be used in both *note basic: BRE syntax. and
+*note extended: ERE syntax. regular expressions (that is, with or
+without the ‘-E’/‘-r’ options).
+   Within a bracket expression, a ‘range expression’ consists of two
+characters separated by a hyphen.  It matches any single character that
+sorts between the two characters, inclusive.  In the default C locale,
+the sorting sequence is the native character order; for example, ‘[a-d]’
+is equivalent to ‘[abcd]’.
+   Finally, certain named classes of characters are predefined within
+bracket expressions, as follows.
+   These named classes must be used _inside_ brackets themselves.
+Correct usage:
+     $ echo 1 | sed 's/[[:digit:]]/X/'
+     X
+   Incorrect usage is rejected by newer ‘sed’ versions.  Older versions
+accepted it but treated it as a single bracket expression (which is
+equivalent to ‘[dgit:]’, that is, only the characters D/G/I/T/:):
+     # current GNU sed versions - incorrect usage rejected
+     $ echo 1 | sed 's/[:digit:]/X/'
+     sed: character class syntax is [[:space:]], not [:space:]
+     # older GNU sed versions
+     $ echo 1 | sed 's/[:digit:]/X/'
+     1
+‘[:alnum:]’
+     Alphanumeric characters: ‘[:alpha:]’ and ‘[:digit:]’; in the ‘C’
+     locale and ASCII character encoding, this is the same as
+     ‘[0-9A-Za-z]’.
+‘[:alpha:]’
+     Alphabetic characters: ‘[:lower:]’ and ‘[:upper:]’; in the ‘C’
+     locale and ASCII character encoding, this is the same as
+     ‘[A-Za-z]’.
+‘[:blank:]’
+     Blank characters: space and tab.
+‘[:cntrl:]’
+     Control characters.  In ASCII, these characters have octal codes
+     000 through 037, and 177 (DEL).  In other character sets, these are
+     the equivalent characters, if any.
+‘[:digit:]’
+     Digits: ‘0 1 2 3 4 5 6 7 8 9’.
+‘[:graph:]’
+     Graphical characters: ‘[:alnum:]’ and ‘[:punct:]’.
+‘[:lower:]’
+     Lower-case letters; in the ‘C’ locale and ASCII character encoding,
+     this is ‘a b c d e f g h i j k l m n o p q r s t u v w x y z’.
+‘[:print:]’
+     Printable characters: ‘[:alnum:]’, ‘[:punct:]’, and space.
+‘[:punct:]’
+     Punctuation characters; in the ‘C’ locale and ASCII character
+     encoding, this is ‘! " # $ % & ' ( ) * + , - . / : ; < = > ? @ [ \
+     ] ^ _ ` { | } ~’.
+‘[:space:]’
+     Space characters: in the ‘C’ locale, this is tab, newline, vertical
+     tab, form feed, carriage return, and space.
+‘[:upper:]’
+     Upper-case letters: in the ‘C’ locale and ASCII character encoding,
+     this is ‘A B C D E F G H I J K L M N O P Q R S T U V W X Y Z’.
+‘[:xdigit:]’
+     Hexadecimal digits: ‘0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f’.
+   Note that the brackets in these class names are part of the symbolic
+names, and must be included in addition to the brackets delimiting the
+bracket expression.
+   Most meta-characters lose their special meaning inside bracket
+expressions:
+‘]’
+     ends the bracket expression if it’s not the first list item.  So,
+     if you want to make the ‘]’ character a list item, you must put it
+     first.
+‘-’
+     represents the range if it’s not first or last in a list or the
+     ending point of a range.
+‘^’
+     represents the characters not in the list.  If you want to make the
+     ‘^’ character a list item, place it anywhere but first.
+   TODO: incorporate this paragraph (copied verbatim from BRE section).
+   The characters ‘$’, ‘*’, ‘.’, ‘[’, and ‘\’ are normally not special
+within LIST.  For example, ‘[\*]’ matches either ‘\’ or ‘*’, because the
+‘\’ is not special here.  However, strings like ‘[.ch.]’, ‘[=a=]’, and
+‘[:space:]’ are special within LIST and represent collating symbols,
+equivalence classes, and character classes, respectively, and ‘[’ is
+therefore special within LIST when it is followed by ‘.’, ‘=’, or ‘:’.
+Also, when not in ‘POSIXLY_CORRECT’ mode, special escapes like ‘\n’ and
+‘\t’ are recognized within LIST.  *Note Escapes::.
+‘[.’
+     represents the open collating symbol.
+‘.]’
+     represents the close collating symbol.
+‘[=’
+     represents the open equivalence class.
+‘=]’
+     represents the close equivalence class.
+‘[:’
+     represents the open character class symbol, and should be followed
+     by a valid character class name.
+‘:]’
+     represents the close character class symbol.
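+   The placement rules for ‘]’, ‘^’ and ‘-’ can be combined in a single
+bracket expression.  The sketch below is one possible arrangement; the
+sample strings and variable names are assumptions for the example:

```shell
# ']' first, '^' not first, '-' last: all three are literal list items.
literals=$(echo 'a]b^c-d' | sed 's/[]^-]/_/g')

# A named class inside a bracket expression.
digits=$(echo 'room 42' | sed 's/[[:digit:]]/#/g')

# A negated list containing a named class: delete every non-digit.
only_digits=$(echo 'room 42b' | sed 's/[^[:digit:]]//g')

printf '%s\n' "$literals" "$digits" "$only_digits"
```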
+File: sed.info,  Node: regexp extensions,  Next: Back-references and Subexpressions,  Prev: Character Classes and Bracket Expressions,  Up: sed regular expressions
+5.6 regular expression extensions
+=================================
+The following sequences have special meaning inside regular expressions
+(used in *note addresses: Regexp Addresses. and the ‘s’ command).
+   These can be used in both *note basic: BRE syntax. and *note
+extended: ERE syntax. regular expressions (that is, with or without the
+‘-E’/‘-r’ options).
+‘\w’
+     Matches any ‘word’ character.  A ‘word’ character is any letter or
+     digit or the underscore character.
+          $ echo "abc %-= def." | sed 's/\w/X/g'
+          XXX %-= XXX.
+‘\W’
+     Matches any ‘non-word’ character.
+          $ echo "abc %-= def." | sed 's/\W/X/g'
+          abcXXXXXdefX
+‘\b’
+     Matches a word boundary; that is it matches if the character to the
+     left is a ‘word’ character and the character to the right is a
+     ‘non-word’ character, or vice-versa.
+          $ echo "abc %-= def." | sed 's/\b/X/g'
+          XabcX %-= XdefX.
+‘\B’
+     Matches everywhere but on a word boundary; that is it matches if
+     the character to the left and the character to the right are either
+     both ‘word’ characters or both ‘non-word’ characters.
+          $ echo "abc %-= def." | sed 's/\B/X/g'
+          aXbXc X%X-X=X dXeXf.X
+‘\s’
+     Matches whitespace characters (spaces and tabs).  Newlines embedded
+     in the pattern/hold spaces will also match:
+          $ echo "abc %-= def." | sed 's/\s/X/g'
+          abcX%-=Xdef.
+‘\S’
+     Matches non-whitespace characters.
+          $ echo "abc %-= def." | sed 's/\S/X/g'
+          XXX XXX XXXX
+‘\<’
+     Matches the beginning of a word.
+          $ echo "abc %-= def." | sed 's/\</X/g'
+          Xabc %-= Xdef.
+‘\>’
+     Matches the end of a word.
+          $ echo "abc %-= def." | sed 's/\>/X/g'
+          abcX %-= defX.
+‘\`’
+     Matches only at the start of pattern space.  This is different from
+     ‘^’ in multi-line mode.
+     Compare the following two examples:
+          $ printf "a\nb\nc\n" | sed 'N;N;s/^/X/gm'
+          Xa
+          Xb
+          Xc
+          $ printf "a\nb\nc\n" | sed 'N;N;s/\`/X/gm'
+          Xa
+          b
+          c
+‘\'’
+     Matches only at the end of pattern space.  This is different from
+     ‘$’ in multi-line mode.
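+   The end-of-buffer case mirrors the start-of-buffer comparison shown
+for ‘\`’.  The following is a minimal sketch assuming GNU ‘sed’; the
+variable names are invented, and the second script is double-quoted
+only so that the shell passes the single quote through:

```shell
# In multi-line mode '$' matches before every newline and at the end.
each_end=$(printf 'a\nb\nc\n' | sed 'N;N;s/$/X/Mg')

# \' matches only at the very end of the pattern space.
# Inside double quotes, \\' reaches sed as the two characters \'.
buffer_end=$(printf 'a\nb\nc\n' | sed "N;N;s/\\'/X/Mg")

printf '%s\n---\n' "$each_end" "$buffer_end"
```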
+File: sed.info,  Node: Back-references and Subexpressions,  Next: Escapes,  Prev: regexp extensions,  Up: sed regular expressions
+5.7 Back-references and Subexpressions
+======================================
+‘Back-references’ are regular expression commands which refer to a
+previous part of the matched regular expression.  Back-references are
+specified with backslash and a single digit (e.g.  ‘\1’).  The part of
+the regular expression they refer to is called a ‘subexpression’, and is
+designated with parentheses.
+   Back-references and subexpressions are used in two cases: in the
+regular expression search pattern, and in the REPLACEMENT part of the
+‘s’ command (*note Regular Expression Addresses: Regexp Addresses. and
+*note The "s" Command::).
+   In a regular expression pattern, back-references are used to match
+the same content as a previously matched subexpression.  In the
+following example, the subexpression is ‘.’ - any single character
+(being surrounded by parentheses makes it a subexpression).  The
+back-reference ‘\1’ asks to match the same content (same character) as
+the sub-expression.
+   The command below matches words starting with any character, followed
+by the letter ‘o’, followed by the same character as the first.
+     $ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
+     bob
+     mom
+     non
+     pop
+     sos
+     tot
+     wow
+   Multiple subexpressions are automatically numbered from
+left-to-right.  This command searches for 6-letter palindromes (the
+first three letters are 3 subexpressions, followed by 3 back-references
+in reverse order):
+     $ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
+     redder
+   In the ‘s’ command, back-references can be used in the REPLACEMENT
+part to refer back to subexpressions in the REGEXP part.
+   The following example uses two subexpressions in the regular
+expression to match two space-separated words.  The back-references in
+the REPLACEMENT part print the words in a different order:
+     $ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
+     The name is Bond, James Bond.
+   When used with alternation, if the group does not participate in the
+match then the back-reference makes the whole match fail.  For example,
+‘a(.)|b\1’ will not match ‘ba’.  When multiple regular expressions are
+given with ‘-e’ or from a file (‘-f FILE’), back-references are local to
+each expression.
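+   Both uses of back-references, in the replacement and in the pattern
+itself, can be sketched without a dictionary file.  The input strings
+and variable names below are assumptions for illustration:

```shell
# Back-references in the REPLACEMENT: swap the two sides of '='.
swapped=$(echo 'key=value' | sed 's/\(.*\)=\(.*\)/\2=\1/')

# Back-reference in the pattern: collapse a word repeated after a space.
dedup=$(echo 'the the cat' | sed -E 's/([a-z]+) \1/\1/')

printf '%s\n' "$swapped" "$dedup"
```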
+File: sed.info,  Node: Escapes,  Next: Locale Considerations,  Prev: Back-references and Subexpressions,  Up: sed regular expressions
+5.8 Escape Sequences - specifying special characters
+====================================================
+Until this chapter, we have only encountered escapes of the form ‘\^’,
+which tell ‘sed’ not to interpret the circumflex as a special character,
+but rather to take it literally.  For example, ‘\*’ matches a single
+asterisk rather than zero or more backslashes.
+   This chapter introduces another kind of escape(1), that is, escapes
+that are applied to a character or sequence of characters that
+ordinarily are taken literally, and that ‘sed’ replaces with a special
+character.  This provides a way of encoding non-printable characters in
+patterns in a visible manner.  There is no restriction on the appearance
+of non-printing characters in a ‘sed’ script but when a script is being
+prepared in the shell or by text editing, it is usually easier to use
+one of the following escape sequences than the binary character it
+represents.
+   The list of these escapes is:
+‘\a’
+     Produces or matches a BEL character, that is an ‘alert’ (ASCII 7).
+‘\f’
+     Produces or matches a form feed (ASCII 12).
+‘\n’
+     Produces or matches a newline (ASCII 10).
+‘\r’
+     Produces or matches a carriage return (ASCII 13).
+‘\t’
+     Produces or matches a horizontal tab (ASCII 9).
+‘\v’
+     Produces or matches a so called ‘vertical tab’ (ASCII 11).
+‘\cX’
+     Produces or matches ‘CONTROL-X’, where X is any character.  The
+     precise effect of ‘\cX’ is as follows: if X is a lower case letter,
+     it is converted to upper case.  Then bit 6 of the character (hex
+     40) is inverted.  Thus ‘\cz’ becomes hex 1A, but ‘\c{’ becomes hex
+     3B, while ‘\c;’ becomes hex 7B.
+‘\dXXX’
+     Produces or matches a character whose decimal ASCII value is XXX.
+‘\oXXX’
+     Produces or matches a character whose octal ASCII value is XXX.
+‘\xXX’
+     Produces or matches a character whose hexadecimal ASCII value is
+     XX.
+   ‘\b’ (backspace) was omitted because of the conflict with the
+existing ‘word boundary’ meaning.
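+   A few of these escapes can be tried directly.  The sketch below
+assumes GNU ‘sed’, since all of them except ‘\n’ are GNU extensions; it
+uses the letter ‘A’ (decimal 65, octal 101, hex 41) as sample input, and
+the variable names are invented:

```shell
# \t matches a horizontal tab in the input.
tab=$(printf 'a\tb\n' | sed 's/\t/,/')

# The same character 'A' addressed by hex, octal and decimal value.
hex=$(echo 'A' | sed 's/\x41/B/')
oct=$(echo 'A' | sed 's/\o101/B/')
dec=$(echo 'A' | sed 's/\d065/B/')

printf '%s\n' "$tab" "$hex" "$oct" "$dec"
```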
+5.8.1 Escaping Precedence
+-------------------------
+GNU ‘sed’ processes escape sequences _before_ passing the text onto the
+regular-expression matching of the ‘s///’ command and address matching.
+Thus the following two commands are equivalent (‘0x5e’ is the
+hexadecimal ASCII value of the character ‘^’):
+     $ echo 'a^c' | sed 's/^/b/'
+     ba^c
+     $ echo 'a^c' | sed 's/\x5e/b/'
+     ba^c
+   As are the following (‘0x5b’,‘0x5d’ are the hexadecimal ASCII values
+of ‘[’,‘]’, respectively):
+     $ echo abc | sed 's/[a]/x/'
+     xbc
+     $ echo abc | sed 's/\x5ba\x5d/x/'
+     xbc
+   However it is recommended to avoid such special characters due to
+unexpected edge-cases.  For example, the following are not equivalent:
+     $ echo 'a^c' | sed 's/\^/b/'
+     abc
+     $ echo 'a^c' | sed 's/\\\x5e/b/'
+     a^c
+   ---------- Footnotes ----------
+   (1) All the escapes introduced here are GNU extensions, with the
+exception of ‘\n’.  In basic regular expression mode, setting
+‘POSIXLY_CORRECT’ disables them inside bracket expressions.
+File: sed.info,  Node: Locale Considerations,  Prev: Escapes,  Up: sed regular expressions
+5.9 Multibyte characters and Locale Considerations
+==================================================
+GNU ‘sed’ processes valid multibyte characters in multibyte locales
+(e.g.  ‘UTF-8’).  (1)
+The following example uses the Greek letter Capital Sigma (Σ, Unicode
+code point ‘0x03A3’).  In a ‘UTF-8’ locale, ‘sed’ correctly processes
+the Sigma as one character despite it being 2 octets (bytes):
+     $ locale | grep LANG
+     LANG=en_US.UTF-8
+     $ printf 'a\u03A3b'
+     aΣb
+     $ printf 'a\u03A3b' | sed 's/./X/g'
+     XXX
+     $ printf 'a\u03A3b' | od -tx1 -An
+      61 ce a3 62
+To force ‘sed’ to process octets separately, use the ‘C’ locale (also
+known as the ‘POSIX’ locale):
+     $ printf 'a\u03A3b' | LC_ALL=C sed 's/./X/g'
+     XXXX
+.9.1 Invalid multibyte characters
+----------------------------------
+âsedââs regular expressions _do not_ match invalid multibyte sequences
+in a multibyte locale.
+In the following examples, the ascii value â0xCEâ is an incomplete
+multibyte character (shown here as ï¿œ).  The regular expression â.â does
+not match it:
+     $ printf 'a\xCEb\n'
+     aï¿œe
+     $ printf 'a\xCEb\n' | sed 's/./X/g'
+     Xï¿œX
+     $ printf 'a\xCEc\n' | sed 's/./X/g' | od -tx1c -An
+  ce  58  0a
+        X      X   \n
+Similarly, the “catch-all” regular expression ‘.*’ does not match the
+entire line:
+     $ printf 'a\xCEc\n' | sed 's/.*//' | od -tx1c -An
+       ce  63  0a
+            c  \n
+GNU ‘sed’ offers the special ‘z’ command to clear the current pattern
+space regardless of invalid multibyte characters (i.e.  it works like
+‘s/.*//’ but also removes invalid multibyte characters):
+     $ printf 'a\xCEc\n' | sed 'z' | od -tx1c -An
+       0a
+       \n
+Alternatively, force the ‘C’ locale to process each octet separately
+(every octet is a valid character in the ‘C’ locale):
+     $ printf 'a\xCEc\n' | LC_ALL=C sed 's/.*//' | od -tx1c -An
+       0a
+       \n
+   ‘sed’’s inability to process invalid multibyte characters can be used
+to detect such invalid sequences in a file.  In the following examples,
+the ‘\xCE\xCE’ is an invalid multibyte sequence, while ‘\xCE\xA3’ is a
+valid multibyte sequence (of the Greek Sigma character).
+The following ‘sed’ program removes all valid characters using ‘s/.//g’.
+Any content left in the pattern space (the invalid characters) is added
+to the hold space using the ‘H’ command.  On the last line (‘$’), the
+hold space is retrieved (‘x’), newlines are removed (‘s/\n//g’), and any
+remaining octets are printed unambiguously (‘l’).  Thus, any invalid
+multibyte sequences are printed as octal values:
+     $ printf 'ab\nc\n\xCE\xCEde\n\xCE\xA3f\n' > invalid.txt
+     $ cat invalid.txt
+     ab
+     c
+     ï¿œï¿œde
+     Î£f
+     $ sed -n 's/.//g ; H ; ${x;s/\n//g;l}' invalid.txt
+     \316\316$
+With a few more commands, âsedâ can print the exact line number
+corresponding to each invalid characters (line 3).  These characters can
+then be removed by forcing the âCâ locale and using octal escape
+sequences:
+     $ sed -n 's/.//g;=;l' invalid.txt | paste - -  | awk '$2!="$"'
+       \316\316$
+     $ LC_ALL=C sed '3s/\o316\o316//' invalid.txt > fixed.txt
+5.9.2 Upper/Lower case conversion
+---------------------------------
+GNU ‘sed’’s substitute command (‘s’) supports upper/lower case
+conversions using ‘\U’,‘\L’ codes.  These conversions support multibyte
+characters:
+     $ printf 'ABC\u03a3\n'
+     ABCΣ
+     $ printf 'ABC\u03a3\n' | sed 's/.*/\L&/'
+     abcσ
+*Note The "s" Command::.
+5.9.3 Multibyte regexp character classes
+----------------------------------------
+In locales other than ‘C’, the sorting sequence is not specified, and
+‘[a-d]’ might be equivalent to ‘[abcd]’ or to ‘[aBbCcDd]’, or it might
+fail to match any character, or the set of characters that it matches
+might even be erratic.  To obtain the traditional interpretation of
+bracket expressions, you can use the ‘C’ locale by setting the ‘LC_ALL’
+environment variable to the value ‘C’.
+     # TODO: is there any real-world system/locale where 'A'
+     #       is replaced by '-' ?
+     $ echo A | sed 's/[a-z]/-/'
+     A
+   The interpretation of character classes depends on the ‘LC_CTYPE’
+locale; for example, ‘[[:alnum:]]’ means the character class of numbers
+and letters in the current locale.
+   TODO: show example of collation
+     # TODO: this works on glibc systems, not on musl-libc/freebsd/macosx.
+     $ printf 'cliché\n' | LC_ALL=fr_FR.utf8 sed 's/[[=e=]]/X/g'
+     clichX
+   ---------- Footnotes ----------
+   (1) Some regexp edge-cases depend on the operating system and libc
+implementation.  The examples shown are known to work as expected on
+GNU/Linux systems using glibc.
+File: sed.info,  Node: advanced sed,  Next: Examples,  Prev: sed regular expressions,  Up: Top
+6 Advanced ‘sed’: cycles and buffers
+************************************
+* Menu:
+* Execution Cycle::          How ‘sed’ works
+* Hold and Pattern Buffers::
+* Multiline techniques::     Using D,G,H,N,P to process multiple lines
+* Branching and flow control::
+File: sed.info,  Node: Execution Cycle,  Next: Hold and Pattern Buffers,  Up: advanced sed
+6.1 How ‘sed’ Works
+===================
+‘sed’ maintains two data buffers: the active _pattern_ space, and the
+auxiliary _hold_ space.  Both are initially empty.
+   ‘sed’ operates by performing the following cycle on each line of
+input: first, ‘sed’ reads one line from the input stream, removes any
+trailing newline, and places it in the pattern space.  Then commands are
+executed; each command can have an address associated with it: addresses
+are a kind of condition code, and a command is only executed if the
+condition is verified before the command is to be executed.
+   When the end of the script is reached, unless the ‘-n’ option is in
+use, the contents of pattern space are printed out to the output stream,
+adding back the trailing newline if it was removed.(1)  Then the next
+cycle starts for the next input line.
+   Unless special commands (like ‘D’) are used, the pattern space is
+deleted between two cycles.  The hold space, on the other hand, keeps
+its data between cycles (see commands ‘h’, ‘H’, ‘x’, ‘g’, ‘G’ to move
+data between both buffers).
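+   The persistence of the hold space across cycles can be sketched as
+follows.  This is only an illustration; the function name
+‘join_with_commas’ is hypothetical.  Every line is appended to the hold
+space, and on the last line the accumulated text is swapped back and
+printed:

```shell
# Hypothetical helper: join all input lines with commas using the hold space.
join_with_commas() {
    # H          append each line to the hold space, preceded by a newline
    # $          on the last input line only:
    #   x          swap the accumulated hold space into the pattern space
    #   s/^\n//    drop the newline that the first H put at the front
    #   s/\n/,/g   turn the remaining embedded newlines into commas
    #   p          print explicitly, because -n suppresses automatic printing
    sed -n 'H;${x;s/^\n//;s/\n/,/g;p;}'
}

printf '%s\n' 1 2 3 | join_with_commas
```

+   For the three input lines above this prints ‘1,2,3’: the pattern space
+is replaced on every cycle, while the hold space keeps growing.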
+   ---------- Footnotes ----------
+   (1) Actually, if ‘sed’ prints a line without the terminating newline,
+it will nevertheless print the missing newline as soon as more text is
+sent to the same output stream, which gives the “least expected
+surprise” even though it does not make commands like ‘sed -n p’ exactly
+identical to ‘cat’.
+File: sed.info,  Node: Hold and Pattern Buffers,  Next: Multiline techniques,  Prev: Execution Cycle,  Up: advanced sed
+6.2 Hold and Pattern Buffers
+============================
+TODO
+File: sed.info,  Node: Multiline techniques,  Next: Branching and flow control,  Prev: Hold and Pattern Buffers,  Up: advanced sed
+6.3 Multiline techniques - using D,G,H,N,P to process multiple lines
+====================================================================
+Multiple lines can be processed as one buffer using the ‘D’, ‘G’, ‘H’,
+‘N’ and ‘P’ commands.  They are similar to their lowercase counterparts
+(‘d’,‘g’, ‘h’,‘n’,‘p’), except that these commands append or subtract
+data while respecting embedded newlines - allowing adding and removing
+lines from the pattern and hold spaces.
+   They operate as follows:
+‘D’
+     _deletes_ line from the pattern space until the first newline, and
+     restarts the cycle.
+‘G’
+     _appends_ line from the hold space to the pattern space, with a
+     newline before it.
+‘H’
+     _appends_ line from the pattern space to the hold space, with a
+     newline before it.
+‘N’
+     _appends_ line from the input file to the pattern space.
+‘P’
+     _prints_ line from the pattern space until the first newline.
+   The following example illustrates the operation of ‘N’ and ‘D’
+commands:
+     $ seq 6 | sed -n 'N;l;D'
+     1\n2$
+     2\n3$
+     3\n4$
+     4\n5$
+     5\n6$
+  1. ‘sed’ starts by reading the first line into the pattern space (i.e.
+     ‘1’).
+  2. At the beginning of every cycle, the ‘N’ command appends a newline
+     and the next line to the pattern space (i.e.  ‘1’, ‘\n’, ‘2’ in the
+     first cycle).
+  3. The ‘l’ command prints the content of the pattern space
+     unambiguously.
+  4. The ‘D’ command then removes the content of pattern space up to the
+     first newline (leaving ‘2’ at the end of the first cycle).
+  5. At the next cycle the ‘N’ command appends a newline and the next
+     input line to the pattern space (e.g.  ‘2’, ‘\n’, ‘3’).
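+   The steps above can also be sketched with ‘P’ in place of ‘l’.  In
+this hedged illustration (the function name ‘all_but_last’ is
+hypothetical), ‘P’ prints only the first line of the two-line window and
+‘D’ keeps the second line for the next cycle:

```shell
# Hypothetical helper: print every input line except the last one.
all_but_last() {
    # N  append the next input line, so the pattern space holds two lines
    # P  print only the first of the two lines
    # D  delete that first line and restart the cycle with the remainder
    # When N finds no further input, sed exits; -n keeps the leftover
    # (last) line from being printed.
    sed -n 'N;P;D'
}

printf '%s\n' 1 2 3 4 | all_but_last
```

+   With the four input lines shown, the lines ‘1’, ‘2’ and ‘3’ are
+printed and ‘4’ is dropped.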
+   A common technique to process blocks of text such as paragraphs
+(instead of line-by-line) is using the following construct:
+     sed '/./{H;$!d} ; x ; s/REGEXP/REPLACEMENT/'
+  1. The first expression, ‘/./{H;$!d}’ operates on all non-empty lines,
+     and adds the current line (in the pattern space) to the hold space.
+     On all lines except the last, the pattern space is deleted and the
+     cycle is restarted.
+  2. The other expressions ‘x’ and ‘s’ are executed only on empty lines
+     (i.e.  paragraph separators).  The ‘x’ command fetches the
+     accumulated lines from the hold space back to the pattern space.
+     The ‘s///’ command then operates on all the text in the paragraph
+     (including the embedded newlines).
+   The following example demonstrates this technique:
+     $ cat input.txt
+     a a a aa aaa
+     aaaa aaaa aa
+     aaaa aaa aaa
+
+     bbbb bbb bbb
+     bb bb bbb bb
+     bbbbbbbb bbb
+
+     ccc ccc cccc
+     cccc ccccc c
+     cc cc cc cc
+     $ sed '/./{H;$!d} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt
+
+     START-->
+     a a a aa aaa
+     aaaa aaaa aa
+     aaaa aaa aaa
+     <--END
+
+     START-->
+     bbbb bbb bbb
+     bb bb bbb bb
+     bbbbbbbb bbb
+     <--END
+
+     START-->
+     ccc ccc cccc
+     cccc ccccc c
+     cc cc cc cc
+     <--END
+   For more annotated examples, *note Text search across multiple
+lines:: and *note Line length adjustment::.
+File: sed.info,  Node: Branching and flow control,  Prev: Multiline techniques,  Up: advanced sed
+6.4 Branching and Flow Control
+==============================
+The branching commands ‘b’, ‘t’, and ‘T’ enable changing the flow of
+‘sed’ programs.
+   By default, ‘sed’ reads an input line into the pattern buffer, then
+continues to process all commands in order.  Commands without
+addresses affect all lines.  Commands with addresses affect only
+matching lines.  *Note Execution Cycle:: and *note Addresses overview::.
+   ‘sed’ does not support a typical ‘if/then’ construct.  Instead, some
+commands can be used as conditionals or to change the default flow
+control:
+‘d’
+     delete (clear) the current pattern space, and restart the program
+     cycle without processing the rest of the commands and without
+     printing the pattern space.
+‘D’
+     delete the contents of the pattern space _up to the first newline_,
+     and restart the program cycle without processing the rest of the
+     commands and without printing the pattern space.
+‘[addr]X’
+‘[addr]{ X ; X ; X }’
+‘/regexp/X’
+‘/regexp/{ X ; X ; X }’
+     Addresses and regular expressions can be used as an ‘if/then’
+     conditional: If [ADDR] matches the current pattern space, execute
+     the command(s).  For example: The command ‘/^#/d’ means: _if_ the
+     current pattern matches the regular expression ‘^#’ (a line
+     starting with a hash), _then_ execute the ‘d’ command: delete the
+     line without printing it, and restart the program cycle
+     immediately.
+‘b’
+     branch unconditionally (that is: always jump to a label, skipping
+     or repeating other commands, without restarting a new cycle).
+     Combined with an address, the branch can be conditionally executed
+     on matched lines.
+‘t’
+     branch conditionally (that is: jump to a label) _only if_ a ‘s///’
+     command has succeeded since the last input line was read or another
+     conditional branch was taken.
+‘T’
+     similar but opposite to the ‘t’ command: branch only if there have
+     been _no_ successful substitutions since the last input line was
+     read.
+   The following two ‘sed’ programs are equivalent.  The first
+(contrived) example uses the ‘b’ command to skip the ‘s///’ command on
+lines containing ‘1’.  The second example uses an address with negation
+(‘!’) to perform substitution only on desired lines.  The ‘y///’ command
+is still executed on all lines:
+     $ printf '%s\n' a1 a2 a3 | sed -E '/1/bx ; s/a/z/ ; :x ; y/123/456/'
+     a4
+     z5
+     z6
+     $ printf '%s\n' a1 a2 a3 | sed -E '/1/!s/a/z/ ; y/123/456/'
+     a4
+     z5
+     z6
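+   The conditional ‘t’ command listed above can be sketched in the same
+way.  This is an illustrative example only; the function name
+‘mark_lines_without_a’ is hypothetical.  When the first substitution
+succeeds, ‘t’ without a label skips the rest of the script, so the
+second substitution runs only on lines where the first one failed:

```shell
# Hypothetical helper: replace the first "a" with "z"; tag lines with no "a".
mark_lines_without_a() {
    # s/a/z/         try the substitution
    # t              if it succeeded, branch to the end of the script
    # s/$/ (no a)/   otherwise append a marker to the line
    sed -e 's/a/z/' -e 't' -e 's/$/ (no a)/'
}

printf '%s\n' a1 b2 | mark_lines_without_a
```

+   For the input ‘a1’, ‘b2’ this prints ‘z1’ and ‘b2 (no a)’.  Each
+command is given in its own ‘-e’ fragment so that the label-less ‘t’ is
+terminated by a newline, which is the portable form.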
+6.4.1 Branching and Cycles
+--------------------------
+The ‘b’,‘t’ and ‘T’ commands can be followed by a label (typically a
+single letter).  Labels are defined with a colon followed by one or more
+letters (e.g.  ‘:x’).  If the label is omitted the branch commands
+restart the cycle.  Note the difference between branching to a label and
+restarting the cycle: when a cycle is restarted, ‘sed’ first prints the
+current content of the pattern space, then reads the next input line
+into the pattern space; jumping to a label (even if it is at the
+beginning of the program) does not print the pattern space and does not
+read the next input line.
+   The following program is a no-op.  The ‘b’ command (the only command
+in the program) does not have a label, and thus simply restarts the
+cycle.  On each cycle, the pattern space is printed and the next input
+line is read:
+     $ seq 3 | sed b
+   The following example is an infinite loop - it doesn’t terminate and
+doesn’t print anything.  The ‘b’ command jumps to the ‘x’ label, and a
+new cycle is never started:
+     $ seq 3 | sed ':x ; bx'
+     # The above command requires GNU sed (which supports additional
+     # commands following a label, without a newline). A portable equivalent:
+     #     sed -e ':x' -e bx
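+   By contrast, a loop built with ‘t’ ends as soon as the substitution
+stops matching.  The following is a minimal sketch under the assumption
+that each input line is a plain non-negative integer; the function name
+‘add_thousands’ is hypothetical:

```shell
# Hypothetical helper: insert thousands separators into plain integers.
add_thousands() {
    # :a      loop label
    # s///    put a comma before the last group of three digits that is
    #         still directly preceded by another digit
    # ta      repeat while the substitution keeps succeeding
    sed -e ':a' -e 's/\(.*[0-9]\)\([0-9]\{3\}\)/\1,\2/' -e 'ta'
}

printf '%s\n' 1234567 12 | add_thousands
```

+   For the two input lines this prints ‘1,234,567’ and ‘12’.  Jumping
+back to ‘:a’ does not start a new cycle, so the same line is rewritten
+repeatedly before it is printed once.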
+   Branching is often complemented with the ‘n’ or ‘N’ commands: both
+commands read the next input line into the pattern space without waiting
+for the cycle to restart.  Before reading the next input line, ‘n’
+prints the current pattern space then empties it, while ‘N’ appends a
+newline and the next input line to the pattern space.
+   Consider the following two examples:
+     $ seq 3 | sed ':x ; n ; bx'
+     $ seq 3 | sed ':x ; N ; bx'
+   • Neither example loops forever, despite never starting a new cycle.
+   • In the first example, the ‘n’ command first prints the content of
+     the pattern space, empties the pattern space then reads the next
+     input line.
+   • In the second example, the ‘N’ command appends the next input line
+     to the pattern space (with a newline).  Lines are accumulated in
+     the pattern space until there are no more input lines to read, then
+     the ‘N’ command terminates the ‘sed’ program.  When the program
+     terminates, the end-of-cycle actions are performed, and the entire
+     pattern space is printed.
+   • The second example requires GNU ‘sed’, because it uses the
+     non-POSIX-standard behavior of ‘N’.  See the “‘N’ command on the
+     last line” paragraph in *note Reporting Bugs::.
+   • To further examine the difference between the two examples, try the
+     following commands:
+          printf '%s\n' aa bb cc dd | sed ':x ; n ; = ; bx'
+          printf '%s\n' aa bb cc dd | sed ':x ; N ; = ; bx'
+          printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx'
+          printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx'
+6.4.2 Branching example: joining lines
+--------------------------------------
+As a real-world example of using branching, consider the case of
+quoted-printable (https://en.wikipedia.org/wiki/Quoted-printable) files,
+typically used to encode email messages.  In these files long lines are
+split and marked with a “soft line break” consisting of a single ‘=’
+character at the end of the line:
+     $ cat jaques.txt
+     All the wor=
+     ld's a stag=
+     e,
+     And all the=
+      men and wo=
+     men merely =
+     players:
+     They have t=
+     heir exits =
+     and their e=
+     ntrances;
+     And one man=
+      in his tim=
+     e plays man=
+     y parts.
+   The following program uses an address match ‘/=$/’ as a conditional:
+If the current pattern space ends with a ‘=’, it reads the next input
+line using ‘N’, removes every ‘=’ character that is followed by a
+newline (together with that newline), and unconditionally branches (‘b’)
+to the beginning of the program without restarting a new cycle.  If the
+pattern space does not end with ‘=’, the default action is performed:
+the pattern space is printed and a new cycle is started:
+     $ sed ':x ; /=$/ { N ; s/=\n//g ; bx }' jaques.txt
+     All the world's a stage,
+     And all the men and women merely players:
+     They have their exits and their entrances;
+     And one man in his time plays many parts.
+   Here’s an alternative program with a slightly different approach: On
+all lines except the last, ‘N’ appends the line to the pattern space.  A
+substitution command then removes soft line breaks (‘=’ at the end of a
+line, i.e.  followed by a newline) by replacing them with an empty
+string.  _If_ the substitution was successful (meaning the pattern space
+contained a line which should be joined), the conditional branch command
+‘t’ jumps to the beginning of the program without completing or
+restarting the cycle.  If the substitution failed (meaning there were no
+soft line breaks), the ‘t’ command will _not_ branch.  Then, ‘P’ will
+print the pattern space content until the first newline, and ‘D’ will
+delete the pattern space content until the first newline.  (To learn
+more about ‘N’, ‘P’ and ‘D’ commands *note Multiline techniques::).
+     $ sed ':x ; $!N ; s/=\n// ; tx ; P ; D' jaques.txt
+     All the world's a stage,
+     And all the men and women merely players:
+     They have their exits and their entrances;
+     And one man in his time plays many parts.
+   For more line-joining examples *note Joining lines::.
+File: sed.info,  Node: Examples,  Next: Limitations,  Prev: advanced sed,  Up: Top
+7 Some Sample Scripts
+*********************
+Here are some ‘sed’ scripts to guide you in the art of mastering ‘sed’.
+* Menu:
+Useful one-liners:
+* Joining lines::
+Some exotic examples:
+* Centering lines::
+* Increment a number::
+* Rename files to lower case::
+* Print bash environment::
+* Reverse chars of lines::
+* Text search across multiple lines::
+* Line length adjustment::
+* Adding a header to multiple files::
+Emulating standard utilities:
+* tac::                             Reverse lines of files
+* cat -n::                          Numbering lines
+* cat -b::                          Numbering non-blank lines
+* wc -c::                           Counting chars
+* wc -w::                           Counting words
+* wc -l::                           Counting lines
+* head::                            Printing the first lines
+* tail::                            Printing the last lines
+* uniq::                            Make duplicate lines unique
+* uniq -d::                         Print duplicated lines of input
+* uniq -u::                         Remove all duplicated lines
+* cat -s::                          Squeezing blank lines
+File: sed.info,  Node: Joining lines,  Next: Centering lines,  Up: Examples
+7.1 Joining lines
+=================
+This section uses ‘N’, ‘D’ and ‘P’ commands to process multiple lines,
+and the ‘b’ and ‘t’ commands for branching.  *Note Multiline
+techniques:: and *note Branching and flow control::.
+   Join specific lines (e.g.  if lines 2 and 3 need to be joined):
+     $ cat lines.txt
+     hello
+     hel
+     lo
+     hello
+     $ sed '2{N;s/\n//;}' lines.txt
+     hello
+     hello
+     hello
+   Join backslash-continued lines:
+     $ cat 1.txt
+     this \
+     is \
+     a \
+     long \
+     line
+     and another \
+     line
+     $ sed -e ':x /\\$/ { N; s/\\\n//g ; bx }'  1.txt
+     this is a long line
+     and another line
+     #TODO: The above requires gnu sed.
+     #      non-gnu seds need newlines after ':' and 'b'
+   Join lines that start with whitespace (e.g. SMTP headers):
+     $ cat 2.txt
+     Subject: Hello
+         World
+     Content-Type: multipart/alternative;
+         boundary=94eb2c190cc6370f06054535da6a
+     Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
+     Authentication-Results: mx.gnu.org;
+            dkim=pass header.i=@gnu.org;
+            spf=pass
+     Message-ID: <abcdef@gnu.org>
+     From: John Doe <jdoe@gnu.org>
+     To: Jane Smith <jsmith@gnu.org>
+     $ sed -E ':a ; $!N ; s/\n\s+/ / ; ta ; P ; D' 2.txt
+     Subject: Hello World
+     Content-Type: multipart/alternative; boundary=94eb2c190cc6370f06054535da6a
+     Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
+     Authentication-Results: mx.gnu.org; dkim=pass header.i=@gnu.org; spf=pass
+     Message-ID: <abcdef@gnu.org>
+     From: John Doe <jdoe@gnu.org>
+     To: Jane Smith <jsmith@gnu.org>
+     # A portable (non-gnu) variation:
+     #   sed -e :a -e '$!N;s/\n  */ /;ta' -e 'P;D'
+File: sed.info,  Node: Centering lines,  Next: Increment a number,  Prev: Joining lines,  Up: Examples
+7.2 Centering Lines
+===================
+This script centers all lines of a file to a width of 80 columns.  To
+change that width, the number in ‘\{...\}’ must be replaced, and the
+number of added spaces also must be changed.
+   Note how the buffer commands are used to separate parts in the
+regular expressions to be matched; this is a common technique.
+     #!/usr/bin/sed -f
+     # Put 80 spaces in the buffer
+     1 {
+       x
+       s/^$/          /
+       s/^.*$/&&&&&&&&/
+       x
+     }
+     # delete leading and trailing spaces
+     y/<TAB>/ /
+     s/^ *//
+     s/ *$//
+     # add a newline and 80 spaces to end of line
+     G
+     # keep first 81 chars (80 + a newline)
+     s/^\(.\{81\}\).*$/\1/
+     # \2 matches half of the spaces, which are moved to the beginning
+     s/^\(.*\)\n\(.*\)\2/\2\1/
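+   To try the technique on a small scale, the following sketch applies
+the same steps with a width of 10 columns instead of 80.  It is only an
+illustration under two assumptions: input lines are shorter than the
+width, and tabs do not occur (the ‘y’ tab conversion is omitted).  The
+function name ‘center10’ is hypothetical, and the 10 spaces are built
+from dashes purely so that they are easy to count:

```shell
# Hypothetical helper: center short lines within 10 columns.
center10() {
    sed '
# Put 10 spaces in the hold space, once, on the first line.
1 {
  x
  s/^$/----------/
  y/-/ /
  x
}
# Delete leading and trailing spaces.
s/^ *//
s/ *$//
# Append a newline and the 10 spaces, then keep 11 characters (10 + newline).
G
s/^\(.\{11\}\).*$/\1/
# \2 matches half of the remaining spaces, which are moved to the front.
s/^\(.*\)\n\(.*\)\2/\2\1/
'
}

printf '%s\n' ab abcd | center10
```

+   With this input, ‘ab’ is printed after four spaces and ‘abcd’ after
+three.  When the number of padding spaces is odd, one space is left over
+at the end of the line.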
+File: sed.info,  Node: Increment a number,  Next: Rename files to lower case,  Prev: Centering lines,  Up: Examples
+7.3 Increment a Number
+======================
+This script is one of a few that demonstrate how to do arithmetic in
+‘sed’.  This is indeed possible,(1) but must be done manually.
+   To increment one number you just add 1 to last digit, replacing it by
+the following digit.  There is one exception: when the digit is a nine
+the previous digits must be also incremented until you don’t have a
+nine.
+   This solution by Bruno Haible is very clever and smart because it
+uses a single buffer; if you don’t have this limitation, the algorithm
+used in *note Numbering lines: cat -n, is faster.  It works by replacing
+trailing nines with an underscore, then using multiple ‘s’ commands to
+increment the last digit, and then again substituting underscores with
+zeros.
+     #!/usr/bin/sed -f
+     /[^0-9]/ d
+     # replace all trailing 9s by _ (any other character except digits, could
+     # be used)
+     :d
+     s/9\(_*\)$/_\1/
+     td
+     # incr last digit only.  The first line adds a most-significant
+     # digit of 1 if we have to add a digit.
+     s/^\(_*\)$/1\1/; tn
+     s/8\(_*\)$/9\1/; tn
+     s/7\(_*\)$/8\1/; tn
+     s/6\(_*\)$/7\1/; tn
+     s/5\(_*\)$/6\1/; tn
+     s/4\(_*\)$/5\1/; tn
+     s/3\(_*\)$/4\1/; tn
+     s/2\(_*\)$/3\1/; tn
+     s/1\(_*\)$/2\1/; tn
+     s/0\(_*\)$/1\1/; tn
+     :n
+     y/_/0/
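+   As a usage sketch, the script above can be wrapped in a shell
+function and fed a few numbers.  The function name ‘increment’ is
+hypothetical; the ‘sed’ commands are the ones shown above:

```shell
# Hypothetical wrapper around the increment script shown above.
increment() {
    sed '
/[^0-9]/ d
# Replace all trailing 9s by _
:d
s/9\(_*\)$/_\1/
td
# Increment the last digit only; the first line adds a leading 1
# when every digit was a 9.
s/^\(_*\)$/1\1/; tn
s/8\(_*\)$/9\1/; tn
s/7\(_*\)$/8\1/; tn
s/6\(_*\)$/7\1/; tn
s/5\(_*\)$/6\1/; tn
s/4\(_*\)$/5\1/; tn
s/3\(_*\)$/4\1/; tn
s/2\(_*\)$/3\1/; tn
s/1\(_*\)$/2\1/; tn
s/0\(_*\)$/1\1/; tn
:n
# Turn the placeholders back into zeros.
y/_/0/
'
}

printf '%s\n' 41 199 9 | increment
```

+   The three inputs produce ‘42’, ‘200’ and ‘10’.  Lines containing
+anything other than digits are deleted by the first command.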
+   ---------- Footnotes ----------
+   (1) ‘sed’ guru Greg Ubben wrote an implementation of the ‘dc’ RPN
+calculator!  It is distributed together with sed.
+File: sed.info,  Node: Rename files to lower case,  Next: Print bash environment,  Prev: Increment a number,  Up: Examples
+7.4 Rename Files to Lower Case
+==============================
+This is a pretty strange use of ‘sed’.  We transform text, and transform
+it to be shell commands, then just feed them to shell.  Don’t worry,
+even worse hacks are done when using ‘sed’; I have seen a script
+converting the output of ‘date’ into a ‘bc’ program!
+   The main body of this is the ‘sed’ script, which remaps the name from
+lower to upper (or vice-versa) and even checks whether the remapped name
+is the same as the original name.  Note how the script is parameterized
+using shell variables and proper quoting.
+     #! /bin/sh
+     # rename files to lower/upper case...
+     #
+     # usage:
+     #    move-to-lower *
+     #    move-to-upper *
+     # or
+     #    move-to-lower -R .
+     #    move-to-upper -R .
+     #
+     help()
+     {
+             cat << eof
+     Usage: $0 [-n] [-R] [-h] files...
+     -n      do nothing, only see what would be done
+     -R      recursive (use find)
+     -h      this message
+     files   files to remap to lower case
+     Examples:
+            $0 -n *        (see if everything is ok, then...)
+            $0 *
+            $0 -R .
+     eof
+     }
+     apply_cmd='sh'
+     finder='echo "$@" | tr " " "\n"'
+     files_only=
+     while :
+     do
+         case "$1" in
+             -n) apply_cmd='cat' ;;
+             -R) finder='find "$@" -type f';;
+             -h) help ; exit 1 ;;
+             *) break ;;
+         esac
+         shift
+     done
+     if [ -z "$1" ]; then
+             echo Usage: $0 [-h] [-n] [-R] files...
+             exit 1
+     fi
+     LOWER='abcdefghijklmnopqrstuvwxyz'
+     UPPER='ABCDEFGHIJKLMNOPQRSTUVWXYZ'
+     case `basename $0` in
+             *upper*) TO=$UPPER; FROM=$LOWER ;;
+             *)       FROM=$UPPER; TO=$LOWER ;;
+     esac
+     eval $finder | sed -n '
+     # remove all trailing slashes
+     s/\/*$//
+     # add ./ if there is no path, only a filename
+     /\//! s/^/.\//
+     # save path+filename
+     h
+     # remove path
+     s/.*\///
+     # do conversion only on filename
+     y/'$FROM'/'$TO'/
+     # now line contains original path+file, while
+     # hold space contains the new filename
+     x
+     # add converted file name to line, which now contains
+     # path/file-name\nconverted-file-name
+     G
+     # check if converted file name is equal to original file name,
+     # if it is, do not print anything
+     /^.*\/\(.*\)\n\1/b
+     # escape special characters for the shell
+     s/["$`\\]/\\&/g
+     # now, transform path/fromfile\n, into
+     # mv path/fromfile path/tofile and print it
+     s/^\(.*\/\)\(.*\)\n\(.*\)$/mv "\1\2" "\1\3"/p
+     ' | $apply_cmd
+File: sed.info,  Node: Print bash environment,  Next: Reverse chars of lines,  Prev: Rename files to lower case,  Up: Examples
+7.5 Print ‘bash’ Environment
+============================
+This script strips the definition of the shell functions from the output
+of the ‘set’ Bourne-shell command.
+     #!/bin/sh
+     set | sed -n '
+     :x
+     # if no occurrence of "=()" print and load next line
+     /=()/! { p; b; }
+     / () $/! { p; b; }
+     # possible start of functions section
+     # save the line in case this is a var like FOO="() "
+     h
+     # if the next line has a brace, we quit because
+     # nothing comes after functions
+     n
+     /^{/ q
+     # print the old line
+     x; p
+     # work on the new line now
+     x; bx
+     '
+File: sed.info,  Node: Reverse chars of lines,  Next: Text search across multiple lines,  Prev: Print bash environment,  Up: Examples
+7.6 Reverse Characters of Lines
+===============================
+This script can be used to reverse the position of characters in lines.
+The technique moves two characters at a time, hence it is faster than
+more intuitive implementations.
+   Note the ‘tx’ command before the definition of the label.  This is
+often needed to reset the flag that is tested by the ‘t’ command.
+   Imaginative readers will find uses for this script.  An example is
+reversing the output of ‘banner’.(1)
+     #!/usr/bin/sed -f
+     /../! b
+     # Reverse a line.  Begin embedding the line between two newlines
+     s/^.*$/\
+     &\
+     /
+     # Move first character at the end.  The regexp matches until
+     # there are zero or one characters between the markers
+     tx
+     :x
+     s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
+     tx
+     # Remove the newline markers
+     s/\n//g
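+   A quick way to try the script is to wrap it in a shell function, as
+in the following sketch.  The function name ‘reverse_line’ is
+hypothetical; the ‘sed’ commands are the ones shown above:

```shell
# Hypothetical wrapper around the line-reversal script shown above.
reverse_line() {
    sed '
# Lines shorter than two characters are already reversed.
/../! b
# Embed the line between two newline markers.
s/^.*$/\
&\
/
# Reset the flag tested by t, then swap the outermost characters
# until the markers meet in the middle.
tx
:x
s/\(\n.\)\(.*\)\(.\n\)/\3\2\1/
tx
# Remove the newline markers.
s/\n//g
'
}

printf '%s\n' hello abcd | reverse_line
```

+   The two inputs come out as ‘olleh’ and ‘dcba’, covering both an odd
+and an even number of characters.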
+   ---------- Footnotes ----------
+   (1) This requires another script to pad the output of banner; for
+example
+     #! /bin/sh
+     banner -w $1 $2 $3 $4 |
+       sed -e :a -e '/^.\{0,'$1'\}$/ { s/$/ /; ba; }' |
+       ~/sedscripts/reverseline.sed
+File: sed.info,  Node: Text search across multiple lines,  Next: Line length adjustment,  Prev: Reverse chars of lines,  Up: Examples
+7.7 Text search across multiple lines
+=====================================
+This section uses ‘N’ and ‘D’ commands to search for consecutive words
+spanning multiple lines.  *Note Multiline techniques::.
+   These examples deal with finding doubled occurrences of words in a
+document.
+   Finding doubled words in a single line is easy using GNU ‘grep’ and
+similarly with GNU ‘sed’:
+     $ cat two-cities-dup1.txt
+     It was the best of times,
+     it was the worst of times,
+     it was the the age of wisdom,
+     it was the age of foolishness,
+     $ grep -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
+     it was the the age of wisdom,
+     $ grep -n -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
+     3:it was the the age of wisdom,
+     $ sed -En '/\b(\w+)\s+\1\b/p' two-cities-dup1.txt
+     it was the the age of wisdom,
+     $ sed -En '/\b(\w+)\s+\1\b/{=;p}' two-cities-dup1.txt
+     3
+     it was the the age of wisdom,
+   • The regular expression ‘\b\w+\s+’ searches for word-boundary
+     (‘\b’), followed by one-or-more word-characters (‘\w+’), followed
+     by whitespace (‘\s+’).  *Note regexp extensions::.
+   • Adding parentheses around the ‘(\w+)’ expression creates a
+     subexpression.  The regular expression pattern ‘(PATTERN)\s+\1’
+     defines a subexpression (in the parentheses) followed by a
+     back-reference, separated by whitespace.  A successful match means
+     the PATTERN was repeated twice in succession.  *Note
+     Back-references and Subexpressions::.
+   • The word-boundary expression (‘\b’) at both ends ensures partial
+     words are not matched (e.g.  ‘the then’ is not a desired match).
+   • The ‘-E’ option enables extended regular expression syntax,
+     alleviating the need to add backslashes before the parenthesis.
+     *Note ERE syntax::.
+   When the doubled word spans two lines the above regular expression
+will not find it as ‘grep’ and ‘sed’ operate line-by-line.
+   By using ‘N’ and ‘D’ commands, ‘sed’ can apply regular expressions on
+multiple lines (that is, multiple lines are stored in the pattern space,
+and the regular expression works on it):
+     $ cat two-cities-dup2.txt
+     It was the best of times, it was the
+     worst of times, it was the
+     the age of wisdom,
+     it was the age of foolishness,
+     $ sed -En '{N; /\b(\w+)\s+\1\b/{=;p} ; D}'  two-cities-dup2.txt
+     3
+     worst of times, it was the
+     the age of wisdom,
+   • The ‘N’ command appends the next line to the pattern space (thus
+     ensuring it contains two consecutive lines in every cycle).
+   • The regular expression uses ‘\s+’ for word separator which matches
+     both spaces and newlines.
+   • When the regular expression matches, the entire pattern space is
+     printed with ‘p’.  No lines are printed by default due to the ‘-n’
+     option.
+   • The ‘D’ removes the first line from the pattern space (up until the
+     first newline), readying it for the next cycle.
+   See the GNU ‘coreutils’ manual for an alternative solution using ‘tr
+-s’ and ‘uniq’ at
+<https://gnu.org/s/coreutils/manual/html_node/Squeezing-and-deleting.html>.
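The 'tr -s' and 'uniq' alternative mentioned above can be sketched roughly as follows. This is a minimal illustration and not necessarily the exact pipeline from the coreutils manual; the function name dup_words is invented for this example. Because every word ends up on its own line, a word repeated across a line break is reported as well. The comparison is case-sensitive.

```shell
# dup_words: print each word that occurs twice in a row on standard input.
# (hypothetical helper name, used only for this sketch)
dup_words() {
    # Turn every run of punctuation or blanks into a single newline,
    # so that each word sits on its own line; uniq -d then reports
    # adjacent duplicate lines, i.e. doubled words.
    tr -s '[:punct:][:blank:]' '[\n*]' | uniq -d
}

printf 'worst of times, it was the\nthe age of wisdom,\n' | dup_words
```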
+File: sed.info,  Node: Line length adjustment,  Next: Adding a header to multiple files,  Prev: Text search across multiple lines,  Up: Examples
+7.8 Line length adjustment
+==========================
+This section uses the ‘N’ and ‘P’ commands to read and write lines, and
+the ‘b’ command for branching.  *Note Multiline techniques:: and *note
+Branching and flow control::.
+   This (somewhat contrived) example deals with formatting and wrapping
+lines of text of the following input file:
+     $ cat two-cities-mix.txt
+     It was the best of times, it was
+     the worst of times, it
+     was the age of
+     wisdom,
+     it
+     was
+     the age
+     of foolishness,
+The following sed program wraps lines at 40 characters:
+     $ cat wrap40.sed
+     # outer loop
+     :x
+     # Append a newline followed by the next input line to the pattern buffer
+     N
+     # Remove all newlines from the pattern buffer
+     s/\n/ /g
+     # Inner loop
+     :y
+     # Add a newline after the first 40 characters
+     s/(.{40,40})/\1\n/
+     # If there is a newline in the pattern buffer
+     # (i.e. the previous substitution added a newline)
+     /\n/ {
+         # There are newlines in the pattern buffer -
+         # print the content until the first newline.
+         P
+         # Remove the printed characters and the first newline
+         s/.*\n//
+         # branch to label 'y' - repeat inner loop
+         by
+     }
+     # No newlines in the pattern buffer - Branch to label 'x' (outer loop)
+     # and read the next input line
+     bx
+The wrapped output:
+     $ sed -E -f wrap40.sed two-cities-mix.txt
+     It was the best of times, it was the wor
+     st of times, it was the age of wisdom, i
+     t was the age of foolishness,
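The same loop can be wrapped in a shell function that takes the width as a parameter. This is only a sketch: the name wrap_at is hypothetical, the width is assumed to be a positive integer, and GNU 'sed' is assumed (both for '-E' with '\n' in the replacement and for 'N' printing the pattern space at end of input).

```shell
# wrap_at WIDTH: re-wrap standard input at exactly WIDTH characters,
# using the same outer/inner loop as wrap40.sed.
wrap_at() {
    sed -E "
        :x
        N
        s/\n/ /g
        :y
        s/(.{$1})/\1\n/
        /\n/ {
            P
            s/.*\n//
            by
        }
        bx
    "
}

printf 'abcdefgh\nijkl\nmn\n' | wrap_at 5
```

One caveat of this approach: if the input consists of a single line, 'N' fails immediately and the line is printed without being wrapped.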
+File: sed.info,  Node: Adding a header to multiple files,  Next: tac,  Prev: Line length adjustment,  Up: Examples
+7.9 Adding a header to multiple files
+=====================================
+GNU ‘sed’ can be used to safely modify multiple files at once.
+Add a single line to the beginning of source code files:
+     sed -i '1i/* Copyright (C) FOO BAR */' *.c
+Adding a few lines is possible using ‘\n’ in the text:
+     sed -i '1i/*\n * Copyright (C) FOO BAR\n * Created by Jane Doe\n */' *.c
+   To add multiple lines from another file, use ‘0rFILE’.  A typical use
+case is adding a license notice header to all files:
+     ## Create the header file:
+     $ cat<<'EOF'>LIC.TXT
+     /*
+         Copyright (C) 1989-2021 FOO BAR
+         This program is free software; you can redistribute it and/or modify
+         it under the terms of the GNU General Public License as published by
+         the Free Software Foundation; either version 3, or (at your option)
+         any later version.
+         This program is distributed in the hope that it will be useful,
+         but WITHOUT ANY WARRANTY; without even the implied warranty of
+         MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+         GNU General Public License for more details.
+         You should have received a copy of the GNU General Public License
+         along with this program; If not, see <https://www.gnu.org/licenses/>.
+     */
+     EOF
+     ## Add the file at the beginning of all source code files:
+     $ sed -i '0rLIC.TXT' *.cpp *.h
+   With script files (e.g., ‘.sh’, ‘.py’, ‘.pl’ files) the license notice
+typically appears _after_ the first line (the “shebang” ‘#!’ line).  The
+‘1rFILE’ command will add ‘FILE’ _after_ the first line:
+     ## Create the header file:
+     $ cat<<'EOF'>LIC.TXT
+     ##
+     ## Copyright (C) 1989-2021 FOO BAR
+     ##
+     ## This program is free software; you can redistribute it and/or modify
+     ## it under the terms of the GNU General Public License as published by
+     ## the Free Software Foundation; either version 3, or (at your option)
+     ## any later version.
+     ##
+     ## This program is distributed in the hope that it will be useful,
+     ## but WITHOUT ANY WARRANTY; without even the implied warranty of
+     ## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+     ## GNU General Public License for more details.
+     ##
+     ## You should have received a copy of the GNU General Public License
+     ## along with this program; If not, see <https://www.gnu.org/licenses/>.
+     ##
+     ##
+     EOF
+     ## Add the file after the first line of all script files:
+     $ sed -i '1rLIC.TXT' *.py *.sh
+   The above ‘sed’ commands can be combined with ‘find’ to locate files
+in all subdirectories, ‘xargs’ to run additional commands on selected
+files and ‘grep’ to filter out files that already contain a copyright
+notice:
+     find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) \
+         | xargs grep -Li copyright \
+         | xargs -r sed -i '0rLIC.TXT'
+Or a slightly safer version (handling file names with spaces and newlines):
+     find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) -print0 \
+         | xargs -0 grep -Z -Li copyright \
+         | xargs -0 -r sed -i '0rLIC.TXT'
+   Note: using the ‘0’ address with the ‘r’ command requires GNU ‘sed’
+version 4.9 or later.  *Note Zero Address::.
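For versions of GNU 'sed' older than 4.9, a single-line header can still be added with the '1i' form shown earlier. The sketch below combines it with the 'grep' check so that running it twice does not add the header twice. The function name add_header is hypothetical, and the header text is assumed to contain no backslashes, newlines or leading whitespace.

```shell
# add_header HEADER FILE...: insert HEADER as the new first line of every
# FILE that does not already mention a copyright.
add_header() {
    header=$1
    shift
    for f in "$@"; do
        # grep -q succeeds when the file already contains "copyright",
        # in which case the file is left untouched.
        grep -qi copyright "$f" || sed -i "1i$header" "$f"
    done
}
```

Note that '1i' has no effect on an empty file, because an empty file has no line 1 to match.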
+File: sed.info,  Node: tac,  Next: cat -n,  Prev: Adding a header to multiple files,  Up: Examples
+7.10 Reverse Lines of Files
+===========================
+This one begins a series of totally useless (yet interesting) scripts
+emulating various Unix commands.  This, in particular, is a ‘tac’
+workalike.
+   Note that on implementations other than GNU ‘sed’ this script might
+easily overflow internal buffers.
+     #!/usr/bin/sed -nf
+     # reverse all lines of input, i.e. first line becomes last, ...
+     # from the second line, the buffer (which contains all previous lines)
+     # is *appended* to current line, so, the order will be reversed
+     1! G
+     # on the last line we're done -- print everything
+     $ p
+     # store everything on the buffer again
+     h
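As a quick check, the same three commands can be run as a one-liner from the shell. The function name tac_sed is made up for this sketch.

```shell
# tac_sed: print standard input with its lines in reverse order.
tac_sed() {
    # 1!G  append the hold space (all earlier lines) after the current line
    # $p   on the last line, print the accumulated text
    # h    save the pattern space for the next cycle
    sed -n '1!G; $p; h'
}

printf 'one\ntwo\nthree\n' | tac_sed
```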
+File: sed.info,  Node: cat -n,  Next: cat -b,  Prev: tac,  Up: Examples
+7.11 Numbering Lines
+====================
+This script replaces ‘cat -n’; in fact it formats its output exactly
+like GNU ‘cat’ does.
+   Of course this is completely useless, for two reasons: first,
+because somebody else did it in C, second, because the following
+Bourne-shell script could be used for the same purpose and would be much
+faster:
+     #! /bin/sh
+     sed -e "=" "$@" | sed -e '
+       s/^/      /
+       N
+       s/^ *\(......\)\n/\1  /
+     '
+   It uses ‘sed’ to print the line number, then groups lines two by two
+using ‘N’.  Of course, this script does not teach as much as the one
+presented below.
+   The algorithm used for incrementing uses both buffers, so the line is
+printed as soon as possible and then discarded.  The number is split so
+that changing digits go in one buffer and unchanged ones go in the other;
+the changed digits are modified in a single step (using a ‘y’ command).
+The line number for the next line is then composed and stored in the
+hold space, to be used in the next iteration.
+     #!/usr/bin/sed -nf
+     # Prime the pump on the first line
+     x
+     /^$/ s/^.*$/1/
+     # Add the correct line number before the pattern
+     G
+     h
+     # Format it and print it
+     s/^/      /
+     s/^ *\(......\)\n/\1  /p
+     # Get the line number from hold space; add a zero
+     # if we're going to add a digit on the next line
+     g
+     s/\n.*$//
+     /^9*$/ s/^/0/
+     # separate changing/unchanged digits with an x
+     s/.9*$/x&/
+     # keep changing digits in hold space
+     h
+     s/^.*x//
+     y/0123456789/1234567890/
+     x
+     # keep unchanged digits in pattern space
+     s/x.*$//
+     # compose the new number, remove the newline implicitly added by G
+     G
+     s/\n//
+     h
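The increment step is easier to follow in isolation. The sketch below applies only the "add one" part of the script to a single decimal number passed as an argument; the function name incr is hypothetical.

```shell
# incr NUMBER: add one to a non-negative decimal number using sed only.
incr() {
    printf '%s\n' "$1" | sed '
        # all nines: a new leading digit will be needed
        /^9*$/ s/^/0/
        # mark where the changing digits start (last non-9 plus trailing 9s)
        s/.9*$/x&/
        # keep a copy, then reduce pattern space to the changing digits
        h
        s/^.*x//
        # increment every changing digit in one step (9 wraps to 0)
        y/0123456789/1234567890/
        # swap: changed digits to hold space, original back in pattern space
        x
        # keep only the unchanged leading digits
        s/x.*$//
        # glue unchanged and changed digits together
        G
        s/\n//
    '
}

incr 199
```

For 199 the marker is placed before the whole number (x199), all three digits are changed by 'y', and no unchanged prefix remains.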
+File: sed.info,  Node: cat -b,  Next: wc -c,  Prev: cat -n,  Up: Examples
+7.12 Numbering Non-blank Lines
+==============================
+Emulating ‘cat -b’ is almost the same as ‘cat -n’: we only have to select
+which lines are to be numbered and which are not.
+   The part that is common to this script and the previous one is not
+commented to show how important it is to comment ‘sed’ scripts
+properly...
+     #!/usr/bin/sed -nf
+     /^$/ {
+       p
+       b
+     }
+     # Same as cat -n from now
+     x
+     /^$/ s/^.*$/1/
+     G
+     h
+     s/^/      /
+     s/^ *\(......\)\n/\1  /p
+     x
+     s/\n.*$//
+     /^9*$/ s/^/0/
+     s/.9*$/x&/
+     h
+     s/^.*x//
+     y/0123456789/1234567890/
+     x
+     s/x.*$//
+     G
+     s/\n//
+     h
+File: sed.info,  Node: wc -c,  Next: wc -w,  Prev: cat -b,  Up: Examples
+7.13 Counting Characters
+========================
+This script shows another way to do arithmetic with ‘sed’.  In this case
+we have to add possibly large numbers, so implementing this by
+successive increments would not be feasible (and possibly even more
+complicated to contrive than this script).
+   The approach is to map numbers to letters, kind of an abacus
+implemented with ‘sed’.  ‘a’s are units, ‘b’s are tens and so on: we
+simply add the number of characters on the current line as units, and
+then propagate the carry to tens, hundreds, and so on.
+   As usual, running totals are kept in hold space.
+   On the last line, we convert the abacus form back to decimal.  For
+the sake of variety, this is done with a loop rather than with some 80
+‘s’ commands(1): first we convert units, removing ‘a’s from the number;
+then we rotate letters so that tens become ‘a’s, and so on until no more
+letters remain.
+     #!/usr/bin/sed -nf
+     # Add n+1 a's to hold space (+1 is for the newline)
+     s/./a/g
+     H
+     x
+     s/\n/a/
+     # Do the carry.  The t's and b's are not necessary,
+     # but they do speed up the thing
+     t a
+     : a;  s/aaaaaaaaaa/b/g; t b; b done
+     : b;  s/bbbbbbbbbb/c/g; t c; b done
+     : c;  s/cccccccccc/d/g; t d; b done
+     : d;  s/dddddddddd/e/g; t e; b done
+     : e;  s/eeeeeeeeee/f/g; t f; b done
+     : f;  s/ffffffffff/g/g; t g; b done
+     : g;  s/gggggggggg/h/g; t h; b done
+     : h;  s/hhhhhhhhhh//g
+     : done
+     $! {
+       h
+       b
+     }
+     # On the last line, convert back to decimal
+     : loop
+     /a/! s/[b-h]*/&0/
+     s/aaaaaaaaa/9/
+     s/aaaaaaaa/8/
+     s/aaaaaaa/7/
+     s/aaaaaa/6/
+     s/aaaaa/5/
+     s/aaaa/4/
+     s/aaa/3/
+     s/aa/2/
+     s/a/1/
+     : next
+     y/bcdefgh/abcdefg/
+     /[a-h]/ b loop
+     p
+   ---------- Footnotes ----------
+   (1) Some implementations have a limit of 199 commands per script
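The final conversion loop can be tried on its own. The sketch below feeds one abacus string through the same loop; the function name abacus_to_decimal is hypothetical, and the input is assumed to be already carried (at most nine of each letter, higher letters first).

```shell
# abacus_to_decimal STRING: convert e.g. "cba" (1 hundred, 1 ten, 1 unit)
# to its decimal form.
abacus_to_decimal() {
    printf '%s\n' "$1" | sed -n '
        :loop
        # no units at this position: insert a 0 digit after the letters
        /a/! s/[b-h]*/&0/
        # otherwise replace the run of a with its digit
        s/aaaaaaaaa/9/
        s/aaaaaaaa/8/
        s/aaaaaaa/7/
        s/aaaaaa/6/
        s/aaaaa/5/
        s/aaaa/4/
        s/aaa/3/
        s/aa/2/
        s/a/1/
        # rotate: tens become units, hundreds become tens, ...
        y/bcdefgh/abcdefg/
        /[a-h]/ b loop
        p
    '
}

abacus_to_decimal cba
abacus_to_decimal ca
```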
+File: sed.info,  Node: wc -w,  Next: wc -l,  Prev: wc -c,  Up: Examples
+7.14 Counting Words
+===================
+This script is almost the same as the previous one, once each of the
+words on the line is converted to a single ‘a’ (in the previous script
+each letter was changed to an ‘a’).
+   It is interesting that real ‘wc’ programs have optimized loops for
+‘wc -c’, so they are much slower at counting words than at counting
+characters.  This script’s bottleneck, instead, is arithmetic, and hence
+the word-counting one is faster (it has to manage smaller numbers).
+   Again, the common parts are not commented to show the importance of
+commenting ‘sed’ scripts.
+     #!/usr/bin/sed -nf
+     # Convert words to a's
+     s/[ <TAB>][ <TAB>]*/ /g
+     s/^/ /
+     s/ [^ ][^ ]*/a /g
+     s/ //g
+     # Append them to hold space
+     H
+     x
+     s/\n//
+     # From here on it is the same as in wc -c.
+     /aaaaaaaaaa/! bx;   s/aaaaaaaaaa/b/g
+     /bbbbbbbbbb/! bx;   s/bbbbbbbbbb/c/g
+     /cccccccccc/! bx;   s/cccccccccc/d/g
+     /dddddddddd/! bx;   s/dddddddddd/e/g
+     /eeeeeeeeee/! bx;   s/eeeeeeeeee/f/g
+     /ffffffffff/! bx;   s/ffffffffff/g/g
+     /gggggggggg/! bx;   s/gggggggggg/h/g
+     s/hhhhhhhhhh//g
+     :x
+     $! { h; b; }
+     :y
+     /a/! s/[b-h]*/&0/
+     s/aaaaaaaaa/9/
+     s/aaaaaaaa/8/
+     s/aaaaaaa/7/
+     s/aaaaaa/6/
+     s/aaaaa/5/
+     s/aaaa/4/
+     s/aaa/3/
+     s/aa/2/
+     s/a/1/
+     y/bcdefgh/abcdefg/
+     /[a-h]/ by
+     p
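The only new part of this script is the conversion of words to a's, which can be tried separately. In this sketch the function name words_to_a is made up, and the GNU '\t' escape stands in for the literal <TAB> placeholder used in the listing.

```shell
# words_to_a LINE: print one "a" for every word in LINE.
words_to_a() {
    printf '%s\n' "$1" | sed '
        # squeeze each run of blanks into one space
        s/[ \t][ \t]*/ /g
        # make sure the first word is also preceded by a space
        s/^/ /
        # each " word" becomes "a "
        s/ [^ ][^ ]*/a /g
        # drop the remaining spaces
        s/ //g
    '
}

words_to_a 'hello  big world'
```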
+File: sed.info,  Node: wc -l,  Next: head,  Prev: wc -w,  Up: Examples
+7.15 Counting Lines
+===================
+No strange things are done now, because ‘sed’ gives us ‘wc -l’
+functionality for free!!!  Look:
+     #!/usr/bin/sed -nf
+     $=
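From the shell, the same script is a one-liner. The function name count_lines is hypothetical.

```shell
# count_lines: print the number of lines on standard input.
count_lines() {
    # $ selects the last line, = prints its line number
    sed -n '$='
}

printf 'a\nb\nc\n' | count_lines
```

One difference from 'wc -l': on empty input there is no last line, so nothing is printed at all rather than 0.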
+File: sed.info,  Node: head,  Next: tail,  Prev: wc -l,  Up: Examples
+7.16 Printing the First Lines
+=============================
+This script is probably the simplest useful ‘sed’ script.  It displays
+the first 10 lines of input; the number of displayed lines is right
+before the ‘q’ command.
+     #!/usr/bin/sed -f
+     10q
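A small sketch that makes the line count a parameter. The function name first_lines is hypothetical, and the count is assumed to be a positive integer.

```shell
# first_lines N: print the first N lines of standard input.
first_lines() {
    # Lines are printed automatically; q quits after printing line N.
    sed "${1}q"
}

printf '1\n2\n3\n4\n' | first_lines 2
```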
+File: sed.info,  Node: tail,  Next: uniq,  Prev: head,  Up: Examples
+7.17 Printing the Last Lines
+============================
+Printing the last N lines rather than the first is more complex but
+indeed possible.  N is encoded in the second line, before the bang
+character.
+   This script is similar to the ‘tac’ script in that it keeps the final
+output in the hold space and prints it at the end:
+     #!/usr/bin/sed -nf
+     1! {; H; g; }
+     1,10 !s/[^\n]*\n//
+     $p
+     h
+   Mainly, the script keeps a window of 10 lines and slides it by
+adding a line and deleting the oldest (the substitution command on the
+second line works like a ‘D’ command but does not restart the loop).
+   The “sliding window” technique is a very powerful way to write
+efficient and complex ‘sed’ scripts, because commands like ‘P’ would
+require a lot of work if implemented manually.
+   To introduce the technique, which is fully demonstrated in the rest
+of this chapter and is based on the ‘N’, ‘P’ and ‘D’ commands, here is
+an implementation of ‘tail’ using a simple “sliding window.”
+   This looks complicated but in fact it works the same way as the
+last script: after we have kicked in the appropriate number of lines,
+however, we stop using the hold space to keep inter-line state, and
+instead use ‘N’ and ‘D’ to slide pattern space by one line:
+     #!/usr/bin/sed -f
+     1h
+     2,10 {; H; g; }
+     $q
+     1,9d
+     N
+     D
+   Note how the first, second and fourth lines are inactive after the
+first ten lines of input.  After that, all the script does is exit
+on the last line of input, append the next input line to pattern
+space, and remove the first line.
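The sliding-window version can be parameterized on the window size. This is a sketch only: the name last_lines is hypothetical, and the count is assumed to be an integer of at least 2, since the addresses 2,N and 1,N-1 are not meaningful for smaller values.

```shell
# last_lines N: print the last N lines of standard input (N >= 2).
last_lines() {
    n=$1
    sed "
        1h
        2,${n} { H; g; }
        \$q
        1,$((n - 1))d
        N
        D
    "
}

printf '1\n2\n3\n4\n5\n' | last_lines 3
```

While the window is filling, lines are collected through the hold space and deleted. Once N lines are present, each cycle appends one line with 'N' and drops the oldest with 'D', until '$q' prints the window on the last line.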
+File: sed.info,  Node: uniq,  Next: uniq -d,  Prev: tail,  Up: Examples
+7.18 Make Duplicate Lines Unique
+================================
+This is an example of the art of using the ‘N’, ‘P’ and ‘D’ commands,
+probably the most difficult to master.
+     #!/usr/bin/sed -f
+     h
+     :b
+     # On the last line, print and exit
+     $b
+     N
+     /^\(.*\)\n\1$/ {
+         # The two lines are identical.  Undo the effect of
+         # the N command.
+         g
+         bb
+     }
+     # If the N command had added the last line, print and exit
+     $b
+     # The lines are different; print the first and go
+     # back working on the second.
+     P
+     D
+   As you can see, we maintain a 2-line window using ‘P’ and ‘D’.  This
+technique is often used in advanced ‘sed’ scripts.
+File: sed.info,  Node: uniq -d,  Next: uniq -u,  Prev: uniq,  Up: Examples
+7.19 Print Duplicated Lines of Input
+====================================
+This script prints only duplicated lines, like ‘uniq -d’.
+     #!/usr/bin/sed -nf
+     $b
+     N
+     /^\(.*\)\n\1$/ {
+         # Print the first of the duplicated lines
+         s/.*\n//
+         p
+         # Loop until we get a different line
+         :b
+         $b
+         N
+         /^\(.*\)\n\1$/ {
+             s/.*\n//
+             bb
+         }
+     }
+     # The last line cannot be followed by duplicates
+     $b
+     # Found a different one.  Leave it alone in the pattern space
+     # and go back to the top, hunting its duplicates
+     D
+File: sed.info,  Node: uniq -u,  Next: cat -s,  Prev: uniq -d,  Up: Examples
+7.20 Remove All Duplicated Lines
+================================
+This script prints only unique lines, like ‘uniq -u’.
+     #!/usr/bin/sed -f
+     # Search for a duplicate line --- until that, print what you find.
+     $b
+     N
+     /^\(.*\)\n\1$/ ! {
+         P
+         D
+     }
+     :c
+     # Got two equal lines in pattern space.  At the
+     # end of the file we simply exit
+     $d
+     # Else, we keep reading lines with N until we
+     # find a different one
+     s/.*\n//
+     N
+     /^\(.*\)\n\1$/ {
+         bc
+     }
+     # Remove the last instance of the duplicate line
+     # and go back to the top
+     D
+File: sed.info,  Node: cat -s,  Prev: uniq -u,  Up: Examples
+7.21 Squeezing Blank Lines
+==========================
+As a final example, here are three scripts, of increasing complexity and
+speed, that implement the same function as ‘cat -s’, that is squeezing
+blank lines.
+   The first leaves a blank line at the beginning and end if there are
+some already.
+     #!/usr/bin/sed -f
+     # on empty lines, join with next
+     # Note there is a star in the regexp
+     :x
+     /^\n*$/ {
+     N
+     bx
+     }
+     # now, squeeze all '\n', this can be also done by:
+     # s/^\(\n\)*/\1/
+     s/\n*/\
+     /
+   This one is a bit more complex and removes all empty lines at the
+beginning.  It does leave a single blank line at end if one was there.
+     #!/usr/bin/sed -f
+     # delete all leading empty lines
+     1,/^./{
+     /./!d
+     }
+     # on an empty line we remove it and all the following
+     # empty lines, but one
+     :x
+     /./!{
+     N
+     s/^\n$//
+     tx
+     }
+   This removes leading and trailing blank lines.  It is also the
+fastest.  Note that loops are completely done with ‘n’ and ‘b’, without
+relying on ‘sed’ to restart the script automatically at the end of a
+line.
+     #!/usr/bin/sed -nf
+     # delete all (leading) blanks
+     /./!d
+     # get here: so there is a non empty
+     :x
+     # print it
+     p
+     # get next
+     n
+     # got chars? print it again, etc...
+     /./bx
+     # no, don't have chars: got an empty line
+     :z
+     # get next, if last line we finish here so no trailing
+     # empty lines are written
+     n
+     # also empty? then ignore it, and get next... this will
+     # remove ALL empty lines
+     /./!bz
+     # all empty lines were deleted/ignored, but we have a non empty.  As
+     # what we want to do is to squeeze, insert a blank line artificially
+     i\
+
+     bx
+File: sed.info,  Node: Limitations,  Next: Other Resources,  Prev: Examples,  Up: Top
+8 GNU ‘sed’’s Limitations and Non-limitations
+*********************************************
+For those who want to write portable ‘sed’ scripts, be aware that some
+implementations have been known to limit line lengths (for the pattern
+and hold spaces) to be no more than 4000 bytes.  The POSIX standard
+specifies that conforming ‘sed’ implementations shall support at least
+8192 byte line lengths.  GNU ‘sed’ has no built-in limit on line length;
+as long as it can ‘malloc()’ more (virtual) memory, you can feed or
+construct lines as long as you like.
+   However, recursion is used to handle subpatterns and indefinite
+repetition.  This means that the available stack space may limit the
+size of the buffer that can be processed by certain patterns.
+File: sed.info,  Node: Other Resources,  Next: Reporting Bugs,  Prev: Limitations,  Up: Top
+9 Other Resources for Learning About ‘sed’
+******************************************
+For up to date information about GNU ‘sed’ please visit
+<https://www.gnu.org/software/sed/>.
+   Send general questions and suggestions to <sed-devel@gnu.org>.  Visit
+the mailing list archives for past discussions at
+<https://lists.gnu.org/archive/html/sed-devel/>.
+   The following resources provide information about ‘sed’ (both GNU
+‘sed’ and other variations).  Note that these are not maintained by GNU
+‘sed’ developers.
+   • sed ‘$HOME’: <http://sed.sf.net>
+   • sed FAQ: <http://sed.sf.net/sedfaq.html>
+   • seder’s grabbag: <http://sed.sf.net/grabbag>
+   • The ‘sed-users’ mailing list maintained by Sven Guckes:
+     <http://groups.yahoo.com/group/sed-users/> (note this is _not_ the
+     GNU ‘sed’ mailing list).
+File: sed.info,  Node: Reporting Bugs,  Next: GNU Free Documentation License,  Prev: Other Resources,  Up: Top
+10 Reporting Bugs
+*****************
+Email bug reports to <bug-sed@gnu.org>.  Also, please include the output
+of ‘sed --version’ in the body of your report if at all possible.
+   Please do not send a bug report like this:
+     while building frobme-1.3.4
+     $ configure
+     error→ sed: file sedscr line 1: Unknown option to 's'
+   If GNU ‘sed’ doesn’t configure your favorite package, take a few
+extra minutes to identify the specific problem and make a stand-alone
+test case.  Unlike other programs such as C compilers, making such test
+cases for ‘sed’ is quite simple.
+   A stand-alone test case includes all the data necessary to perform
+the test, and the specific invocation of ‘sed’ that causes the problem.
+The smaller a stand-alone test case is, the better.  A test case should
+not involve something as far removed from ‘sed’ as “try to configure
+frobme-1.3.4”.  Yes, that is in principle enough information to look for
+the bug, but that is not a very practical prospect.
+   Here are a few commonly reported bugs that are not bugs.
+‘N’ command on the last line
+     Most versions of ‘sed’ exit without printing anything when the ‘N’
+     command is issued on the last line of a file.  GNU ‘sed’ prints
+     pattern space before exiting unless of course the ‘-n’ command
+     switch has been specified.  This choice is by design.
+     Default behavior (GNU extension, non-POSIX conforming):
+          $ seq 3 | sed N
+          1
+          2
+          3
+     To force POSIX-conforming behavior:
+          $ seq 3 | sed --posix N
+          1
+          2
+     For example, the behavior of
+          sed N foo bar
+     would depend on whether foo has an even or an odd number of
+     lines(1).  Or, when writing a script to read the next few lines
+     following a pattern match, traditional implementations of ‘sed’
+     would force you to write something like
+          /foo/{ $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N; $!N }
+     instead of just
+          /foo/{ N;N;N;N;N;N;N;N;N; }
+     In any case, the simplest workaround is to use ‘$d;N’ in scripts
+     that rely on the traditional behavior, or to set the
+     ‘POSIXLY_CORRECT’ variable to a non-empty value.
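A script that should behave the same under both the GNU and the traditional rules can guard 'N' with '$!', as in the following sketch. The function name join_pairs is made up for this example.

```shell
# join_pairs: join every two input lines with a space.  The $! guard
# skips N on the last line, so an odd final line is printed as is
# regardless of how the implementation treats N at end of input.
join_pairs() {
    sed '$!N; s/\n/ /'
}

printf '1\n2\n3\n' | join_pairs
```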
+Regex syntax clashes (problems with backslashes)
+     ‘sed’ uses the POSIX basic regular expression syntax.  According to
+     the standard, the meaning of some escape sequences is undefined in
+     this syntax; notable in the case of ‘sed’ are ‘\|’, ‘\+’, ‘\?’,
+     ‘\`’, ‘\'’, ‘\<’, ‘\>’, ‘\b’, ‘\B’, ‘\w’, and ‘\W’.
+     As in all GNU programs that use POSIX basic regular expressions,
+     ‘sed’ interprets these escape sequences as special characters.  So,
+     ‘x\+’ matches one or more occurrences of ‘x’.  ‘abc\|def’ matches
+     either ‘abc’ or ‘def’.
+     This syntax may cause problems when running scripts written for
+     other ‘sed’s.  Some ‘sed’ programs have been written with the
+     assumption that ‘\|’ and ‘\+’ match the literal characters ‘|’ and
+     ‘+’.  Such scripts must be modified by removing the spurious
+     backslashes if they are to be used with modern implementations of
+     ‘sed’, like GNU ‘sed’.
+     On the other hand, some scripts use s|abc\|def||g to remove
+     occurrences of _either_ ‘abc’ or ‘def’.  While this worked until
+     ‘sed’ 4.0.x, newer versions interpret this as removing the string
+     ‘abc|def’.  This is again undefined behavior according to POSIX,
+     and this interpretation is arguably more robust: older ‘sed’s, for
+     example, required that the regex matcher parsed ‘\/’ as ‘/’ in the
+     common case of escaping a slash, which is again undefined behavior;
+     the new behavior avoids this, and this is good because the regex
+     matcher is only partially under our control.
+     In addition, this version of ‘sed’ supports several escape
+     characters (some of which are multi-character) to insert
+     non-printable characters in scripts (‘\a’, ‘\c’, ‘\d’, ‘\o’, ‘\r’,
+     ‘\t’, ‘\v’, ‘\x’).  These can cause similar problems with scripts
+     written for other ‘sed’s.
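The three cases described above can be checked from the shell. This sketch assumes GNU 'sed'; other implementations may treat the escaped forms differently, which is exactly the portability problem being described.

```shell
# An unescaped + in a basic regular expression is an ordinary character.
printf 'a+b\n' | sed 's/a+b/X/'

# GNU sed treats \+ as "one or more of the preceding item".
printf 'aab\n' | sed 's/a\+b/X/'

# With | as the s delimiter, \| is a literal |, not alternation:
# only the string abc|def is removed, a lone abc is left alone.
printf 'abc|def\n' | sed 's|abc\|def||g'
printf 'abc\n' | sed 's|abc\|def||g'
```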
+‘-i’ clobbers read-only files
+     In short, ‘sed -i’ will let you delete the contents of a read-only
+     file, and in general the ‘-i’ option (*note Invocation: Invoking
+     sed.) lets you clobber protected files.  This is not a bug, but
+     rather a consequence of how the Unix file system works.
+     The permissions on a file say what can happen to the data in that
+     file, while the permissions on a directory say what can happen to
+     the list of files in that directory.  ‘sed -i’ will not ever open
+     for writing a file that is already on disk.  Rather, it will work
+     on a temporary file that is finally renamed to the original name:
+     if you rename or delete files, you’re actually modifying the
+     contents of the directory, so the operation depends on the
+     permissions of the directory, not of the file.  For this same
+     reason, ‘sed’ does not let you use ‘-i’ on a writable file in a
+     read-only directory, and will break hard or symbolic links when
+     ‘-i’ is used on such a file.
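The effect is easy to reproduce in a scratch directory. The sketch below assumes GNU 'sed' and 'mktemp'; the function name readonly_demo is hypothetical.

```shell
# readonly_demo: show that sed -i can replace a read-only file as long
# as the containing directory is writable.
readonly_demo() {
    dir=$(mktemp -d) || return 1
    printf 'hello\n' > "$dir/file"
    chmod a-w "$dir/file"                  # the file is now read-only
    sed -i 's/hello/goodbye/' "$dir/file"  # works: a new file is renamed over it
    cat "$dir/file"
    rm -rf "$dir"
}

readonly_demo
```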
+‘0a’ does not work (gives an error)
+     There is no line 0.  0 is a special address that is only used to
+     treat addresses like ‘0,/RE/’ as active when the script starts: if
+     you write ‘1,/abc/d’ and the first line includes the string ‘abc’,
+     then that match would be ignored because address ranges must span
+     at least two lines (barring the end of the file); but what you
+     probably wanted is to delete every line up to the first one
+     including ‘abc’, and this is obtained with ‘0,/abc/d’.
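The difference between the two address forms shows up when the first line already matches. A short sketch, assuming GNU 'sed' (the '0,/RE/' form is a GNU extension):

```shell
# 1,/abc/d: the end of the range is searched for from line 2 onward,
# so the range runs up to the second abc and only y survives.
printf 'abc\nx\nabc\ny\n' | sed '1,/abc/d'

# 0,/abc/d: the regex may match on line 1, so only that line is
# deleted and x, abc, y survive.
printf 'abc\nx\nabc\ny\n' | sed '0,/abc/d'
```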
+‘[a-z]’ is case insensitive
+     You are encountering problems with locales.  POSIX mandates that
+     ‘[a-z]’ uses the current locale’s collation order; in C parlance,
+     that means using ‘strcoll(3)’ instead of ‘strcmp(3)’.  Some locales
+     have a case-insensitive collation order, others don’t.
+     Another problem is that ‘[a-z]’ tries to use collation symbols.
+     This only happens if you are on the GNU system, using GNU libc’s
+     regular expression matcher instead of compiling the one supplied
+     with GNU sed.  In a Danish locale, for example, the regular
+     expression ‘^[a-z]$’ matches the string ‘aa’, because this is a
+     single collating symbol that comes after ‘a’ and before ‘b’; ‘ll’
+     behaves similarly in Spanish locales, or ‘ij’ in Dutch locales.
+     To work around these problems, which may cause bugs in shell
+     scripts, set the ‘LC_COLLATE’ and ‘LC_CTYPE’ environment variables
+     to ‘C’.
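In a script, the locale can be pinned for a single 'sed' invocation. The sketch below uses 'LC_ALL=C', which overrides both 'LC_COLLATE' and 'LC_CTYPE' (and every other locale category); that is a broader setting than the text recommends and is chosen here only for brevity. The function name c_sed is hypothetical.

```shell
# c_sed: run sed in the C locale, where [a-z] covers exactly the
# lowercase ASCII letters.
c_sed() {
    LC_ALL=C sed "$@"
}

# The uppercase B is not in [a-z], so the line is left unchanged.
printf 'B\n' | c_sed 's/[a-z]/x/'
```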
+‘s/.*//’ does not clear pattern space
+     This happens if your input stream includes invalid multibyte
+     sequences.  POSIX mandates that such sequences are _not_ matched by
+     ‘.’, so that ‘s/.*//’ will not clear pattern space as you would
+     expect.  In fact, there is no way to clear sed’s buffers in the
+     middle of the script in most multibyte locales (including UTF-8
+     locales).  For this reason, GNU ‘sed’ provides a ‘z’ command (for
+     ‘zap’) as an extension.
+     To work around these problems, which may cause bugs in shell
+     scripts, set the ‘LC_COLLATE’ and ‘LC_CTYPE’ environment variables
+     to ‘C’.
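A minimal sketch of the 'z' command, assuming a GNU 'sed' recent enough to provide it. Unlike 's/.*//', it empties the pattern space without going through the regex matcher.

```shell
# After z the pattern space is empty, so ^$ matches and the
# placeholder text is substituted in.
printf 'hello\n' | sed 'z; s/^$/(empty)/'
```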
+   ---------- Footnotes ----------
+   (1) which is the actual “bug” that prompted the change in behavior
+File: sed.info,  Node: GNU Free Documentation License,  Next: Concept Index,  Prev: Reporting Bugs,  Up: Top
+Appendix A GNU Free Documentation License
+*****************************************
+                     Version 1.3, 3 November 2008
+     Copyright © 2000, 2001, 2002, 2007, 2008 Free Software Foundation, Inc.
+     <https://fsf.org/>
+     Everyone is permitted to copy and distribute verbatim copies
+     of this license document, but changing it is not allowed.
+  0. PREAMBLE
+     The purpose of this License is to make a manual, textbook, or other
+     functional and useful document “free” in the sense of freedom: to
+     assure everyone the effective freedom to copy and redistribute it,
+     with or without modifying it, either commercially or
+     noncommercially.  Secondarily, this License preserves for the
+     author and publisher a way to get credit for their work, while not
+     being considered responsible for modifications made by others.
+     This License is a kind of “copyleft”, which means that derivative
+     works of the document must themselves be free in the same sense.
+     It complements the GNU General Public License, which is a copyleft
+     license designed for free software.
+     We have designed this License in order to use it for manuals for
+     free software, because free software needs free documentation: a
+     free program should come with manuals providing the same freedoms
+     that the software does.  But this License is not limited to
+     software manuals; it can be used for any textual work, regardless
+     of subject matter or whether it is published as a printed book.  We
+     recommend this License principally for works whose purpose is
+     instruction or reference.
+  1. APPLICABILITY AND DEFINITIONS
+     This License applies to any manual or other work, in any medium,
+     that contains a notice placed by the copyright holder saying it can
+     be distributed under the terms of this License.  Such a notice
+     grants a world-wide, royalty-free license, unlimited in duration,
+     to use that work under the conditions stated herein.  The
+     “Document”, below, refers to any such manual or work.  Any member
+     of the public is a licensee, and is addressed as “you”.  You accept
+     the license if you copy, modify or distribute the work in a way
+     requiring permission under copyright law.
+     A “Modified Version” of the Document means any work containing the
+     Document or a portion of it, either copied verbatim, or with
+     modifications and/or translated into another language.
+     A “Secondary Section” is a named appendix or a front-matter section
+     of the Document that deals exclusively with the relationship of the
+     publishers or authors of the Document to the Document’s overall
+     subject (or to related matters) and contains nothing that could
+     fall directly within that overall subject.  (Thus, if the Document
+     is in part a textbook of mathematics, a Secondary Section may not
+     explain any mathematics.)  The relationship could be a matter of
+     historical connection with the subject or with related matters, or
+     of legal, commercial, philosophical, ethical or political position
+     regarding them.
+     The “Invariant Sections” are certain Secondary Sections whose
+     titles are designated, as being those of Invariant Sections, in the
+     notice that says that the Document is released under this License.
+     If a section does not fit the above definition of Secondary then it
+     is not allowed to be designated as Invariant.  The Document may
+     contain zero Invariant Sections.  If the Document does not identify
+     any Invariant Sections then there are none.
+     The “Cover Texts” are certain short passages of text that are
+     listed, as Front-Cover Texts or Back-Cover Texts, in the notice
+     that says that the Document is released under this License.  A
+     Front-Cover Text may be at most 5 words, and a Back-Cover Text may
+     be at most 25 words.
+     A “Transparent” copy of the Document means a machine-readable copy,
+     represented in a format whose specification is available to the
+     general public, that is suitable for revising the document
+     straightforwardly with generic text editors or (for images composed
+     of pixels) generic paint programs or (for drawings) some widely
+     available drawing editor, and that is suitable for input to text
+     formatters or for automatic translation to a variety of formats
+     suitable for input to text formatters.  A copy made in an otherwise
+     Transparent file format whose markup, or absence of markup, has
+     been arranged to thwart or discourage subsequent modification by
+     readers is not Transparent.  An image format is not Transparent if
+     used for any substantial amount of text.  A copy that is not
+     “Transparent” is called “Opaque”.
+     Examples of suitable formats for Transparent copies include plain
+     ASCII without markup, Texinfo input format, LaTeX input format,
+     SGML or XML using a publicly available DTD, and standard-conforming
+     simple HTML, PostScript or PDF designed for human modification.
+     Examples of transparent image formats include PNG, XCF and JPG.
+     Opaque formats include proprietary formats that can be read and
+     edited only by proprietary word processors, SGML or XML for which
+     the DTD and/or processing tools are not generally available, and
+     the machine-generated HTML, PostScript or PDF produced by some word
+     processors for output purposes only.
+     The “Title Page” means, for a printed book, the title page itself,
+     plus such following pages as are needed to hold, legibly, the
+     material this License requires to appear in the title page.  For
+     works in formats which do not have any title page as such, “Title
+     Page” means the text near the most prominent appearance of the
+     work’s title, preceding the beginning of the body of the text.
+     The “publisher” means any person or entity that distributes copies
+     of the Document to the public.
+     A section “Entitled XYZ” means a named subunit of the Document
+     whose title either is precisely XYZ or contains XYZ in parentheses
+     following text that translates XYZ in another language.  (Here XYZ
+     stands for a specific section name mentioned below, such as
+     “Acknowledgements”, “Dedications”, “Endorsements”, or “History”.)
+     To “Preserve the Title” of such a section when you modify the
+     Document means that it remains a section “Entitled XYZ” according
+     to this definition.
+     The Document may include Warranty Disclaimers next to the notice
+     which states that this License applies to the Document.  These
+     Warranty Disclaimers are considered to be included by reference in
+     this License, but only as regards disclaiming warranties: any other
+     implication that these Warranty Disclaimers may have is void and
+     has no effect on the meaning of this License.
+  2. VERBATIM COPYING
+     You may copy and distribute the Document in any medium, either
+     commercially or noncommercially, provided that this License, the
+     copyright notices, and the license notice saying this License
+     applies to the Document are reproduced in all copies, and that you
+     add no other conditions whatsoever to those of this License.  You
+     may not use technical measures to obstruct or control the reading
+     or further copying of the copies you make or distribute.  However,
+     you may accept compensation in exchange for copies.  If you
+     distribute a large enough number of copies you must also follow the
+     conditions in section 3.
+     You may also lend copies, under the same conditions stated above,
+     and you may publicly display copies.
+  3. COPYING IN QUANTITY
+     If you publish printed copies (or copies in media that commonly
+     have printed covers) of the Document, numbering more than 100, and
+     the Document’s license notice requires Cover Texts, you must
+     enclose the copies in covers that carry, clearly and legibly, all
+     these Cover Texts: Front-Cover Texts on the front cover, and
+     Back-Cover Texts on the back cover.  Both covers must also clearly
+     and legibly identify you as the publisher of these copies.  The
+     front cover must present the full title with all words of the title
+     equally prominent and visible.  You may add other material on the
+     covers in addition.  Copying with changes limited to the covers, as
+     long as they preserve the title of the Document and satisfy these
+     conditions, can be treated as verbatim copying in other respects.
+     If the required texts for either cover are too voluminous to fit
+     legibly, you should put the first ones listed (as many as fit
+     reasonably) on the actual cover, and continue the rest onto
+     adjacent pages.
+     If you publish or distribute Opaque copies of the Document
+     numbering more than 100, you must either include a machine-readable
+     Transparent copy along with each Opaque copy, or state in or with
+     each Opaque copy a computer-network location from which the general
+     network-using public has access to download using public-standard
+     network protocols a complete Transparent copy of the Document, free
+     of added material.  If you use the latter option, you must take
+     reasonably prudent steps, when you begin distribution of Opaque
+     copies in quantity, to ensure that this Transparent copy will
+     remain thus accessible at the stated location until at least one
+     year after the last time you distribute an Opaque copy (directly or
+     through your agents or retailers) of that edition to the public.
+     It is requested, but not required, that you contact the authors of
+     the Document well before redistributing any large number of copies,
+     to give them a chance to provide you with an updated version of the
+     Document.
+  4. MODIFICATIONS
+     You may copy and distribute a Modified Version of the Document
+     under the conditions of sections 2 and 3 above, provided that you
+     release the Modified Version under precisely this License, with the
+     Modified Version filling the role of the Document, thus licensing
+     distribution and modification of the Modified Version to whoever
+     possesses a copy of it.  In addition, you must do these things in
+     the Modified Version:
+       A. Use in the Title Page (and on the covers, if any) a title
+          distinct from that of the Document, and from those of previous
+          versions (which should, if there were any, be listed in the
+          History section of the Document).  You may use the same title
+          as a previous version if the original publisher of that
+          version gives permission.
+       B. List on the Title Page, as authors, one or more persons or
+          entities responsible for authorship of the modifications in
+          the Modified Version, together with at least five of the
+          principal authors of the Document (all of its principal
+          authors, if it has fewer than five), unless they release you
+          from this requirement.
+       C. State on the Title page the name of the publisher of the
+          Modified Version, as the publisher.
+       D. Preserve all the copyright notices of the Document.
+       E. Add an appropriate copyright notice for your modifications
+          adjacent to the other copyright notices.
+       F. Include, immediately after the copyright notices, a license
+          notice giving the public permission to use the Modified
+          Version under the terms of this License, in the form shown in
+          the Addendum below.
+       G. Preserve in that license notice the full lists of Invariant
+          Sections and required Cover Texts given in the Document’s
+          license notice.
+       H. Include an unaltered copy of this License.
+       I. Preserve the section Entitled “History”, Preserve its Title,
+          and add to it an item stating at least the title, year, new
+          authors, and publisher of the Modified Version as given on the
+          Title Page.  If there is no section Entitled “History” in the
+          Document, create one stating the title, year, authors, and
+          publisher of the Document as given on its Title Page, then add
+          an item describing the Modified Version as stated in the
+          previous sentence.
+       J. Preserve the network location, if any, given in the Document
+          for public access to a Transparent copy of the Document, and
+          likewise the network locations given in the Document for
+          previous versions it was based on.  These may be placed in the
+          “History” section.  You may omit a network location for a work
+          that was published at least four years before the Document
+          itself, or if the original publisher of the version it refers
+          to gives permission.
+       K. For any section Entitled “Acknowledgements” or “Dedications”,
+          Preserve the Title of the section, and preserve in the section
+          all the substance and tone of each of the contributor
+          acknowledgements and/or dedications given therein.
+       L. Preserve all the Invariant Sections of the Document, unaltered
+          in their text and in their titles.  Section numbers or the
+          equivalent are not considered part of the section titles.
+       M. Delete any section Entitled “Endorsements”.  Such a section
+          may not be included in the Modified Version.
+       N. Do not retitle any existing section to be Entitled
+          “Endorsements” or to conflict in title with any Invariant
+          Section.
+       O. Preserve any Warranty Disclaimers.
+     If the Modified Version includes new front-matter sections or
+     appendices that qualify as Secondary Sections and contain no
+     material copied from the Document, you may at your option designate
+     some or all of these sections as invariant.  To do this, add their
+     titles to the list of Invariant Sections in the Modified Version’s
+     license notice.  These titles must be distinct from any other
+     section titles.
+     You may add a section Entitled “Endorsements”, provided it contains
+     nothing but endorsements of your Modified Version by various
+     parties--for example, statements of peer review or that the text has
+     been approved by an organization as the authoritative definition of
+     a standard.
+     You may add a passage of up to five words as a Front-Cover Text,
+     and a passage of up to 25 words as a Back-Cover Text, to the end of
+     the list of Cover Texts in the Modified Version.  Only one passage
+     of Front-Cover Text and one of Back-Cover Text may be added by (or
+     through arrangements made by) any one entity.  If the Document
+     already includes a cover text for the same cover, previously added
+     by you or by arrangement made by the same entity you are acting on
+     behalf of, you may not add another; but you may replace the old
+     one, on explicit permission from the previous publisher that added
+     the old one.
+     The author(s) and publisher(s) of the Document do not by this
+     License give permission to use their names for publicity for or to
+     assert or imply endorsement of any Modified Version.
+  5. COMBINING DOCUMENTS
+     You may combine the Document with other documents released under
+     this License, under the terms defined in section 4 above for
+     modified versions, provided that you include in the combination all
+     of the Invariant Sections of all of the original documents,
+     unmodified, and list them all as Invariant Sections of your
+     combined work in its license notice, and that you preserve all
+     their Warranty Disclaimers.
+     The combined work need only contain one copy of this License, and
+     multiple identical Invariant Sections may be replaced with a single
+     copy.  If there are multiple Invariant Sections with the same name
+     but different contents, make the title of each such section unique
+     by adding at the end of it, in parentheses, the name of the
+     original author or publisher of that section if known, or else a
+     unique number.  Make the same adjustment to the section titles in
+     the list of Invariant Sections in the license notice of the
+     combined work.
+     In the combination, you must combine any sections Entitled
+     “History” in the various original documents, forming one section
+     Entitled “History”; likewise combine any sections Entitled
+     “Acknowledgements”, and any sections Entitled “Dedications”.  You
+     must delete all sections Entitled “Endorsements.”
+  6. COLLECTIONS OF DOCUMENTS
+     You may make a collection consisting of the Document and other
+     documents released under this License, and replace the individual
+     copies of this License in the various documents with a single copy
+     that is included in the collection, provided that you follow the
+     rules of this License for verbatim copying of each of the documents
+     in all other respects.
+     You may extract a single document from such a collection, and
+     distribute it individually under this License, provided you insert
+     a copy of this License into the extracted document, and follow this
+     License in all other respects regarding verbatim copying of that
+     document.
+  7. AGGREGATION WITH INDEPENDENT WORKS
+     A compilation of the Document or its derivatives with other
+     separate and independent documents or works, in or on a volume of a
+     storage or distribution medium, is called an “aggregate” if the
+     copyright resulting from the compilation is not used to limit the
+     legal rights of the compilation’s users beyond what the individual
+     works permit.  When the Document is included in an aggregate, this
+     License does not apply to the other works in the aggregate which
+     are not themselves derivative works of the Document.
+     If the Cover Text requirement of section 3 is applicable to these
+     copies of the Document, then if the Document is less than one half
+     of the entire aggregate, the Document’s Cover Texts may be placed
+     on covers that bracket the Document within the aggregate, or the
+     electronic equivalent of covers if the Document is in electronic
+     form.  Otherwise they must appear on printed covers that bracket
+     the whole aggregate.
+  8. TRANSLATION
+     Translation is considered a kind of modification, so you may
+     distribute translations of the Document under the terms of section
+     4.  Replacing Invariant Sections with translations requires special
+     permission from their copyright holders, but you may include
+     translations of some or all Invariant Sections in addition to the
+     original versions of these Invariant Sections.  You may include a
+     translation of this License, and all the license notices in the
+     Document, and any Warranty Disclaimers, provided that you also
+     include the original English version of this License and the
+     original versions of those notices and disclaimers.  In case of a
+     disagreement between the translation and the original version of
+     this License or a notice or disclaimer, the original version will
+     prevail.
+     If a section in the Document is Entitled “Acknowledgements”,
+     “Dedications”, or “History”, the requirement (section 4) to
+     Preserve its Title (section 1) will typically require changing the
+     actual title.
+  9. TERMINATION
+     You may not copy, modify, sublicense, or distribute the Document
+     except as expressly provided under this License.  Any attempt
+     otherwise to copy, modify, sublicense, or distribute it is void,
+     and will automatically terminate your rights under this License.
+     However, if you cease all violation of this License, then your
+     license from a particular copyright holder is reinstated (a)
+     provisionally, unless and until the copyright holder explicitly and
+     finally terminates your license, and (b) permanently, if the
+     copyright holder fails to notify you of the violation by some
+     reasonable means prior to 60 days after the cessation.
+     Moreover, your license from a particular copyright holder is
+     reinstated permanently if the copyright holder notifies you of the
+     violation by some reasonable means, this is the first time you have
+     received notice of violation of this License (for any work) from
+     that copyright holder, and you cure the violation prior to 30 days
+     after your receipt of the notice.
+     Termination of your rights under this section does not terminate
+     the licenses of parties who have received copies or rights from you
+     under this License.  If your rights have been terminated and not
+     permanently reinstated, receipt of a copy of some or all of the
+     same material does not give you any rights to use it.
+ 10. FUTURE REVISIONS OF THIS LICENSE
+     The Free Software Foundation may publish new, revised versions of
+     the GNU Free Documentation License from time to time.  Such new
+     versions will be similar in spirit to the present version, but may
+     differ in detail to address new problems or concerns.  See
+     <https://www.gnu.org/licenses/>.
+     Each version of the License is given a distinguishing version
+     number.  If the Document specifies that a particular numbered
+     version of this License “or any later version” applies to it, you
+     have the option of following the terms and conditions either of
+     that specified version or of any later version that has been
+     published (not as a draft) by the Free Software Foundation.  If the
+     Document does not specify a version number of this License, you may
+     choose any version ever published (not as a draft) by the Free
+     Software Foundation.  If the Document specifies that a proxy can
+     decide which future versions of this License can be used, that
+     proxy’s public statement of acceptance of a version permanently
+     authorizes you to choose that version for the Document.
+ 11. RELICENSING
+     “Massive Multiauthor Collaboration Site” (or “MMC Site”) means any
+     World Wide Web server that publishes copyrightable works and also
+     provides prominent facilities for anybody to edit those works.  A
+     public wiki that anybody can edit is an example of such a server.
+     A “Massive Multiauthor Collaboration” (or “MMC”) contained in the
+     site means any set of copyrightable works thus published on the MMC
+     site.
+     “CC-BY-SA” means the Creative Commons Attribution-Share Alike 3.0
+     license published by Creative Commons Corporation, a not-for-profit
+     corporation with a principal place of business in San Francisco,
+     California, as well as future copyleft versions of that license
+     published by that same organization.
+     “Incorporate” means to publish or republish a Document, in whole or
+     in part, as part of another Document.
+     An MMC is “eligible for relicensing” if it is licensed under this
+     License, and if all works that were first published under this
+     License somewhere other than this MMC, and subsequently
+     incorporated in whole or in part into the MMC, (1) had no cover
+     texts or invariant sections, and (2) were thus incorporated prior
+     to November 1, 2008.
+     The operator of an MMC Site may republish an MMC contained in the
+     site under CC-BY-SA on the same site at any time before August 1,
+     2009, provided the MMC is eligible for relicensing.
+ADDENDUM: How to use this License for your documents
+====================================================
+To use this License in a document you have written, include a copy of
+the License in the document and put the following copyright and license
+notices just after the title page:
+       Copyright (C)  YEAR  YOUR NAME.
+       Permission is granted to copy, distribute and/or modify this document
+       under the terms of the GNU Free Documentation License, Version 1.3
+       or any later version published by the Free Software Foundation;
+       with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
+       Texts.  A copy of the license is included in the section entitled ``GNU
+       Free Documentation License''.
+   If you have Invariant Sections, Front-Cover Texts and Back-Cover
+Texts, replace the “with...Texts.” line with this:
+         with the Invariant Sections being LIST THEIR TITLES, with
+         the Front-Cover Texts being LIST, and with the Back-Cover Texts
+         being LIST.
+   If you have Invariant Sections without Cover Texts, or some other
+combination of the three, merge those two alternatives to suit the
+situation.
+   If your document contains nontrivial examples of program code, we
+recommend releasing these examples in parallel under your choice of free
+software license, such as the GNU General Public License, to permit
+their use in free software.
+File: sed.info,  Node: Concept Index,  Next: Command and Option Index,  Prev: GNU Free Documentation License,  Up: Top
+Concept Index
+*************
+This is a general index of all issues discussed in this manual, with the
+exception of the ‘sed’ commands and command-line options.
+[index]
+* Menu:
+* -e, example:                           Overview.            (line  46)
+* -e, example <1>:                       sed script overview. (line  37)
+* --expression, example:                 Overview.            (line  46)
+* -f, example:                           Overview.            (line  46)
+* -f, example <1>:                       sed script overview. (line  37)
+* --file, example:                       Overview.            (line  46)
+* -i, example:                           Overview.            (line  26)
+* -n, example:                           Overview.            (line  33)
+* -s, example:                           Overview.            (line  40)
+* 0 address:                             Reporting Bugs.      (line 114)
+* ;, command separator:                  sed script overview. (line  37)
+* a, and semicolons:                     sed script overview. (line  56)
+* Additional reading about sed:          Other Resources.     (line  13)
+* ADDR1,+N:                              Range Addresses.     (line  31)
+* ADDR1,~N:                              Range Addresses.     (line  31)
+* address range, example:                sed script overview. (line  23)
+* Address, as a regular expression:      Regexp Addresses.    (line  13)
+* Address, last line:                    Numeric Addresses.   (line  13)
+* Address, numeric:                      Numeric Addresses.   (line   8)
+* addresses, excluding:                  Addresses overview.  (line  33)
+* Addresses, in sed scripts:             Numeric Addresses.   (line   6)
+* addresses, negating:                   Addresses overview.  (line  33)
+* addresses, numeric:                    Addresses overview.  (line   6)
+* addresses, range:                      Addresses overview.  (line  26)
+* addresses, regular expression:         Addresses overview.  (line  20)
+* addresses, syntax:                     sed script overview. (line  13)
+* alphabetic characters:                 Character Classes and Bracket Expressions.
+                                                              (line  49)
+* alphanumeric characters:               Character Classes and Bracket Expressions.
+                                                              (line  44)
+* Append hold space to pattern space:    Other Commands.      (line 288)
+* Append next input line to pattern space: Other Commands.    (line 261)
+* Append pattern space to hold space:    Other Commands.      (line 280)
+* Appending text after a line:           Other Commands.      (line  45)
+* b, joining lines with:                 Branching and flow control.
+                                                              (line 150)
+* b, versus t:                           Branching and flow control.
+                                                              (line 150)
+* back-reference:                        Back-references and Subexpressions.
+                                                              (line   6)
+* Backreferences, in regular expressions: The "s" Command.    (line  18)
+* blank characters:                      Character Classes and Bracket Expressions.
+                                                              (line  54)
+* bracket expression:                    Character Classes and Bracket Expressions.
+                                                              (line   6)
+* Branch to a label, if s/// failed:     Extended Commands.   (line  63)
+* Branch to a label, if s/// succeeded:  Programming Commands.
+                                                              (line  22)
+* Branch to a label, unconditionally:    Programming Commands.
+                                                              (line  18)
+* branching and n, N:                    Branching and flow control.
+                                                              (line 105)
+* branching, infinite loop:              Branching and flow control.
+                                                              (line  95)
+* branching, joining lines:              Branching and flow control.
+                                                              (line 150)
+* Buffer spaces, pattern and hold:       Execution Cycle.     (line   6)
+* Bugs, reporting:                       Reporting Bugs.      (line   6)
+* c, and semicolons:                     sed script overview. (line  56)
+* case insensitive, regular expression:  Regexp Addresses.    (line  47)
+* Case-insensitive matching:             The "s" Command.     (line 117)
+* Caveat -- #n on first line:            Common Commands.     (line  20)
+* character class:                       Character Classes and Bracket Expressions.
+                                                              (line   6)
+* character classes:                     Character Classes and Bracket Expressions.
+                                                              (line  43)
+* classes of characters:                 Character Classes and Bracket Expressions.
+                                                              (line  43)
+* Command groups:                        Common Commands.     (line  91)
+* Comments, in scripts:                  Common Commands.     (line  12)
+* Conditional branch:                    Programming Commands.
+                                                              (line  22)
+* Conditional branch <1>:                Extended Commands.   (line  63)
+* control characters:                    Character Classes and Bracket Expressions.
+                                                              (line  57)
+* Copy hold space into pattern space:    Other Commands.      (line 284)
+* Copy pattern space into hold space:    Other Commands.      (line 276)
+* cycle, restarting:                     Branching and flow control.
+                                                              (line  75)
+* d, example:                            sed script overview. (line  23)
+* Delete first line from pattern space:  Other Commands.      (line 255)
+* digit characters:                      Character Classes and Bracket Expressions.
+                                                              (line  62)
+* Disabling autoprint, from command line: Command-Line Options.
+                                                              (line  23)
+* empty regular expression:              Regexp Addresses.    (line  22)
+* Emptying pattern space:                Extended Commands.   (line  85)
+* Emptying pattern space <1>:            Reporting Bugs.      (line 143)
+* Evaluate Bourne-shell commands:        Extended Commands.   (line  12)
+* Evaluate Bourne-shell commands, after substitution: The "s" Command.
+                                                              (line 108)
+* example, address range:                sed script overview. (line  23)
+* example, regular expression:           sed script overview. (line  28)
+* Exchange hold space with pattern space: Other Commands.     (line 292)
+* Excluding lines:                       Addresses overview.  (line  33)
+* exit status:                           Exit status.         (line   6)
+* exit status, example:                  Exit status.         (line  25)
+* Extended regular expressions, choosing: Command-Line Options.
+                                                              (line 135)
+* Extended regular expressions, syntax:  ERE syntax.          (line   6)
+* File name, printing:                   Extended Commands.   (line  30)
+* Files to be processed as input:        Command-Line Options.
+                                                              (line 181)
+* Flow of control in scripts:            Programming Commands.
+                                                              (line  11)
+* Global substitution:                   The "s" Command.     (line  74)
+* GNU extensions, /dev/stderr file:      The "s" Command.     (line 101)
+* GNU extensions, /dev/stderr file <1>:  Other Commands.      (line 244)
+* GNU extensions, /dev/stdin file:       Other Commands.      (line 227)
+* GNU extensions, /dev/stdin file <1>:   Extended Commands.   (line  53)
+* GNU extensions, /dev/stdout file:      Command-Line Options.
+                                                              (line 189)
+* GNU extensions, /dev/stdout file <1>:  The "s" Command.     (line 101)
+* GNU extensions, /dev/stdout file <2>:  Other Commands.      (line 244)
+* GNU extensions, 0 address:             Range Addresses.     (line  31)
+* GNU extensions, 0 address <1>:         Reporting Bugs.      (line 114)
+* GNU extensions, 0,ADDR2 addressing:    Range Addresses.     (line  31)
+* GNU extensions, ADDR1,+N addressing:   Range Addresses.     (line  31)
+* GNU extensions, ADDR1,~N addressing:   Range Addresses.     (line  31)
+* GNU extensions, branch if s/// failed: Extended Commands.   (line  63)
+* GNU extensions, case modifiers in s commands: The "s" Command.
+                                                              (line  29)
+* GNU extensions, checking for their presence: Extended Commands.
+                                                              (line  69)
+* GNU extensions, debug:                 Command-Line Options.
+                                                              (line  29)
+* GNU extensions, disabling:             Command-Line Options.
+                                                              (line 102)
+* GNU extensions, emptying pattern space: Extended Commands.  (line  85)
+* GNU extensions, emptying pattern space <1>: Reporting Bugs. (line 143)
+* GNU extensions, evaluating Bourne-shell commands: The "s" Command.
+                                                              (line 108)
+* GNU extensions, evaluating Bourne-shell commands <1>: Extended Commands.
+                                                              (line  12)
+* GNU extensions, extended regular expressions: Command-Line Options.
+                                                              (line 135)
+* GNU extensions, g and NUMBER modifier: The "s" Command.     (line  80)
+* GNU extensions, I modifier:            The "s" Command.     (line 117)
+* GNU extensions, I modifier <1>:        Regexp Addresses.    (line  47)
+* GNU extensions, in-place editing:      Command-Line Options.
+                                                              (line  56)
+* GNU extensions, in-place editing <1>:  Reporting Bugs.      (line  95)
+* GNU extensions, M modifier:            The "s" Command.     (line 122)
+* GNU extensions, M modifier <1>:        Regexp Addresses.    (line  75)
+* GNU extensions, modifiers and the empty regular expression: Regexp Addresses.
+                                                              (line  22)
+* GNU extensions, N~M addresses:         Numeric Addresses.   (line  18)
+* GNU extensions, quitting silently:     Extended Commands.   (line  36)
+* GNU extensions, R command:             Extended Commands.   (line  53)
+* GNU extensions, reading a file a line at a time: Extended Commands.
+                                                              (line  53)
+* GNU extensions, returning an exit code: Common Commands.    (line  28)
+* GNU extensions, returning an exit code <1>: Extended Commands.
+                                                              (line  36)
+* GNU extensions, setting line length:   Other Commands.      (line 207)
+* GNU extensions, special escapes:       Escapes.             (line   6)
+* GNU extensions, special escapes <1>:   Reporting Bugs.      (line  88)
+* GNU extensions, special two-address forms: Range Addresses. (line  31)
+* GNU extensions, subprocesses:          The "s" Command.     (line 108)
+* GNU extensions, subprocesses <1>:      Extended Commands.   (line  12)
+* GNU extensions, to basic regular expressions: BRE syntax.   (line  13)
+* GNU extensions, to basic regular expressions <1>: BRE syntax.
+                                                              (line  59)
+* GNU extensions, to basic regular expressions <2>: BRE syntax.
+                                                              (line  62)
+* GNU extensions, to basic regular expressions <3>: BRE syntax.
+                                                              (line  77)
+* GNU extensions, to basic regular expressions <4>: BRE syntax.
+                                                              (line  87)
+* GNU extensions, to basic regular expressions <5>: Reporting Bugs.
+                                                              (line  61)
+* GNU extensions, two addresses supported by most commands: Other Commands.
+                                                              (line  61)
+* GNU extensions, two addresses supported by most commands <1>: Other Commands.
+                                                              (line 115)
+* GNU extensions, two addresses supported by most commands <2>: Other Commands.
+                                                              (line 204)
+* GNU extensions, two addresses supported by most commands <3>: Other Commands.
+                                                              (line 236)
+* GNU extensions, unlimited line length: Limitations.         (line   6)
+* GNU extensions, writing first line to a file: Extended Commands.
+                                                              (line  80)
+* Goto, in scripts:                      Programming Commands.
+                                                              (line  18)
+* graphic characters:                    Character Classes and Bracket Expressions.
+                                                              (line  65)
+* Greedy regular expression matching:    BRE syntax.          (line 113)
+* Grouping commands:                     Common Commands.     (line  91)
+* hexadecimal digits:                    Character Classes and Bracket Expressions.
+                                                              (line  88)
+* Hold space, appending from pattern space: Other Commands.   (line 280)
+* Hold space, appending to pattern space: Other Commands.     (line 288)
+* Hold space, copy into pattern space:   Other Commands.      (line 284)
+* Hold space, copying pattern space into: Other Commands.     (line 276)
+* Hold space, definition:                Execution Cycle.     (line   6)
+* Hold space, exchange with pattern space: Other Commands.    (line 292)
+* i, and semicolons:                     sed script overview. (line  56)
+* In-place editing:                      Reporting Bugs.      (line  95)
+* In-place editing, activating:          Command-Line Options.
+                                                              (line  56)
+* In-place editing, Perl-style backup file names: Command-Line Options.
+                                                              (line  67)
+* infinite loop, branching:              Branching and flow control.
+                                                              (line  95)
+* Inserting text before a line:          Other Commands.      (line 104)
+* joining lines with branching:          Branching and flow control.
+                                                              (line 150)
+* joining quoted-printable lines:        Branching and flow control.
+                                                              (line 150)
+* labels:                                Branching and flow control.
+                                                              (line  75)
+* Labels, in scripts:                    Programming Commands.
+                                                              (line  14)
+* Last line, selecting:                  Numeric Addresses.   (line  13)
+* Line length, setting:                  Command-Line Options.
+                                                              (line  97)
+* Line length, setting <1>:              Other Commands.      (line 207)
+* Line number, printing:                 Other Commands.      (line 194)
+* Line selection:                        Numeric Addresses.   (line   6)
+* Line, selecting by number:             Numeric Addresses.   (line   8)
+* Line, selecting by regular expression match: Regexp Addresses.
+                                                              (line  13)
+* Line, selecting last:                  Numeric Addresses.   (line  13)
+* List pattern space:                    Other Commands.      (line 207)
+* lower-case letters:                    Character Classes and Bracket Expressions.
+                                                              (line  68)
+* Mixing g and NUMBER modifiers in the s command: The "s" Command.
+                                                              (line  80)
+* multiple files:                        Overview.            (line  40)
+* multiple sed commands:                 sed script overview. (line  37)
+* n, and branching:                      Branching and flow control.
+                                                              (line 105)
+* N, and branching:                      Branching and flow control.
+                                                              (line 105)
+* named character classes:               Character Classes and Bracket Expressions.
+                                                              (line  43)
+* newline, command separator:            sed script overview. (line  37)
+* Next input line, append to pattern space: Other Commands.   (line 261)
+* Next input line, replace pattern space with: Common Commands.
+                                                              (line  61)
+* Non-bugs, 0 address:                   Reporting Bugs.      (line 114)
+* Non-bugs, in-place editing:            Reporting Bugs.      (line  95)
+* Non-bugs, localization-related:        Reporting Bugs.      (line 124)
+* Non-bugs, localization-related <1>:    Reporting Bugs.      (line 143)
+* Non-bugs, N command on the last line:  Reporting Bugs.      (line  30)
+* Non-bugs, regex syntax clashes:        Reporting Bugs.      (line  61)
+* numeric addresses:                     Addresses overview.  (line   6)
+* numeric characters:                    Character Classes and Bracket Expressions.
+                                                              (line  62)
+* omitting labels:                       Branching and flow control.
+                                                              (line  75)
+* output:                                Overview.            (line  26)
+* output, suppressing:                   Overview.            (line  33)
+* p, example:                            Overview.            (line  33)
+* paragraphs, processing:                Multiline techniques.
+                                                              (line  53)
+* parameters, script:                    Overview.            (line  46)
+* Parenthesized substrings:              The "s" Command.     (line  18)
+* Pattern space, definition:             Execution Cycle.     (line   6)
+* Portability, comments:                 Common Commands.     (line  15)
+* Portability, line length limitations:  Limitations.         (line   6)
+* Portability, N command on the last line: Reporting Bugs.    (line  30)
+* POSIXLY_CORRECT behavior, bracket expressions: Character Classes and Bracket Expressions.
+                                                              (line 112)
+* POSIXLY_CORRECT behavior, enabling:    Command-Line Options.
+                                                              (line 105)
+* POSIXLY_CORRECT behavior, escapes:     Escapes.             (line  11)
+* POSIXLY_CORRECT behavior, N command:   Reporting Bugs.      (line  56)
+* Print first line from pattern space:   Other Commands.      (line 273)
+* printable characters:                  Character Classes and Bracket Expressions.
+                                                              (line  72)
+* Printing file name:                    Extended Commands.   (line  30)
+* Printing line number:                  Other Commands.      (line 194)
+* Printing text unambiguously:           Other Commands.      (line 207)
+* processing paragraphs:                 Multiline techniques.
+                                                              (line  53)
+* punctuation characters:                Character Classes and Bracket Expressions.
+                                                              (line  75)
+* Q, example:                            Exit status.         (line  25)
+* q, example:                            sed script overview. (line  28)
+* Quitting:                              Common Commands.     (line  28)
+* Quitting <1>:                          Extended Commands.   (line  36)
+* quoted-printable lines, joining:       Branching and flow control.
+                                                              (line 150)
+* range addresses:                       Addresses overview.  (line  26)
+* range expression:                      Character Classes and Bracket Expressions.
+                                                              (line  18)
+* Range of lines:                        Range Addresses.     (line   6)
+* Range with start address of zero:      Range Addresses.     (line  31)
+* Read next input line:                  Common Commands.     (line  61)
+* Read text from a file:                 Other Commands.      (line 219)
+* Read text from a file <1>:             Extended Commands.   (line  53)
+* regex addresses and input lines:       Regexp Addresses.    (line  84)
+* regex addresses and pattern space:     Regexp Addresses.    (line  84)
+* regular expression addresses:          Addresses overview.  (line  20)
+* regular expression, example:           sed script overview. (line  28)
+* Replace hold space with copy of pattern space: Other Commands.
+                                                              (line 276)
+* Replace pattern space with copy of hold space: Other Commands.
+                                                              (line 284)
+* Replacing all text matching regexp in a line: The "s" Command.
+                                                              (line  74)
+* Replacing only Nth match of regexp in a line: The "s" Command.
+                                                              (line  78)
+* Replacing selected lines with other text: Other Commands.   (line 157)
+* Requiring GNU sed:                     Extended Commands.   (line  69)
+* restarting a cycle:                    Branching and flow control.
+                                                              (line  75)
+* Sandbox mode:                          Command-Line Options.
+                                                              (line 157)
+* script parameter:                      Overview.            (line  46)
+* Script structure:                      sed script overview. (line   6)
+* Script, from a file:                   Command-Line Options.
+                                                              (line  51)
+* Script, from command line:             Command-Line Options.
+                                                              (line  46)
+* sed commands syntax:                   sed script overview. (line  13)
+* sed commands, multiple:                sed script overview. (line  37)
+* sed script structure:                  sed script overview. (line   6)
+* Selecting lines to process:            Numeric Addresses.   (line   6)
+* Selecting non-matching lines:          Addresses overview.  (line  33)
+* semicolons, command separator:         sed script overview. (line  37)
+* Several lines, selecting:              Range Addresses.     (line   6)
+* Slash character, in regular expressions: Regexp Addresses.  (line  32)
+* space characters:                      Character Classes and Bracket Expressions.
+                                                              (line  80)
+* Spaces, pattern and hold:              Execution Cycle.     (line   6)
+* Special addressing forms:              Range Addresses.     (line  31)
+* standard input:                        Overview.            (line  18)
+* Standard input, processing as input:   Command-Line Options.
+                                                              (line 183)
+* standard output:                       Overview.            (line  26)
+* stdin:                                 Overview.            (line  18)
+* stdout:                                Overview.            (line  26)
+* Stream editor:                         Introduction.        (line   6)
+* subexpression:                         Back-references and Subexpressions.
+                                                              (line   6)
+* Subprocesses:                          The "s" Command.     (line 108)
+* Subprocesses <1>:                      Extended Commands.   (line  12)
+* Substitution of text, options:         The "s" Command.     (line  70)
+* suppressing output:                    Overview.            (line  33)
+* syntax, addresses:                     sed script overview. (line  13)
+* syntax, sed commands:                  sed script overview. (line  13)
+* t, joining lines with:                 Branching and flow control.
+                                                              (line 150)
+* t, versus b:                           Branching and flow control.
+                                                              (line 150)
+* Text, appending:                       Other Commands.      (line  45)
+* Text, deleting:                        Common Commands.     (line  44)
+* Text, insertion:                       Other Commands.      (line 104)
+* Text, printing:                        Common Commands.     (line  52)
+* Text, printing after substitution:     The "s" Command.     (line  88)
+* Text, writing to a file after substitution: The "s" Command.
+                                                              (line 101)
+* Transliteration:                       Other Commands.      (line  11)
+* Unbuffered I/O, choosing:              Command-Line Options.
+                                                              (line 164)
+* upper-case letters:                    Character Classes and Bracket Expressions.
+                                                              (line  84)
+* Usage summary, printing:               Command-Line Options.
+                                                              (line  17)
+* Version, printing:                     Command-Line Options.
+                                                              (line  13)
+* whitespace characters:                 Character Classes and Bracket Expressions.
+                                                              (line  80)
+* Working on separate files:             Command-Line Options.
+                                                              (line 148)
+* Write first line to a file:            Extended Commands.   (line  80)
+* Write to a file:                       Other Commands.      (line 244)
+* xdigit class:                          Character Classes and Bracket Expressions.
+                                                              (line  88)
+* Zero Address:                          Zero Address.        (line   6)
+* Zero, as range start address:          Range Addresses.     (line  31)
+File: sed.info,  Node: Command and Option Index,  Prev: Concept Index,  Up: Top
+Command and Option Index
+************************
+This is an alphabetical list of all ‘sed’ commands and command-line
+options.
+[index]
+* Menu:
+* # (comments):                          Common Commands.     (line  12)
+* --binary:                              Command-Line Options.
+                                                              (line 114)
+* --debug:                               Command-Line Options.
+                                                              (line  29)
+* --expression:                          Command-Line Options.
+                                                              (line  46)
+* --file:                                Command-Line Options.
+                                                              (line  51)
+* --follow-symlinks:                     Command-Line Options.
+                                                              (line 125)
+* --help:                                Command-Line Options.
+                                                              (line  17)
+* --in-place:                            Command-Line Options.
+                                                              (line  56)
+* --line-length:                         Command-Line Options.
+                                                              (line  97)
+* --null-data:                           Command-Line Options.
+                                                              (line 172)
+* --posix:                               Command-Line Options.
+                                                              (line 102)
+* --quiet:                               Command-Line Options.
+                                                              (line  23)
+* --regexp-extended:                     Command-Line Options.
+                                                              (line 135)
+* --sandbox:                             Command-Line Options.
+                                                              (line 157)
+* --separate:                            Command-Line Options.
+                                                              (line 148)
+* --silent:                              Command-Line Options.
+                                                              (line  23)
+* --unbuffered:                          Command-Line Options.
+                                                              (line 164)
+* --version:                             Command-Line Options.
+                                                              (line  13)
+* --zero-terminated:                     Command-Line Options.
+                                                              (line 172)
+* -b:                                    Command-Line Options.
+                                                              (line 114)
+* -e:                                    Command-Line Options.
+                                                              (line  46)
+* -E:                                    Command-Line Options.
+                                                              (line 135)
+* -f:                                    Command-Line Options.
+                                                              (line  51)
+* -i:                                    Command-Line Options.
+                                                              (line  56)
+* -l:                                    Command-Line Options.
+                                                              (line  97)
+* -n:                                    Command-Line Options.
+                                                              (line  23)
+* -n, forcing from within a script:      Common Commands.     (line  20)
+* -r:                                    Command-Line Options.
+                                                              (line 135)
+* -s:                                    Command-Line Options.
+                                                              (line 148)
+* -u:                                    Command-Line Options.
+                                                              (line 164)
+* -z:                                    Command-Line Options.
+                                                              (line 172)
+* : (label) command:                     Programming Commands.
+                                                              (line  14)
+* = (print line number) command:         Other Commands.      (line 194)
+* {} command grouping:                   Common Commands.     (line  91)
+* a (append text lines) command:         Other Commands.      (line  45)
+* alnum character class:                 Character Classes and Bracket Expressions.
+                                                              (line  44)
+* alpha character class:                 Character Classes and Bracket Expressions.
+                                                              (line  49)
+* b (branch) command:                    Programming Commands.
+                                                              (line  18)
+* blank character class:                 Character Classes and Bracket Expressions.
+                                                              (line  54)
+* c (change to text lines) command:      Other Commands.      (line 157)
+* cntrl character class:                 Character Classes and Bracket Expressions.
+                                                              (line  57)
+* D (delete first line) command:         Other Commands.      (line 255)
+* d (delete) command:                    Common Commands.     (line  44)
+* digit character class:                 Character Classes and Bracket Expressions.
+                                                              (line  62)
+* e (evaluate) command:                  Extended Commands.   (line  12)
+* F (File name) command:                 Extended Commands.   (line  30)
+* G (appending Get) command:             Other Commands.      (line 288)
+* g (get) command:                       Other Commands.      (line 284)
+* graph character class:                 Character Classes and Bracket Expressions.
+                                                              (line  65)
+* H (append Hold) command:               Other Commands.      (line 280)
+* h (hold) command:                      Other Commands.      (line 276)
+* i (insert text lines) command:         Other Commands.      (line 104)
+* l (list unambiguously) command:        Other Commands.      (line 207)
+* lower character class:                 Character Classes and Bracket Expressions.
+                                                              (line  68)
+* N (append Next line) command:          Other Commands.      (line 261)
+* n (next-line) command:                 Common Commands.     (line  61)
+* P (print first line) command:          Other Commands.      (line 273)
+* p (print) command:                     Common Commands.     (line  52)
+* print character class:                 Character Classes and Bracket Expressions.
+                                                              (line  72)
+* punct character class:                 Character Classes and Bracket Expressions.
+                                                              (line  75)
+* q (quit) command:                      Common Commands.     (line  28)
+* Q (silent Quit) command:               Extended Commands.   (line  36)
+* r (read file) command:                 Other Commands.      (line 219)
+* R (read line) command:                 Extended Commands.   (line  53)
+* s command, option flags:               The "s" Command.     (line  70)
+* space character class:                 Character Classes and Bracket Expressions.
+                                                              (line  80)
+* T (test and branch if failed) command: Extended Commands.   (line  63)
+* t (test and branch if successful) command: Programming Commands.
+                                                              (line  22)
+* upper character class:                 Character Classes and Bracket Expressions.
+                                                              (line  84)
+* v (version) command:                   Extended Commands.   (line  69)
+* w (write file) command:                Other Commands.      (line 244)
+* W (write first line) command:          Extended Commands.   (line  80)
+* x (eXchange) command:                  Other Commands.      (line 292)
+* xdigit character class:                Character Classes and Bracket Expressions.
+                                                              (line  88)
+* y (transliterate) command:             Other Commands.      (line  11)
+* z (Zap) command:                       Extended Commands.   (line  85)
 Tag Table:
+(Indirect)
+Node: Top935
+Node: Introduction3816
+Node: Invoking sed4370
+Ref: Invoking sed-Footnote-19396
+Ref: Invoking sed-Footnote-29588
+Node: sed Programs9691
+Node: Execution Cycle10838
+Ref: Execution Cycle-Footnote-112011
+Node: Addresses12322
+Node: Regular Expressions17061
+Node: Common Commands24610
+Node: The "s" Command26608
+Ref: The "s" Command-Footnote-130940
+Node: Other Commands31012
+Ref: Other Commands-Footnote-136149
+Node: Programming Commands36221
+Node: Extended Commands37130
+Node: Escapes40705
+Ref: Escapes-Footnote-143711
+Node: Examples43902
+Node: Centering lines44997
+Node: Increment a number45909
+Ref: Increment a number-Footnote-147489
+Node: Rename files to lower case47609
+Node: Print bash environment50405
+Node: Reverse chars of lines51185
+Ref: Reverse chars of lines-Footnote-152202
+Node: tac52424
+Node: cat -n53206
+Node: cat -b55063
+Node: wc -c55815
+Ref: wc -c-Footnote-157748
+Node: wc -w57817
+Node: wc -l59289
+Node: head59526
+Node: tail59850
+Node: uniq61534
+Node: uniq -d62330
+Node: uniq -u63054
+Node: cat -s63778
+Node: Limitations65667
+Node: Other Resources66507
+Node: Reporting Bugs67433
+Ref: Reporting Bugs-Footnote-173962
+Node: Extended regexps74033
+Node: Concept Index75200
+Node: Command and Option Index85215
+Node: Top738
+Node: Introduction2217
+Node: Invoking sed2789
+Node: Overview3114
+Node: Command-Line Options5561
+Ref: Command-Line Options-Footnote-113530
+Ref: Command-Line Options-Footnote-213758
+Node: Exit status13861
+Node: sed scripts14795
+Node: sed script overview15394
+Node: sed commands list18057
+Node: The "s" Command23070
+Ref: The "s" Command-Footnote-128889
+Node: Common Commands28969
+Node: Other Commands32106
+Ref: insert command35324
+Ref: Other Commands-Footnote-141629
+Node: Programming Commands41709
+Node: Extended Commands42649
+Node: Multiple commands syntax46675
+Node: sed addresses51217
+Node: Addresses overview51706
+Node: Numeric Addresses53705
+Node: Regexp Addresses55116
+Ref: Regexp Addresses-Footnote-159252
+Node: Range Addresses59392
+Ref: Zero Address Regex Range60294
+Node: Zero Address61753
+Node: sed regular expressions62318
+Node: Regular Expressions Overview63172
+Node: BRE vs ERE64733
+Node: BRE syntax66484
+Node: ERE syntax73304
+Node: Character Classes and Bracket Expressions74878
+Node: regexp extensions80030
+Node: Back-references and Subexpressions82506
+Node: Escapes84958
+Ref: Escapes-Footnote-188105
+Node: Locale Considerations88304
+Ref: Locale Considerations-Footnote-193067
+Node: advanced sed93239
+Node: Execution Cycle93606
+Ref: Execution Cycle-Footnote-194845
+Node: Hold and Pattern Buffers95162
+Node: Multiline techniques95350
+Node: Branching and flow control98704
+Node: Examples107029
+Node: Joining lines108275
+Node: Centering lines110082
+Node: Increment a number111006
+Ref: Increment a number-Footnote-1112495
+Node: Rename files to lower case112623
+Node: Print bash environment115418
+Node: Reverse chars of lines116181
+Ref: Reverse chars of lines-Footnote-1117224
+Node: Text search across multiple lines117441
+Node: Line length adjustment120786
+Node: Adding a header to multiple files122533
+Node: tac125986
+Node: cat -n126774
+Node: cat -b128616
+Node: wc -c129378
+Ref: wc -c-Footnote-1131316
+Node: wc -w131385
+Node: wc -l132875
+Node: head133128
+Node: tail133467
+Node: uniq135196
+Node: uniq -d136007
+Node: uniq -u136722
+Node: cat -s137435
+Node: Limitations139298
+Node: Other Resources140161
+Node: Reporting Bugs141104
+Ref: N_command_last_line142294
+Ref: Reporting Bugs-Footnote-1148805
+Node: GNU Free Documentation License148880
+Node: Concept Index174239
+Node: Command and Option Index201600
 End Tag Table
+Local Variables:
+coding: utf-8
+End:

trunk/src/sed/doc/sed.texi

-              r599
+              r3613
 \input texinfo  @c -*-texinfo-*-
-@c Do not edit this file!! It is automatically generated from sed-in.texi.
 @c
 @c -- Stuff that needs adding: ----------------------------------------------
 @c (document the `;' command-separator)
+@c (nothing!)
 @c --------------------------------------------------------------------------
 @c Check for consistency: regexps in @code, text that they match in @samp.
 @c
+@c
 @c Tips:
 @c    @command for command
 …
 @value{SSED}, a stream editor.
+Copyright @copyright{} 1998, 1999, 2001, 2002, 2003, 2004 Free
+Software Foundation, Inc.
+This document is released under the terms of the @acronym{GNU} Free
+Documentation License as published by the Free Software Foundation;
+either version 1.1, or (at your option) any later version.
+You should have received a copy of the @acronym{GNU} Free Documentation
+License along with @value{SSED}; see the file @file{COPYING.DOC}.
+If not, write to the Free Software Foundation, 59 Temple Place - Suite
+, Boston, MA 02110-1301, USA.
+There are no Cover Texts and no Invariant Sections; this text, along
+with its equivalent in the printed manual, constitutes the Title Page.
+Copyright @copyright{} 1998--2022 Free Software Foundation, Inc.
+@quotation
+Permission is granted to copy, distribute and/or modify this document
+under the terms of the GNU Free Documentation License, Version 1.3
+or any later version published by the Free Software Foundation;
+with no Invariant Sections, no Front-Cover Texts, and no
+Back-Cover Texts.  A copy of the license is included in the
+section entitled ``GNU Free Documentation License''.
+@end quotation
 @end copying
 …
 @titlepage
 @title @command{sed}, a stream editor
+@title @value{SSED}, a stream editor
 @subtitle version @value{VERSION}, @value{UPDATED}
 @author by Ken Pizzini, Paolo Bonzini
+@author by Ken Pizzini, Paolo Bonzini, Jim Meyering, Assaf Gordon
 @page
 @vskip 0pt plus 1filll
-Copyright @copyright{} 1998, 1999 Free Software Foundation, Inc.
 @insertcopying
-Published by the Free Software Foundation, @*
-Franklin Street, Fifth Floor @*
-Boston, MA 02110-1301, USA
 @end titlepage
+@contents
+@ifnottex
 @node Top
+@top
+@ifnottex
+@top @value{SSED}
 @insertcopying
 @end ifnottex
 …
 * Introduction::               Introduction
 * Invoking sed::               Invocation
+* sed Programs::               @command{sed} programs
+* sed scripts::                @command{sed} scripts
+* sed addresses::              Addresses: selecting lines
+* sed regular expressions::    Regular expressions: selecting text
+* advanced sed::               Advanced @command{sed}: cycles and buffers
 * Examples::                   Some sample scripts
 * Limitations::                Limitations and (non-)limitations of @value{SSED}
 * Other Resources::            Other resources for learning about @command{sed}
 * Reporting Bugs::             Reporting bugs
+* Extended regexps::           @command{egrep}-style regular expressions
+@ifset PERL
+* Perl regexps::               Perl-style regular expressions
+@end ifset
+* GNU Free Documentation License:: Copying and sharing this manual
 * Concept Index::              A menu with all the topics in this manual.
 * Command and Option Index::   A menu with all @command{sed} commands and
                                command-line options.
-@detailmenu
---- The detailed node listing ---
-sed Programs:
-* Execution Cycle::                 How @command{sed} works
-* Addresses::                       Selecting lines with @command{sed}
-* Regular Expressions::             Overview of regular expression syntax
-* Common Commands::                 Often used commands
-* The "s" Command::                 @command{sed}'s Swiss Army Knife
-* Other Commands::                  Less frequently used commands
-* Programming Commands::            Commands for @command{sed} gurus
-* Extended Commands::               Commands specific of @value{SSED}
-* Escapes::                         Specifying special characters
-Examples:
-* Centering lines::
-* Increment a number::
-* Rename files to lower case::
-* Print bash environment::
-* Reverse chars of lines::
-* tac::                             Reverse lines of files
-* cat -n::                          Numbering lines
-* cat -b::                          Numbering non-blank lines
-* wc -c::                           Counting chars
-* wc -w::                           Counting words
-* wc -l::                           Counting lines
-* head::                            Printing the first lines
-* tail::                            Printing the last lines
-* uniq::                            Make duplicate lines unique
-* uniq -d::                         Print duplicated lines of input
-* uniq -u::                         Remove all duplicated lines
-* cat -s::                          Squeezing blank lines
-@ifset PERL
-Perl regexps::                      Perl-style regular expressions
-* Backslash::                       Introduces special sequences
-* Circumflex/dollar sign/period::   Behave specially with regard to new lines
-* Square brackets::                 Are a bit different in strange cases
-* Options setting::                 Toggle modifiers in the middle of a regexp
-* Non-capturing subpatterns::       Are not counted when backreferencing
-* Repetition::                      Allows for non-greedy matching
-* Backreferences::                  Allows for more than 10 back references
-* Assertions::                      Allows for complex look ahead matches
-* Non-backtracking subpatterns::    Often gives more performance
-* Conditional subpatterns::         Allows if/then/else branches
-* Recursive patterns::              For example to match parentheses
-* Comments::                        Because things can get complex...
-@end ifset
-@end detailmenu
 @end menu
 …
 @node Invoking sed
+@chapter Invocation
+@chapter Running sed
+This chapter covers how to run @command{sed}. Details of @command{sed}
+scripts and individual @command{sed} commands are discussed in the
+next chapter.
+@menu
+* Overview::
+* Command-Line Options::
+* Exit status::
+@end menu
+@node Overview
+@section Overview
 Normally @command{sed} is invoked like this:
 …
 @end example
+For example, to change every @samp{hello} to @samp{world}
+in the file @file{input.txt}:
+@example
+sed 's/hello/world/g' input.txt > output.txt
+@end example
+Without the @samp{g} (global) modifier, @command{sed} affects
+only the first instance per line.
+@cindex stdin
+@cindex standard input
+If you do not specify @var{INPUTFILE}, or if @var{INPUTFILE} is @file{-},
+@command{sed} filters the contents of the standard input. The following
+commands are equivalent:
+@example
+sed 's/hello/world/g' input.txt > output.txt
+sed 's/hello/world/g' < input.txt > output.txt
+cat input.txt | sed 's/hello/world/g' - > output.txt
+@end example
+@cindex stdout
+@cindex output
+@cindex standard output
+@cindex -i, example
+@command{sed} writes output to standard output. Use @option{-i} to edit
+files in-place instead of printing to standard output.
+See also the @code{W} and @code{s///w} commands for writing output to
+other files. The following command modifies @file{file.txt} and
+does not produce any output:
+@example
+sed -i 's/hello/world/' file.txt
+@end example
+@cindex -n, example
+@cindex p, example
+@cindex suppressing output
+@cindex output, suppressing
+By default @command{sed} prints all processed input (except input
+that has been modified/deleted by commands such as @command{d}).
+Use @option{-n} to suppress output, and the @code{p} command
+to print specific lines. The following command prints only line 45
+of the input file:
+@example
+sed -n '45p' file.txt
+@end example
+@cindex multiple files
+@cindex -s, example
+@command{sed} treats multiple input files as one long stream.
+The following example prints the first line of the first file
+(@file{one.txt}) and the last line of the last file (@file{three.txt}).
+Use @option{-s} to reverse this behavior.
+@example
+sed -n  '1p ; $p' one.txt two.txt three.txt
+@end example
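The effect of @option{-s} on the example above can be checked with a small sketch. It assumes GNU @command{sed}; the two input files and their contents are made up for illustration.

```shell
# Hypothetical input files, created in a scratch directory.
dir=$(mktemp -d)
printf 'a1\na2\n' > "$dir/one.txt"
printf 'b1\nb2\n' > "$dir/two.txt"

# Default: the files form one long stream, so '1' is the first line of
# one.txt and '$' is the last line of two.txt.
joined=$(sed -n '1p ; $p' "$dir/one.txt" "$dir/two.txt")

# With -s: line numbers and '$' are relative to each file.
separate=$(sed -s -n '1p ; $p' "$dir/one.txt" "$dir/two.txt")

printf 'joined:\n%s\nseparate:\n%s\n' "$joined" "$separate"
rm -r "$dir"
```

With @option{-s}, each file contributes its own first and last line instead of only the ends of the combined stream.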
+@cindex -e, example
+@cindex --expression, example
+@cindex -f, example
+@cindex --file, example
+@cindex script parameter
+@cindex parameters, script
+Without @option{-e} or @option{-f} options, @command{sed} uses
+the first non-option parameter as the @var{script}, and the following
+non-option parameters as input files.
+If @option{-e} or @option{-f} options are used to specify a @var{script},
+all non-option parameters are taken as input files.
+Options @option{-e} and @option{-f} can be combined, and can appear
+multiple times (in which case the final effective @var{script} will be
+the concatenation of all the individual @var{script}s).
+The following examples are equivalent:
+@example
+sed 's/hello/world/' input.txt > output.txt
+sed -e 's/hello/world/' input.txt > output.txt
+sed --expression='s/hello/world/' input.txt > output.txt
+echo 's/hello/world/' > myscript.sed
+sed -f myscript.sed input.txt > output.txt
+sed --file=myscript.sed input.txt > output.txt
+@end example
+@node Command-Line Options
+@section Command-Line Options
 The full format for invoking @command{sed} is:
 …
 sed OPTIONS... [SCRIPT] [INPUTFILE...]
 @end example
-If you do not specify @var{INPUTFILE}, or if @var{INPUTFILE} is @file{-},
-@command{sed} filters the contents of the standard input.  The @var{script}
-is actually the first non-option parameter, which @command{sed} specially
-considers a script and not an input file if (and only if) none of the
-other @var{options} specifies a script to be executed, that is if neither
-of the @option{-e} and @option{-f} options is specified.
 @command{sed} may be invoked with the following command-line options:
 …
 @cindex Disabling autoprint, from command line
 By default, @command{sed} prints out the pattern space
+at the end of each cycle through the script.
+at the end of each cycle through the script (@pxref{Execution Cycle, ,
+How @code{sed} works}).
 These options disable this automatic printing,
 and @command{sed} only produces output when explicitly told to
 via the @code{p} command.
+@item --debug
+@opindex --debug
+@cindex @value{SSEDEXT}, debug
+Print the input sed program in canonical form,
+and annotate program execution.
+@codequotebacktick on
+@codequoteundirected on
+@example
+$ echo 1 | sed '\%1%s21232'
+$ echo 1 | sed --debug '\%1%s21232'
+SED PROGRAM:
+  /1/ s/1/3/
+INPUT:   'STDIN' line 1
+PATTERN: 1
+COMMAND: /1/ s/1/3/
+PATTERN: 3
+END-OF-CYCLE:
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@item -e @var{script}
+@itemx --expression=@var{script}
+@opindex -e
+@opindex --expression
+@cindex Script, from command line
+Add the commands in @var{script} to the set of commands to be
+run while processing the input.
+@item -f @var{script-file}
+@itemx --file=@var{script-file}
+@opindex -f
+@opindex --file
+@cindex Script, from a file
+Add the commands contained in the file @var{script-file}
+to the set of commands to be run while processing the input.
 @item -i[@var{SUFFIX}]
 …
 before renaming the temporary file, thereby making a backup
 copy@footnote{Note that @value{SSED} creates the backup
     file whether or not any output is actually changed.}).
+file whether or not any output is actually changed.}).
 @cindex In-place editing, Perl-style backup file names
 …
 overwritten without making a backup.
+Because @option{-i} takes an optional argument, it should
+not be followed by other short options:
+@table @code
+@item sed -Ei '...' FILE
+Same as @option{-E -i} with no backup suffix - @file{FILE} will be
+edited in-place without creating a backup.
+@item sed -iE '...' FILE
+This is equivalent to @option{--in-place=E}, creating @file{FILEE} as backup
+of @file{FILE}.
+@end table
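As a minimal sketch of the optional suffix, assuming GNU @command{sed}: the file name @file{FILE} and the suffix @file{.bak} below are illustrative choices, not anything the manual prescribes.

```shell
# Scratch file with one line of text.
dir=$(mktemp -d)
printf 'foo\n' > "$dir/FILE"

# The suffix is attached directly to -i; a space would make '.bak'
# the script instead of the suffix.
sed -i.bak 's/foo/bar/' "$dir/FILE"

edited=$(cat "$dir/FILE")       # contents after in-place editing
backup=$(cat "$dir/FILE.bak")   # untouched copy of the original

printf 'FILE: %s\nFILE.bak: %s\n' "$edited" "$backup"
rm -r "$dir"
```

Keeping @option{-i} as the last short option in a cluster, or giving it separately, avoids the @option{-iE} surprise described above.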
+Be cautious of using @option{-n} with @option{-i}: the former disables
+automatic printing of lines and the latter changes the file in-place
+without a backup. Used carelessly (and without an explicit @code{p} command),
+the output file will be empty:
+@codequotebacktick on
+@codequoteundirected on
+@example
+# WRONG USAGE: 'FILE' will be truncated.
+sed -ni 's/foo/bar/' FILE
+@end example
+@codequotebacktick off
+@codequoteundirected off
 @item -l @var{N}
 @itemx --line-length=@var{N}
 …
 @item --posix
+@opindex --posix
 @cindex @value{SSEDEXT}, disabling
 @value{SSED} includes several extensions to @acronym{POSIX}
+@value{SSED} includes several extensions to POSIX
 sed.  In order to simplify writing portable scripts, this
 option disables all the extensions that this manual documents,
 …
 @cindex @code{POSIXLY_CORRECT} behavior, enabling
 Most of the extensions accept @command{sed} programs that
 are outside the syntax mandated by @acronym{POSIX}, but some
+are outside the syntax mandated by POSIX, but some
 of them (such as the behavior of the @command{N} command
 described in @pxref{Reporting Bugs}) actually violate the
+described in @ref{Reporting Bugs}) actually violate the
 standard.  If you want to disable only the latter kind of
 extension, you can set the @code{POSIXLY_CORRECT} variable
 to a non-empty value.
+@item -r
+@item -b
+@itemx --binary
+@opindex -b
+@opindex --binary
+This option is available on every platform, but is only effective where the
+operating system makes a distinction between text files and binary files.
+When such a distinction is made---as is the case for MS-DOS, Windows,
+Cygwin---text files are composed of lines separated by a carriage return
+@emph{and} a line feed character, and @command{sed} does not see the
+ending CR.  When this option is specified, @command{sed} will open
+input files in binary mode, thus not requesting this special processing
+and considering lines to end at a line feed.
+@item --follow-symlinks
+@opindex --follow-symlinks
+This option is available only on platforms that support
+symbolic links and has an effect only if option @option{-i}
+is specified.  In this case, if the file that is specified
+on the command line is a symbolic link, @command{sed} will
+follow the link and edit the ultimate destination of the
+link.  The default behavior is to break the symbolic link,
+so that the link destination will not be modified.
+@item -E
+@itemx -r
 @itemx --regexp-extended
+@opindex -E
 @opindex -r
 @opindex --regexp-extended
 @cindex Extended regular expressions, choosing
 @cindex @acronym{GNU} extensions, extended regular expressions
+@cindex GNU extensions, extended regular expressions
 Use extended regular expressions rather than basic
 regular expressions.  Extended regexps are those that
 @command{egrep} accepts; they can be clearer because they
+usually have less backslashes, but are a @acronym{GNU} extension
+and hence scripts that use them are not portable.
+@xref{Extended regexps, , Extended regular expressions}.
+@ifset PERL
+@item -R
+@itemx --regexp-perl
+@opindex -R
+@opindex --regexp-perl
+@cindex Perl-style regular expressions, choosing
+@cindex @value{SSEDEXT}, Perl-style regular expressions
+Use Perl-style regular expressions rather than basic
+regular expressions.  Perl-style regexps are extremely
+powerful but are a @value{SSED} extension and hence scripts that
+use it are not portable.  @xref{Perl regexps, ,
+Perl-style regular expressions}.
+@end ifset
+usually have fewer backslashes.
+Historically this was a GNU extension,
+but the @option{-E}
+extension has since been added to the POSIX standard
+(http://austingroupbugs.net/view.php?id=528),
+so use @option{-E} for portability.
+GNU sed has accepted @option{-E} as an undocumented option for years,
+and *BSD seds have accepted @option{-E} for years as well,
+but scripts that use @option{-E} might not port to other older systems.
+@xref{ERE syntax, , Extended regular expressions}.
 @item -s
 @itemx --separate
+@opindex -s
+@opindex --separate
 @cindex Working on separate files
 By default, @command{sed} will consider the files specified on the
 …
 start of each file.
+@item --sandbox
+@opindex --sandbox
+@cindex Sandbox mode
+In sandbox mode, @code{e/w/r} commands are rejected - programs containing
+them will be aborted without being run. Sandbox mode ensures @command{sed}
+operates only on the input files designated on the command line, and
+cannot run external programs.
 @item -u
 @itemx --unbuffered
 …
 output as soon as possible.)
+@item -e @var{script}
+@itemx --expression=@var{script}
+@opindex -e
+@opindex --expression
+@cindex Script, from command line
+Add the commands in @var{script} to the set of commands to be
+run while processing the input.
+@item -f @var{script-file}
+@itemx --file=@var{script-file}
+@opindex -f
+@opindex --file
+@cindex Script, from a file
+Add the commands contained in the file @var{script-file}
+to the set of commands to be run while processing the input.
+@item -z
+@itemx --null-data
+@itemx --zero-terminated
+@opindex -z
+@opindex --null-data
+@opindex --zero-terminated
+Treat the input as a set of lines, each terminated by a zero byte
+(the ASCII @samp{NUL} character) instead of a newline.  This option can
+be used with commands like @samp{sort -z} and @samp{find -print0}
+to process arbitrary file names.
 @end table
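The @option{-z} option from the table above can be illustrated with a short sketch. It assumes GNU @command{sed}; the sample records are invented, and @command{tr} is used only to make the zero bytes visible.

```shell
# Two NUL-terminated records; the first one contains an embedded newline.
# With -z the newline is ordinary data, so '^' matches once per record.
out=$(printf 'one\ntwo\0three\0' | sed -z 's/^/> /' | tr '\0' '|')

printf '%s\n' "$out"
```

Only the start of each NUL-terminated record receives the prefix; the embedded newline does not start a new record.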
 …
 The standard input will be processed if no file names are specified.
+@node sed Programs
+@chapter @command{sed} Programs
+@cindex @command{sed} program structure
+@node Exit status
+@section Exit status
+@cindex exit status
+An exit status of zero indicates success, and a nonzero value
+indicates failure. @value{SSED} returns the following exit status
+error values:
+@table @asis
+@item 0
+Successful completion.
+@item 1
+Invalid command, invalid syntax, invalid regular expression or a
+@value{SSED} extension command used with @option{--posix}.
+@item 2
+One or more of the input files specified on the command line could not be
+opened (e.g. if a file is not found, or read permission is denied).
+Processing continued with other files.
+@item 4
+An I/O error or a serious processing error occurred at run time;
+@value{SSED} aborted immediately.
+@end table
+@cindex Q, example
+@cindex exit status, example
+Additionally, the commands @code{q} and @code{Q} can be used to terminate
+@command{sed} with a custom exit code value (this is a @value{SSED} extension):
+@example
+$ echo | sed 'Q42' ; echo $?
+@end example
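The exit codes listed above can be observed from the shell. This is a sketch that assumes a reasonably recent GNU @command{sed}; the path @file{/nonexistent/hypothetical-file} is a made-up name that is expected not to exist, and @samp{k} is used only because it is not a @command{sed} command.

```shell
# Custom exit code via the GNU 'Q' command.
custom=0
echo | sed 'Q42' || custom=$?

# An input file that cannot be opened.
missing=0
sed -n 'p' /nonexistent/hypothetical-file 2>/dev/null || missing=$?

# An invalid command in the script.
invalid=0
echo | sed 'k' 2>/dev/null || invalid=$?

echo "custom=$custom missing=$missing invalid=$invalid"
```

Capturing @code{$?} immediately after each call matters, because any later command overwrites it.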
+@node sed scripts
+@chapter @command{sed} scripts
+@menu
+* sed script overview::      @command{sed} script overview
+* sed commands list::        @command{sed} commands summary
+* The "s" Command::          @command{sed}'s Swiss Army Knife
+* Common Commands::          Often used commands
+* Other Commands::           Less frequently used commands
+* Programming Commands::     Commands for @command{sed} gurus
+* Extended Commands::        Commands specific to @value{SSED}
+* Multiple commands syntax:: Extension for easier scripting
+@end menu
+@node sed script overview
+@section @command{sed} script overview
+@cindex @command{sed} script structure
 @cindex Script structure
 A @command{sed} program consists of one or more @command{sed} commands,
 passed in by one or more of the
 …
 options are used.
 This document will refer to ``the'' @command{sed} script;
 this is understood to mean the in-order catenation
+this is understood to mean the in-order concatenation
 of all of the @var{script}s and @var{script-file}s passed in.
+Each @code{sed} command consists of an optional address or
+address range, followed by a one-character command name
+and any additional command-specific code.
+@menu
+* Execution Cycle::          How @command{sed} works
+* Addresses::                Selecting lines with @command{sed}
+* Regular Expressions::      Overview of regular expression syntax
+* Common Commands::          Often used commands
+* The "s" Command::          @command{sed}'s Swiss Army Knife
+* Other Commands::           Less frequently used commands
+* Programming Commands::     Commands for @command{sed} gurus
+* Extended Commands::        Commands specific to @value{SSED}
+* Escapes::                  Specifying special characters
+@end menu
+@node Execution Cycle
+@section How @command{sed} Works
+@cindex Buffer spaces, pattern and hold
+@cindex Spaces, pattern and hold
+@cindex Pattern space, definition
+@cindex Hold space, definition
+@command{sed} maintains two data buffers: the active @emph{pattern} space,
+and the auxiliary @emph{hold} space. Both are initially empty.
+@command{sed} operates by performing the following cycle on each
+line of input: first, @command{sed} reads one line from the input
+stream, removes any trailing newline, and places it in the pattern space.
+Then commands are executed; each command can have an address associated
+with it: addresses are a kind of condition code, and a command is only
+executed if the condition is verified before the command is to be
+executed.
+When the end of the script is reached, unless the @option{-n} option
+is in use, the contents of pattern space are printed out to the output
+stream, adding back the trailing newline if it was removed.@footnote{Actually,
+  if @command{sed} prints a line without the terminating newline, it will
+  nevertheless print the missing newline as soon as more text is sent to
+  the same output stream, which gives the ``least expected surprise''
+  even though it does not make commands like @samp{sed -n p} exactly
+  identical to @command{cat}.} Then the next cycle starts for the next
+input line.
+Unless special commands (like @samp{D}) are used, the pattern space is
+deleted between two cycles. The hold space, on the other hand, keeps
+its data between cycles (see commands @samp{h}, @samp{H}, @samp{x},
+@samp{g}, @samp{G} to move data between both buffers).
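A small sketch shows the hold space surviving across cycles: it accumulates the lines seen so far in reverse order and prints them once, on the last line. The three-line input is invented for illustration.

```shell
# 1!G  on every line but the first, append the hold space to the pattern space
# h    copy the pattern space (the lines so far, newest first) to the hold space
# $p   on the last line, print the accumulated pattern space
reversed=$(printf 'a\nb\nc\n' | sed -n '1!G;h;$p')

printf '%s\n' "$reversed"
```

The pattern space is refilled on every cycle, so the reversal only works because @samp{h} saves the running result in the hold space.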
+@node Addresses
+@section Selecting lines with @command{sed}
+@cindex Addresses, in @command{sed} scripts
+@cindex Line selection
+@cindex Selecting lines to process
+Addresses in a @command{sed} script can be in any of the following forms:
+@xref{Overview}.
+@cindex @command{sed} commands syntax
+@cindex syntax, @command{sed} commands
+@cindex addresses, syntax
+@cindex syntax, addresses
+@command{sed} commands follow this syntax:
+@example
+[addr]@var{X}[options]
+@end example
+@var{X} is a single-letter @command{sed} command.
+@c TODO: add @pxref{commands} when there is a command-list section.
+@code{[addr]} is an optional line address. If @code{[addr]} is specified,
+the command @var{X} will be executed only on the matched lines.
+@code{[addr]} can be a single line number, a regular expression,
+or a range of lines (@pxref{sed addresses}).
+Additional @code{[options]} are used for some @command{sed} commands.
+@cindex @command{d}, example
+@cindex address range, example
+@cindex example, address range
+The following example deletes lines 30 to 35 in the input.
+@code{30,35} is an address range. @command{d} is the delete command:
+@example
+sed '30,35d' input.txt > output.txt
+@end example
+@cindex @command{q}, example
+@cindex regular expression, example
+@cindex example, regular expression
+The following example prints all input until a line
+starting with the string @samp{foo} is found. If such a line is found,
+@command{sed} will terminate with exit status 42.
+If no such line is found (and no other error occurs), @command{sed}
+will exit with status 0.
+@code{/^foo/} is a regular-expression address.
+@command{q} is the quit command. @code{42} is the command option.
+@example
+sed '/^foo/q42' input.txt > output.txt
+@end example
+@cindex multiple @command{sed} commands
+@cindex @command{sed} commands, multiple
+@cindex newline, command separator
+@cindex semicolons, command separator
+@cindex ;, command separator
+@cindex -e, example
+@cindex -f, example
+Commands within a @var{script} or @var{script-file} can be
+separated by semicolons (@code{;}) or newlines (ASCII 10).
+Multiple scripts can be specified with @option{-e} or @option{-f}
+options.
+The following examples are all equivalent. They perform two @command{sed}
+operations: deleting any lines matching the regular expression @code{/^foo/},
+and replacing all occurrences of the string @samp{hello} with @samp{world}:
+@example
+sed '/^foo/d ; s/hello/world/g' input.txt > output.txt
+sed -e '/^foo/d' -e 's/hello/world/g' input.txt > output.txt
+echo '/^foo/d' > script.sed
+echo 's/hello/world/g' >> script.sed
+sed -f script.sed input.txt > output.txt
+echo 's/hello/world/g' > script2.sed
+sed -e '/^foo/d' -f script2.sed input.txt > output.txt
+@end example
+@cindex @command{a}, and semicolons
+@cindex @command{c}, and semicolons
+@cindex @command{i}, and semicolons
+Commands @command{a}, @command{c}, @command{i}, due to their syntax,
+cannot be followed by semicolons working as command separators and
+thus should be terminated
+with newlines or be placed at the end of a @var{script} or @var{script-file}.
+Commands can also be preceded with optional non-significant
+whitespace characters.
+@xref{Multiple commands syntax}.
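One way to follow an @command{a} command with another command is to give each its own @option{-e}, as in this sketch. It assumes GNU @command{sed}, whose one-line @samp{a text} form is an extension; the input and the appended text are made up.

```shell
# The text of 'a' ends at the end of the first -e script,
# so the substitution in the second -e is parsed as a separate command.
out=$(printf '1\n2\n' | sed -e '1a hello' -e 's/2/two/')

printf '%s\n' "$out"
```

Writing both commands in one script separated by a semicolon would instead make the semicolon and everything after it part of the appended text.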
+@node sed commands list
+@section @command{sed} commands summary
+The following commands are supported in @value{SSED}.
+Some are standard POSIX commands, while others are @value{SSEDEXT}.
+Details and examples for each command are in the following sections.
+(Mnemonics) are shown in parentheses.
 @table @code
+@item @var{number}
+@cindex Address, numeric
+@cindex Line, selecting by number
+Specifying a line number will match only that line in the input.
+(Note that @command{sed} counts lines continuously across all input files
+unless @option{-i} or @option{-s} options are specified.)
+@item @var{first}~@var{step}
+@cindex @acronym{GNU} extensions, @samp{@var{n}~@var{m}} addresses
+This @acronym{GNU} extension matches every @var{step}th line
+starting with line @var{first}.
+In particular, lines will be selected when there exists
+a non-negative @var{n} such that the current line-number equals
+@var{first} + (@var{n} * @var{step}).
+Thus, to select the odd-numbered lines,
+one would use @code{1~2};
+to pick every third line starting with the second, @samp{2~3} would be used;
+to pick every fifth line starting with the tenth, use @samp{10~5};
+and @samp{50~0} is just an obscure way of saying @code{50}.
+@item $
+@cindex Address, last line
+@cindex Last line, selecting
+@cindex Line, selecting last
+This address matches the last line of the last file of input, or
+the last line of each file when the @option{-i} or @option{-s} options
+are specified.
+@item /@var{regexp}/
+@cindex Address, as a regular expression
+@cindex Line, selecting by regular expression match
+This will select any line which matches the regular expression @var{regexp}.
+If @var{regexp} itself includes any @code{/} characters,
+each must be escaped by a backslash (@code{\}).
+@cindex empty regular expression
+@cindex @value{SSEDEXT}, modifiers and the empty regular expression
+The empty regular expression @samp{//} repeats the last regular
+expression match (the same holds if the empty regular expression is
+passed to the @code{s} command).  Note that modifiers to regular expressions
+are evaluated when the regular expression is compiled, thus it is invalid to
+specify them together with the empty regular expression.
+@item \%@var{regexp}%
+(The @code{%} may be replaced by any other single character.)
+@cindex Slash character, in regular expressions
+This also matches the regular expression @var{regexp},
+but allows one to use a different delimiter than @code{/}.
+This is particularly useful if the @var{regexp} itself contains
+a lot of slashes, since it avoids the tedious escaping of every @code{/}.
+If @var{regexp} itself includes any delimiter characters,
+each must be escaped by a backslash (@code{\}).
+@item /@var{regexp}/I
+@itemx \%@var{regexp}%I
+@cindex @acronym{GNU} extensions, @code{I} modifier
+@ifset PERL
+@cindex Perl-style regular expressions, case-insensitive
+@end ifset
+The @code{I} modifier to regular-expression matching is a @acronym{GNU}
+extension which causes the @var{regexp} to be matched in
+a case-insensitive manner.
+@item /@var{regexp}/M
+@itemx \%@var{regexp}%M
+@ifset PERL
+@cindex @value{SSEDEXT}, @code{M} modifier
+@end ifset
+@cindex Perl-style regular expressions, multiline
+The @code{M} modifier to regular-expression matching is a @value{SSED}
+extension which causes @code{^} and @code{$} to match respectively
+(in addition to the normal behavior) the empty string after a newline,
+and the empty string before a newline.  There are special character
+sequences
+@ifset PERL
+(@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
+in basic or extended regular expression modes)
+@end ifset
+@ifclear PERL
+(@code{\`} and @code{\'})
+@end ifclear
+which always match the beginning or the end of the buffer.
+@code{M} stands for @cite{multi-line}.
+@ifset PERL
+@item /@var{regexp}/S
+@itemx \%@var{regexp}%S
+@cindex @value{SSEDEXT}, @code{S} modifier
+@cindex Perl-style regular expressions, single line
+The @code{S} modifier to regular-expression matching is only valid
+in Perl mode and specifies that the dot character (@code{.}) will
+match the newline character too.  @code{S} stands for @cite{single-line}.
+@end ifset
+@ifset PERL
+@item /@var{regexp}/X
+@itemx \%@var{regexp}%X
+@cindex @value{SSEDEXT}, @code{X} modifier
+@cindex Perl-style regular expressions, extended
+The @code{X} modifier to regular-expression matching is also
+valid in Perl mode only.  If it is used, whitespace in the
+pattern (other than in a character class) and
+characters between a @kbd{#} outside a character class and the
+next newline character are ignored. An escaping backslash
+can be used to include a whitespace or @kbd{#} character as part
+of the pattern.
+@end ifset
+@end table
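The address forms in the table above can be tried on small inputs. This sketch assumes GNU @command{sed}, since @samp{@var{first}~@var{step}} and the @code{I} modifier are extensions; the helper function @code{nums} and all sample data are hypothetical.

```shell
# Helper that prints the numbers 1 to 10, one per line.
nums() { printf '%s\n' 1 2 3 4 5 6 7 8 9 10; }

odd=$(nums | sed -n '1~2p')           # every 2nd line starting at line 1
every_third=$(nums | sed -n '2~3p')   # every 3rd line starting at line 2
last=$(nums | sed -n '$p')            # last line of input

# Case-insensitive regexp address.
nocase=$(printf 'Foo\nbar\nFOO\n' | sed -n '/foo/Ip')

# Custom delimiter: no need to escape the slashes in the regexp.
usr=$(printf '/usr/bin\n/etc\n' | sed -n '\%^/usr/%p')

printf '%s\n' "$odd" "$every_third" "$last" "$nocase" "$usr"
```

Each address is paired with @option{-n} and @code{p}, so only the selected lines are printed.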
+If no addresses are given, then all lines are matched;
+if one address is given, then only lines matching that
+address are matched.
+@cindex Range of lines
+@cindex Several lines, selecting
+An address range can be specified by specifying two addresses
+separated by a comma (@code{,}).  An address range matches lines
+starting from where the first address matches, and continues
+until the second address matches (inclusively).
+If the second address is a @var{regexp}, then checking for the
+ending match will start with the line @emph{following} the
+line which matched the first address: a range will always
+span at least two lines (except of course if the input stream
+ends).
+If the second address is a @var{number} less than (or equal to)
+the line matching the first address, then only the one line is
+matched.
+@cindex Special addressing forms
+@cindex Range with start address of zero
+@cindex Zero, as range start address
+@cindex @var{addr1},+N
+@cindex @var{addr1},~N
+@cindex @acronym{GNU} extensions, special two-address forms
+@cindex @acronym{GNU} extensions, @code{0} address
+@cindex @acronym{GNU} extensions, 0,@var{addr2} addressing
+@cindex @acronym{GNU} extensions, @var{addr1},+@var{N} addressing
+@cindex @acronym{GNU} extensions, @var{addr1},~@var{N} addressing
+@value{SSED} also supports some special two-address forms; all these
+are @acronym{GNU} extensions:
+@table @code
+@item 0,/@var{regexp}/
+A line number of @code{0} can be used in an address specification like
+@code{0,/@var{regexp}/} so that @command{sed} will try to match
+@var{regexp} in the first input line too.  In other words,
+@code{0,/@var{regexp}/} is similar to @code{1,/@var{regexp}/},
+except that if @var{addr2} matches the very first line of input the
+@code{0,/@var{regexp}/} form will consider it to end the range, whereas
+the @code{1,/@var{regexp}/} form will match the beginning of its range and
+hence make the range span up to the @emph{second} occurrence of the
+regular expression.
+Note that this is the only place where the @code{0} address makes
+sense; there is no 0-th line and commands which are given the @code{0}
+address in any other way will give an error.
+@item @var{addr1},+@var{N}
+Matches @var{addr1} and the @var{N} lines following @var{addr1}.
+@item @var{addr1},~@var{N}
+Matches @var{addr1} and the lines following @var{addr1}
+until the next line whose input line number is a multiple of @var{N}.
+@end table
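The special two-address forms can be compared side by side. This is a sketch assuming GNU @command{sed}, since all three forms are extensions; the inputs are invented so that the regexp matches the very first line.

```shell
# 'foo' appears on lines 1 and 3.
zero=$(printf 'foo\na\nfoo\nb\n' | sed -n '0,/foo/p')   # range ends on line 1
one=$(printf 'foo\na\nfoo\nb\n' | sed -n '1,/foo/p')    # range ends on line 3

# Line 2 plus the two lines that follow it.
plus=$(printf '%s\n' 1 2 3 4 5 6 | sed -n '2,+2p')

# Line 5 up to the next line whose number is a multiple of 4.
mult=$(printf '%s\n' 1 2 3 4 5 6 7 8 9 | sed -n '5,~4p')

printf '%s\n' "$zero" "--" "$one" "--" "$plus" "--" "$mult"
```

The difference between the first two results shows why @code{0,/@var{regexp}/} exists: only that form can end the range on the first input line.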
+@cindex Excluding lines
+@cindex Selecting non-matching lines
+Appending the @code{!} character to the end of an address
+specification negates the sense of the match.
+That is, if the @code{!} character follows an address range,
+then only lines which do @emph{not} match the address range
+will be selected.
+This also works for singleton addresses,
+and, perhaps perversely, for the null address.
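Negation can be sketched with two short commands; the numbered input is made up for illustration.

```shell
# Delete every line that is NOT in the range 2,4.
kept=$(printf '%s\n' 1 2 3 4 5 | sed '2,4!d')

# Print every line except the last one.
all_but_last=$(printf '%s\n' 1 2 3 4 5 | sed -n '$!p')

printf '%s\n' "$kept" "--" "$all_but_last"
```

In a shell that performs history expansion on @samp{!}, single quotes around the script keep the character literal.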
+@node Regular Expressions
+@section Overview of Regular Expression Syntax
+To know how to use @command{sed}, people should understand regular
+expressions (@dfn{regexp} for short).  A regular expression
+is a pattern that is matched against a
+subject string from left to right.  Most characters are
+@dfn{ordinary}: they stand for
+themselves in a pattern, and match the corresponding characters
+in the subject.  As a trivial example, the pattern
+@example
+     The quick brown fox
+@end example
+@noindent
+matches a portion of a subject string that is identical to
+itself.  The power of regular expressions comes from the
+ability to include alternatives and repetitions in the pattern.
+These are encoded in the pattern by the use of @dfn{special characters},
+which do not stand for themselves but instead
+are interpreted in some special way.  Here is a brief description
+of regular expression syntax as used in @command{sed}.
+@table @code
+@item @var{char}
+A single ordinary character matches itself.
+@item *
+@cindex @acronym{GNU} extensions, to basic regular expressions
+Matches a sequence of zero or more instances of matches for the
+preceding regular expression, which must be an ordinary character, a
+special character preceded by @code{\}, a @code{.}, a grouped regexp
+(see below), or a bracket expression.  As a @acronym{GNU} extension, a
+postfixed regular expression can also be followed by @code{*}; for
+example, @code{a**} is equivalent to @code{a*}.  @acronym{POSIX}
+1003.1-2001 says that @code{*} stands for itself when it appears at
+the start of a regular expression or subexpression, but many
+non-@acronym{GNU} implementations do not support this and portable
+scripts should instead use @code{\*} in these contexts.
+@item \+
+@cindex @acronym{GNU} extensions, to basic regular expressions
+As @code{*}, but matches one or more.  It is a @acronym{GNU} extension.
+@item \?
+@cindex @acronym{GNU} extensions, to basic regular expressions
+As @code{*}, but only matches zero or one.  It is a @acronym{GNU} extension.
+@item \@{@var{i}\@}
+As @code{*}, but matches exactly @var{i} sequences (@var{i} is a
+decimal integer; for portability, keep it between 0 and 255
+inclusive).
+@item \@{@var{i},@var{j}\@}
+Matches between @var{i} and @var{j}, inclusive, sequences.
+@item \@{@var{i},\@}
+Matches @var{i} or more sequences.
+@item \(@var{regexp}\)
+Groups the inner @var{regexp} as a whole.  This is used to:
+@itemize @bullet
+@item
+@cindex @acronym{GNU} extensions, to basic regular expressions
+Apply postfix operators, like @code{\(abcd\)*}:
+this will search for zero or more whole sequences
+of @samp{abcd}, while @code{abcd*} would search
+for @samp{abc} followed by zero or more occurrences
+of @samp{d}.  Note that support for @code{\(abcd\)*} is
+required by @acronym{POSIX} 1003.1-2001, but many non-@acronym{GNU}
+implementations do not support it and hence it is not universally
+portable.
+@item
+Use back references (see below).
+@end itemize
+@item .
+Matches any character, including newline.
+@item ^
+Matches the null string at beginning of line, i.e. what
+appears after the circumflex must appear at the
+beginning of line. @code{^#include} will match only
+lines where @samp{#include} is the first thing on the line---if
+there are spaces before, for example, the match fails.
+@code{^} acts as a special character only at the beginning
+of the regular expression or subexpression (that is,
+after @code{\(} or @code{\|}).  Portable scripts should avoid
+@code{^} at the beginning of a subexpression, though, as
+@acronym{POSIX} allows implementations that treat @code{^} as
+an ordinary character in that context.
+@item $
+It is the same as @code{^}, but refers to end of line.
+@code{$} also acts as a special character only at the end
+of the regular expression or subexpression (that is, before @code{\)}
+or @code{\|}), and its use at the end of a subexpression is not
+portable.
+@item [@var{list}]
+@itemx [^@var{list}]
+Matches any single character in @var{list}: for example,
+@code{[aeiou]} matches all vowels.  A list may include
+sequences like @code{@var{char1}-@var{char2}}, which
+matches any character between (inclusive) @var{char1}
+and @var{char2}.
+A leading @code{^} reverses the meaning of @var{list}, so that
+it matches any single character @emph{not} in @var{list}.  To include
+@code{]} in the list, make it the first character (after
+the @code{^} if needed); to include @code{-} in the list,
+make it the first or last; to include @code{^} put
+it after the first character.
+@cindex @code{POSIXLY_CORRECT} behavior, bracket expressions
+The characters @code{$}, @code{*}, @code{.}, @code{[}, and @code{\}
+are normally not special within @var{list}.  For example, @code{[\*]}
+matches either @samp{\} or @samp{*}, because the @code{\} is not
+special here.  However, strings like @code{[.ch.]}, @code{[=a=]}, and
+@code{[:space:]} are special within @var{list} and represent collating
+symbols, equivalence classes, and character classes, respectively, and
+@code{[} is therefore special within @var{list} when it is followed by
+@code{.}, @code{=}, or @code{:}.  Also, when not in
+@env{POSIXLY_CORRECT} mode, special escapes like @code{\n} and
+@code{\t} are recognized within @var{list}.  @xref{Escapes}.
+@item @var{regexp1}\|@var{regexp2}
+@cindex @acronym{GNU} extensions, to basic regular expressions
+Matches either @var{regexp1} or @var{regexp2}.  Use
+parentheses to use complex alternative regular expressions.
+The matching process tries each alternative in turn, from
+left to right, and the first one that succeeds is used.
+It is a @acronym{GNU} extension.
+@item @var{regexp1}@var{regexp2}
+Matches the concatenation of @var{regexp1} and @var{regexp2}.
+Concatenation binds more tightly than @code{\|}, @code{^}, and
+@code{$}, but less tightly than the other regular expression
+operators.
+@item \@var{digit}
+Matches the @var{digit}-th @code{\(@dots{}\)} parenthesized
+subexpression in the regular expression.  This is called a @dfn{back
+reference}.  Subexpressions are implicitly numbered by counting
+occurrences of @code{\(} left-to-right.
+@item \n
+Matches the newline character.
+@item \@var{char}
+Matches @var{char}, where @var{char} is one of @code{$},
+@code{*}, @code{.}, @code{[}, @code{\}, or @code{^}.
+Note that the only C-like
+backslash sequences that you can portably assume to be
+interpreted are @code{\n} and @code{\\}; in particular
+@code{\t} is not portable, and matches a @samp{t} under most
+implementations of @command{sed}, rather than a tab character.
+@end table
+@cindex Greedy regular expression matching
+Note that the regular expression matcher is greedy, i.e., matches
+are attempted from left to right and, if two or more matches are
+possible starting at the same character, it selects the longest.
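+The greedy rule can be sketched as follows, assuming a POSIX shell.
+The variable names @code{greedy} and @code{bounded} are hypothetical:

```shell
# The greedy '.*' extends to the last 'X', so 'aXbX' is replaced.
greedy=$(echo 'aXbXc' | sed 's/a.*X/-/')
# Excluding 'X' from the repeated set stops the match at the first 'X'.
bounded=$(echo 'aXbXc' | sed 's/a[^X]*X/-/')
printf '%s\n' "$greedy" "$bounded"
```

+The first command yields @samp{-c}, the second @samp{-bXc}.  A negated
+bracket expression is a common way to limit a match when the longest
+match is not wanted.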
+@noindent
+Examples:
+@table @samp
+@item abcdef
+Matches @samp{abcdef}.
+@item a*b
+Matches zero or more @samp{a}s followed by a single
+@samp{b}.  For example, @samp{b} or @samp{aaaaab}.
+@item a\?b
+Matches @samp{b} or @samp{ab}.
+@item a\+b\+
+Matches one or more @samp{a}s followed by one or more
+@samp{b}s: @samp{ab} is the shortest possible match, but
+other examples are @samp{aaaab} or @samp{abbbbb} or
+@samp{aaaaaabbbbbbb}.
+@item .*
+@itemx .\+
+These two both match all the characters in a string;
+however, the first matches every string (including the empty
+string), while the second matches only strings containing
+at least one character.
+@item ^main.*(.*)
+This matches a string starting with @samp{main},
+followed by an opening and closing
+parenthesis.  The @samp{n}, @samp{(} and @samp{)} need not
+be adjacent.
+@item ^#
+This matches a string beginning with @samp{#}.
+@item \\$
+This matches a string ending with a single backslash.  The
+regexp contains two backslashes for escaping.
+@item \$
+Instead, this matches a string consisting of a single dollar sign,
+because it is escaped.
+@item [a-zA-Z0-9]
+In the C locale, this matches any single @acronym{ASCII} letter or digit.
+@item [^ @kbd{tab}]\+
+(Here @kbd{tab} stands for a single tab character.)
+This matches a string of one or more
+characters, none of which is a space or a tab.
+Usually this means a word.
+@item ^\(.*\)\n\1$
+This matches a string consisting of two equal substrings separated by
+a newline.
+@item .\@{9\@}A$
+This matches nine characters followed by an @samp{A}.
+@item ^.\@{15\@}A
+This matches the start of a string that contains 16 characters,
+the last of which is an @samp{A}.
+@end table
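+A few of the constructs above can be tried from a shell.  This is only
+a sketch that restricts itself to portable basic regular expression
+syntax; the sample input strings and variable names are made up for
+illustration:

```shell
# Zero or more 'a' characters followed by a single 'b', anchored at both ends.
star=$(printf '%s\n' b ab aab abc | sed -n '/^a*b$/p')
# Back reference: the group 'abc' must be repeated immediately.
backref=$(printf '%s\n' abcabc abcabd | sed -n '/^\(abc\)\1$/p')
# Interval expression: exactly three digits on the line.
interval=$(printf '%s\n' 12 123 1234 | sed -n '/^[0-9]\{3\}$/p')
printf '%s\n' "$star" "$backref" "$interval"
```

+The first command selects @samp{b}, @samp{ab} and @samp{aab}; the
+second selects only @samp{abcabc}; the third selects only @samp{123}.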
+@node Common Commands
+@section Often-Used Commands
+If you use @command{sed} at all, you will quite likely want to know
+these commands.
+@table @code
+@item #
+[No addresses allowed.]
+@findex # (comments)
+@cindex Comments, in scripts
+The @code{#} character begins a comment;
+the comment continues until the next newline.
+@cindex Portability, comments
+If you are concerned about portability, be aware that
+some implementations of @command{sed} (which are not @sc{posix}
+conformant) may only support a single one-line comment,
+and then only when the very first character of the script is a @code{#}.
+@findex -n, forcing from within a script
+@cindex Caveat --- #n on first line
+Warning: if the first two characters of the @command{sed} script
+are @code{#n}, then the @option{-n} (no-autoprint) option is forced.
+If you want to put a comment in the first line of your script
+and that comment begins with the letter @samp{n}
+and you do not want this behavior,
+then be sure to either use a capital @samp{N},
+or place at least one space before the @samp{n}.
+@item q [@var{exit-code}]
+This command only accepts a single address.
+@findex q (quit) command
+@cindex @value{SSEDEXT}, returning an exit code
+@cindex Quitting
+Exit @command{sed} without processing any more commands or input.
+Note that the current pattern space is printed if auto-print is
+not disabled with the @option{-n} option.  The ability to return
+an exit code from the @command{sed} script is a @value{SSED} extension.
+@item a\
+@itemx @var{text}
+Append @var{text} after a line.
+@item a @var{text}
+Append @var{text} after a line (alternative syntax).
+@item b @var{label}
+Branch unconditionally to @var{label}.
+The @var{label} may be omitted, in which case the next cycle is started.
+@item c\
+@itemx @var{text}
+Replace (change) lines with @var{text}.
+@item c @var{text}
+Replace (change) lines with @var{text} (alternative syntax).
 @item d
-@findex d (delete) command
-@cindex Text, deleting
 Delete the pattern space;
 immediately start next cycle.
+@item p
+@findex p (print) command
+@cindex Text, printing
+Print out the pattern space (to the standard output).
+This command is usually only used in conjunction with the @option{-n}
+command-line option.
+@item D
+If pattern space contains newlines, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
+If pattern space contains no newline, start a normal new cycle as if
+the @code{d} command was issued.
+@c TODO: add a section about D+N and D+n commands
+@item e
+Executes the command that is found in pattern space and
+replaces the pattern space with the output; a trailing newline
+is suppressed.
+@item e @var{command}
+Executes @var{command} and sends its output to the output stream.
+The command can run across multiple lines, all but the last ending with
+a backslash.
+@item F
+(filename) Print the file name of the current input file (with a trailing
+newline).
+@item g
+Replace the contents of the pattern space with the contents of the hold space.
+@item G
+Append a newline to the contents of the pattern space,
+and then append the contents of the hold space to that of the pattern space.
+@item h
+(hold) Replace the contents of the hold space with the contents of the
+pattern space.
+@item H
+Append a newline to the contents of the hold space,
+and then append the contents of the pattern space to that of the hold space.
+@item i\
+@itemx @var{text}
+Insert @var{text} before a line.
+@item i @var{text}
+Insert @var{text} before a line (alternative syntax).
+@item l
+Print the pattern space in an unambiguous form.
 @item n
+@findex n (next-line) command
+@cindex Next input line, replace pattern space with
+@cindex Read next input line
+If auto-print is not disabled, print the pattern space,
+(next) If auto-print is not disabled, print the pattern space,
 then, regardless, replace the pattern space with the next line of input.
 If there is no more input then @command{sed} exits without processing
 any more commands.
+@item @{ @var{commands} @}
+@findex @{@} command grouping
+@cindex Grouping commands
+@cindex Command groups
+A group of commands may be enclosed between
+@code{@{} and @code{@}} characters.
+This is particularly useful when you want a group of commands
+to be triggered by a single address (or address-range) match.
+@item N
+Add a newline to the pattern space,
+then append the next line of input to the pattern space.
+If there is no more input then @command{sed} exits without processing
+any more commands.
+@item p
+Print the pattern space.
+@c useful with @option{-n}
+@item P
+Print the pattern space, up to the first <newline>.
+@item q@var{[exit-code]}
+(quit) Exit @command{sed} without processing any more commands or input.
+@item Q@var{[exit-code]}
+(quit) This command is the same as @code{q}, but will not print the
+contents of pattern space.  Like @code{q}, it provides the
+ability to return an exit code to the caller.
+@c useful to quit on a conditional without printing
+@item r filename
+Reads file @var{filename}.
+@item R filename
+Queue a line of @var{filename} to be read and
+inserted into the output stream at the end of the current cycle,
+or when the next input line is read.
+@c useful to interleave files
+@item s@var{/regexp/replacement/[flags]}
+(substitute) Match the regular-expression against the content of the
+pattern space.  If found, replace matched string with
+@var{replacement}.
+@item t @var{label}
+(test) Branch to @var{label} only if there has been a successful
+@code{s}ubstitution since the last input line was read or conditional
+branch was taken.  The @var{label} may be omitted, in which case the
+next cycle is started.
+@item T @var{label}
+(test) Branch to @var{label} only if there have been no successful
+@code{s}ubstitutions since the last input line was read or
+conditional branch was taken. The @var{label} may be omitted,
+in which case the next cycle is started.
+@item v @var{[version]}
+(version) This command does nothing, but makes @command{sed} fail if
+@value{SSED} extensions are not supported, or if the requested version
+is not available.
+@item w filename
+Write the pattern space to @var{filename}.
+@item W filename
+Write to the given filename the portion of the pattern space up to
+the first newline.
+@item x
+Exchange the contents of the hold and pattern spaces.
+@item y/@var{source-chars}/@var{dest-chars}/
+Transliterate any characters in the pattern space which match
+any of the @var{source-chars} with the corresponding character
+in @var{dest-chars}.
+@item z
+(zap) This command empties the content of pattern space.
+@item #
+A comment, until the next newline.
+@item @{ @var{cmd ; cmd ...} @}
+Group several commands together.
+@c useful for multiple commands on same address
+@item =
+Print the current input line number (with a trailing newline).
+@item : @var{label}
+Specify the location of @var{label} for branch commands (@code{b},
+@code{t}, @code{T}).
 @end table
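+As an illustration of how the hold-space commands in the table above
+combine, here is a minimal sketch that prints its input in reverse
+order.  It assumes a POSIX shell with @command{seq}; the variable name
+@code{reversed} is hypothetical:

```shell
# 1!G  on every line except the first, append the hold space to the pattern space
# h    copy the pattern space into the hold space
# $p   on the last line, print the accumulated pattern space
reversed=$(seq 3 | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```

+Each cycle puts the new line in front of the lines collected so far, so
+the three input lines come out as 3, 2, 1.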
 @node The "s" Command
 @section The @code{s} Command
+The syntax of the @code{s} (as in substitute) command is
+@samp{s/@var{regexp}/@var{replacement}/@var{flags}}.  The @code{/}
+characters may be uniformly replaced by any other single
+character within any given @code{s} command.  The @code{/}
+character (or whatever other character is used in its stead)
+can appear in the @var{regexp} or @var{replacement}
+only if it is preceded by a @code{\} character.
+The @code{s} command is probably the most important in @command{sed}
+and has a lot of different options.  Its basic concept is simple:
+the @code{s} command attempts to match the pattern
+space against the supplied @var{regexp}; if the match is
+successful, then that portion of the pattern
+space which was matched is replaced with @var{replacement}.
+The @code{s} command (as in substitute) is probably the most important
+in @command{sed} and has a lot of different options.  The syntax of
+the @code{s} command is
+@samp{s/@var{regexp}/@var{replacement}/@var{flags}}.
+Its basic concept is simple: the @code{s} command attempts to match
+the pattern space against the supplied regular expression @var{regexp};
+if the match is successful, then that portion of the
+pattern space which was matched is replaced with @var{replacement}.
+For details about @var{regexp} syntax, see @ref{Regexp Addresses,,Regular
+Expression Addresses}.
 @cindex Backreferences, in regular expressions
 …
 characters which reference the whole matched portion
 of the pattern space.
+@c TODO: xref to backreference section mention @var{\'}.
+The @code{/}
+characters may be uniformly replaced by any other single
+character within any given @code{s} command.  The @code{/}
+character (or whatever other character is used in its stead)
+can appear in the @var{regexp} or @var{replacement}
+only if it is preceded by a @code{\} character.
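+A short sketch of the delimiter rule, assuming a POSIX shell.  The path
+and the variable names are made up for illustration:

```shell
# With '/' as the delimiter, every '/' in the regexp and replacement needs a backslash.
escaped=$(echo /usr/local/bin | sed 's/\/usr\/local/\/opt/')
# With '|' as the delimiter, the slashes are ordinary characters.
alternate=$(echo /usr/local/bin | sed 's|/usr/local|/opt|')
printf '%s\n' "$escaped" "$alternate"
```

+Both commands produce @samp{/opt/bin}; choosing a delimiter that does
+not occur in the pattern avoids the backslashes.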
 @cindex @value{SSEDEXT}, case modifiers in @code{s} commands
 Finally, as a @value{SSED} extension, you can include a
 …
 Stop case conversion started by @code{\L} or @code{\U}.
 @end table
+When the @code{g} flag is being used, case conversion does not
+propagate from one occurrence of the regular expression to
+another.  For example, when the following command is executed
+with @samp{a-b-} in pattern space:
+@example
+s/\(b\?\)-/x\u\1/g
+@end example
+@noindent
+the output is @samp{axxB}.  When replacing the first @samp{-},
+the @samp{\u} sequence only affects the empty replacement of
+@samp{\1}.  It does not affect the @code{x} character that is
+added to pattern space when replacing @code{b-} with @code{xB}.
+On the other hand, @code{\l} and @code{\u} do affect the remainder
+of the replacement text if they are followed by an empty substitution.
+With @samp{a-b-} in pattern space, the following command:
+@example
+s/\(b\?\)-/\u\1x/g
+@end example
+@noindent
+will replace @samp{-} with @samp{X} (uppercase) and @samp{b-} with
+@samp{Bx}.  If this behavior is undesirable, you can prevent it by
+adding a @samp{\E} sequence---after @samp{\1} in this case.
 To include a literal @code{\}, @code{&}, or newline in the final
 …
 Only replace the @var{number}th match of the @var{regexp}.
+@cindex @acronym{GNU} extensions, @code{g} and @var{number} modifier interaction in @code{s} command
+@cindex GNU extensions, @code{g} and @var{number} modifier
+interaction in @code{s} command
 @cindex Mixing @code{g} and @var{number} modifiers in the @code{s} command
 Note: the @sc{posix} standard does not specify what should happen
 …
 change in future versions.
 @item w @var{file-name}
+@item w @var{filename}
 @cindex Text, writing to a file after substitution
 @cindex @value{SSEDEXT}, @file{/dev/stdout} file
 @cindex @value{SSEDEXT}, @file{/dev/stderr} file
 If the substitution was made, then write out the result to the named file.
 As a @value{SSED} extension, two special values of @var{file-name} are
+As a @value{SSED} extension, two special values of @var{filename} are
 supported: @file{/dev/stderr}, which writes the result to the standard
 error, and @file{/dev/stdout}, which writes to the standard
 …
 @item I
 @itemx i
 @cindex @acronym{GNU} extensions, @code{I} modifier
+@cindex GNU extensions, @code{I} modifier
 @cindex Case-insensitive matching
+@ifset PERL
+@cindex Perl-style regular expressions, case-insensitive
+@end ifset
+The @code{I} modifier to regular-expression matching is a @acronym{GNU}
+The @code{I} modifier to regular-expression matching is a GNU
 extension which makes @command{sed} match @var{regexp} in a
 case-insensitive manner.
 …
 @itemx m
 @cindex @value{SSEDEXT}, @code{M} modifier
-@ifset PERL
-@cindex Perl-style regular expressions, multiline
-@end ifset
 The @code{M} modifier to regular-expression matching is a @value{SSED}
+extension which causes @code{^} and @code{$} to match respectively
+(in addition to the normal behavior) the empty string after a newline,
+and the empty string before a newline.  There are special character
+sequences
+@ifset PERL
+(@code{\A} and @code{\Z} in Perl mode, @code{\`} and @code{\'}
+in basic or extended regular expression modes)
+@end ifset
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode.  The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline.  There are
+special character sequences
 @ifclear PERL
 (@code{\`} and @code{\'})
 @end ifclear
 which always match the beginning or the end of the buffer.
+@code{M} stands for @cite{multi-line}.
+@ifset PERL
+@item S
+@itemx s
+@cindex @value{SSEDEXT}, @code{S} modifier
+@cindex Perl-style regular expressions, single line
+The @code{S} modifier to regular-expression matching is only valid
+in Perl mode and specifies that the dot character (@code{.}) will
+match the newline character too.  @code{S} stands for @cite{single-line}.
+@end ifset
+@ifset PERL
+@item X
+@itemx x
+@cindex @value{SSEDEXT}, @code{X} modifier
+@cindex Perl-style regular expressions, extended
+The @code{X} modifier to regular-expression matching is also
+valid in Perl mode only.  If it is used, whitespace in the
+pattern (other than in a character class) and
+characters between a @kbd{#} outside a character class and the
+next newline character are ignored. An escaping backslash
+can be used to include a whitespace or @kbd{#} character as part
+of the pattern.
+@end ifset
+In addition,
+the period character does not match a new-line character in
+multi-line mode.
+@end table
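+The effect of the @code{M} modifier can be sketched as follows.  This
+assumes GNU @command{sed}, since @code{M} is an extension; the variable
+names are hypothetical:

```shell
# N joins the two input lines, so the pattern space holds 'a', a newline, then 'b'.
# Without M, '^' matches only at the start of the buffer, so nothing is replaced.
plain=$(printf 'a\nb\n' | sed 'N;s/^b/X/')
# With M, '^' also matches just after the embedded newline.
multi=$(printf 'a\nb\n' | sed 'N;s/^b/X/M')
printf '%s\n' "$plain" "$multi"
```

+The first command leaves the two lines unchanged, while the second
+replaces the @samp{b} on the second line with @samp{X}.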
+@node Common Commands
+@section Often-Used Commands
+If you use @command{sed} at all, you will quite likely want to know
+these commands.
+@table @code
+@item #
+[No addresses allowed.]
+@findex # (comments)
+@cindex Comments, in scripts
+The @code{#} character begins a comment;
+the comment continues until the next newline.
+@cindex Portability, comments
+If you are concerned about portability, be aware that
+some implementations of @command{sed} (which are not @sc{posix}
+conforming) may only support a single one-line comment,
+and then only when the very first character of the script is a @code{#}.
+@findex -n, forcing from within a script
+@cindex Caveat --- #n on first line
+Warning: if the first two characters of the @command{sed} script
+are @code{#n}, then the @option{-n} (no-autoprint) option is forced.
+If you want to put a comment in the first line of your script
+and that comment begins with the letter @samp{n}
+and you do not want this behavior,
+then be sure to either use a capital @samp{N},
+or place at least one space before the @samp{n}.
+@item q [@var{exit-code}]
+@findex q (quit) command
+@cindex @value{SSEDEXT}, returning an exit code
+@cindex Quitting
+Exit @command{sed} without processing any more commands or input.
+Example: stop after printing the second line:
+@example
+$ seq 3 | sed 2q
+1
+2
+@end example
+This command accepts only one address.
+Note that the current pattern space is printed if auto-print is
+not disabled with the @option{-n} option.  The ability to return
+an exit code from the @command{sed} script is a @value{SSED} extension.
+See also the @value{SSED} extension @code{Q} command which quits silently
+without printing the current pattern space.
+@item d
+@findex d (delete) command
+@cindex Text, deleting
+Delete the pattern space;
+immediately start next cycle.
+Example: delete the second input line:
+@example
+$ seq 3 | sed 2d
+1
+3
+@end example
+@item p
+@findex p (print) command
+@cindex Text, printing
+Print out the pattern space (to the standard output).
+This command is usually only used in conjunction with the @option{-n}
+command-line option.
+Example: print only the second input line:
+@example
+$ seq 3 | sed -n 2p
+2
+@end example
+@item n
+@findex n (next-line) command
+@cindex Next input line, replace pattern space with
+@cindex Read next input line
+If auto-print is not disabled, print the pattern space,
+then, regardless, replace the pattern space with the next line of input.
+If there is no more input then @command{sed} exits without processing
+any more commands.
+This command is useful to skip lines (e.g. process every Nth line).
+Example: perform substitution on every 3rd line (i.e. two @code{n} commands
+skip two lines):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 6 | sed 'n;n;s/./x/'
+1
+2
+x
+4
+5
+x
+@end example
+@value{SSED} provides an extension address syntax of @var{first}~@var{step}
+to achieve the same result:
+@example
+$ seq 6 | sed '0~3s/./x/'
+1
+2
+x
+4
+5
+x
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@item @{ @var{commands} @}
+@findex @{@} command grouping
+@cindex Grouping commands
+@cindex Command groups
+A group of commands may be enclosed between
+@code{@{} and @code{@}} characters.
+This is particularly useful when you want a group of commands
+to be triggered by a single address (or address-range) match.
+Example: perform substitution then print the second input line:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed -n '2@{s/2/X/ ; p@}'
+X
+@end example
+@codequoteundirected off
+@codequotebacktick off
 @end table
 …
 @table @code
 @item y/@var{source-chars}/@var{dest-chars}/
-(The @code{/} characters may be uniformly replaced by
-any other single character within any given @code{y} command.)
 @findex y (transliterate) command
 @cindex Transliteration
 …
 in @var{dest-chars}.
+Example: transliterate @samp{a-j} into @samp{0-9}:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ echo hello world | sed 'y/abcdefghij/0123456789/'
+74llo worl3
+@end example
+@codequoteundirected off
+@codequotebacktick off
+(The @code{/} characters may be uniformly replaced by
+any other single character within any given @code{y} command.)
 Instances of the @code{/} (or whatever other character is used in its stead),
 @code{\}, or newlines can appear in the @var{source-chars} or @var{dest-chars}
 …
 contain the same number of characters (after de-escaping).
+See the @command{tr} command from GNU coreutils for similar functionality.
+@item a @var{text}
+Append @var{text} after a line.  This is a GNU extension
+to the standard @code{a} command; see below for details.
+Example: Add @samp{hello} after the second line:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2a hello'
+1
+2
+hello
+3
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Leading whitespace after the @code{a} command is ignored.
+The text to add is read until the end of the line.
 @item a\
 @itemx @var{text}
-@cindex @value{SSEDEXT}, two addresses supported by most commands
-As a @acronym{GNU} extension, this command accepts two addresses.
 @findex a (append text lines) command
 @cindex Appending text after a line
 @cindex Text, appending
+Queue the lines of text which follow this command
+Append @var{text} after a line.
+Example: Add @samp{hello} after the second line
+(@print{} indicates printed output lines):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2a\
+hello'
+@print{}1
+@print{}2
+@print{}hello
+@print{}3
+@end example
+@codequoteundirected off
+@codequotebacktick off
+The @code{a} command queues the lines of text which follow this command
 (each but the last ending with a @code{\},
 which are removed from the output)
 …
 or when the next input line is read.
+@cindex @value{SSEDEXT}, two addresses supported by most commands
+As a GNU extension, this command accepts two addresses.
 Escape sequences in @var{text} are processed, so you should
 use @code{\\} in @var{text} to print a single backslash.
+As a @acronym{GNU} extension, if between the @code{a} and the newline there is
+other than a whitespace-@code{\} sequence, then the text of this line,
+starting at the first non-whitespace character after the @code{a},
+is taken as the first line of the @var{text} block.
+(This enables a simplification in scripting a one-line add.)
+This extension also works with the @code{i} and @code{c} commands.
+The commands resume after the last line without a backslash (@code{\}) -
+@samp{world} in the following example:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2a\
+hello\
+world
+3s/./X/'
+@print{}1
+@print{}2
+@print{}hello
+@print{}world
+@print{}X
+@end example
+@codequoteundirected off
+@codequotebacktick off
+As a GNU extension, the @code{a} command and @var{text} can be
+separated into two @code{-e} parameters, enabling easier scripting:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed -e '2a\' -e hello
+1
+2
+hello
+3
+$ sed -e '2a\' -e "$VAR"
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item i @var{text}
+Insert @var{text} before a line.  This is a GNU extension
+to the standard @code{i} command; see below for details.
+Example: Insert @samp{hello} before the second line:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2i hello'
+1
+hello
+2
+3
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Leading whitespace after the @code{i} command is ignored.
+The text to add is read until the end of the line.
+@anchor{insert command}
 @item i\
 @itemx @var{text}
-@cindex @value{SSEDEXT}, two addresses supported by most commands
-As a @acronym{GNU} extension, this command accepts two addresses.
 @findex i (insert text lines) command
 @cindex Inserting text before a line
 @cindex Text, insertion
+Immediately output the lines of text which follow this command
+(each but the last ending with a @code{\},
+which are removed from the output).
+Immediately output the lines of text which follow this command.
+Example: Insert @samp{hello} before the second line
+(@print{} indicates printed output lines):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2i\
+hello'
+@print{}1
+@print{}hello
+@print{}2
+@print{}3
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex @value{SSEDEXT}, two addresses supported by most commands
+As a GNU extension, this command accepts two addresses.
+Escape sequences in @var{text} are processed, so you should
+use @code{\\} in @var{text} to print a single backslash.
+The commands resume after the last line without a backslash (@code{\}) -
+@samp{world} in the following example:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2i\
+hello\
+world
+s/./X/'
+@print{}X
+@print{}hello
+@print{}world
+@print{}X
+@print{}X
+@end example
+@codequoteundirected off
+@codequotebacktick off
+As a GNU extension, the @code{i} command and @var{text} can be
+separated into two @code{-e} parameters, enabling easier scripting:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed -e '2i\' -e hello
+1
+hello
+2
+3
+$ sed -e '2i\' -e "$VAR"
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item c @var{text}
+Replace the line(s) with @var{text}.  This is a GNU extension
+to the standard @code{c} command; see below for details.
+Example: Replace the 2nd to 9th lines with the word @samp{hello}:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 10 | sed '2,9c hello'
+1
+hello
+10
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Leading whitespace after the @code{c} command is ignored.
+The text to add is read until the end of the line.
 @item c\
 …
 @cindex Replacing selected lines with other text
 Delete the lines matching the address or address-range,
+and output the lines of text which follow this command
+(each but the last ending with a @code{\},
+which are removed from the output)
+in place of the last line
+(or in place of each line, if no addresses were specified).
+and output the lines of text which follow this command.
+Example: Replace 2nd to 4th lines with the words @samp{hello} and
+@samp{world} (@print{} indicates printed output lines):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 5 | sed '2,4c\
+hello\
+world'
+@print{}1
+@print{}hello
+@print{}world
+@print{}5
+@end example
+@codequoteundirected off
+@codequotebacktick off
+If no addresses are given, each line is replaced.
 A new cycle is started after this command is done,
 since the pattern space will have been deleted.
+In the following example, the @code{c} starts a
+new cycle and the substitution command is not performed
+on the replaced text:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2c\
+hello
+s/./X/'
+@print{}X
+@print{}hello
+@print{}X
+@end example
+@codequoteundirected off
+@codequotebacktick off
+As a GNU extension, the @code{c} command and @var{text} can be
+separated into two @code{-e} parameters, enabling easier scripting:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed -e '2c\' -e hello
+1
+hello
+3
+$ sed -e '2c\' -e "$VAR"
+@end example
+@codequoteundirected off
+@codequotebacktick off
 @item =
-@cindex @value{SSEDEXT}, two addresses supported by most commands
-As a @acronym{GNU} extension, this command accepts two addresses.
 @findex = (print line number) command
 @cindex Printing line number
 @cindex Line number, printing
 Print out the current input line number (with a trailing newline).
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ printf '%s\n' aaa bbb ccc | sed =
+1
+aaa
+2
+bbb
+3
+ccc
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex @value{SSEDEXT}, two addresses supported by most commands
+As a GNU extension, this command accepts two addresses.
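A common use of @code{=} is counting input lines: combined with @option{-n} and the @code{$} address, only the number of the last line is printed. The following is a minimal sketch; the sample input and the variable name @code{count} are made up for illustration.

```shell
# Count input lines: -n suppresses automatic printing, and '$=' prints
# the line number only when the last line is reached.
count=$(printf '%s\n' aaa bbb ccc | sed -n '$=')
echo "$count"
```

The result is comparable to what @command{wc -l} reports for the same input.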
 @item l @var{n}
 …
 @item r @var{filename}
-@cindex @value{SSEDEXT}, two addresses supported by most commands
-As a @acronym{GNU} extension, this command accepts two addresses.
 @findex r (read file) command
 @cindex Read text from a file
+Reads file @var{filename}. Example:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ seq 3 | sed '2r/etc/hostname'
+1
+2
+fencepost.gnu.org
+3
+@end example
+@codequoteundirected off
+@codequotebacktick off
 @cindex @value{SSEDEXT}, @file{/dev/stdin} file
 Queue the contents of @var{filename} to be read and
 …
 standard input.
+@cindex @value{SSEDEXT}, two addresses supported by most commands
+As a GNU extension, this command accepts two addresses. The
+file will then be reread and inserted on each of the addressed lines.
+As a @value{SSED} extension, the @code{r} command accepts a zero address,
+inserting a file @emph{before} the first line of the input
+(@pxref{Adding a header to multiple files}).
 @item w @var{filename}
 @findex w (write file) command
 …
 @cindex @value{SSEDEXT}, @file{/dev/stderr} file
 Write the pattern space to @var{filename}.
-As a @value{SSED} extension, two special values of @var{file-name} are
+As a @value{SSED} extension, two special values of @var{filename} are
 supported: @file{/dev/stderr}, which writes the result to the standard
 error, and @file{/dev/stdout}, which writes to the standard
 …
 option is being used.}
-The file will be created (or truncated) before the
-first input line is read; all @code{w} commands
-(including instances of @code{w} flag on successful @code{s} commands)
-which refer to the same @var{filename} are output without
-closing and reopening the file.
+The file will be created (or truncated) before the first input line is
+read; all @code{w} commands (including instances of the @code{w} flag
+on successful @code{s} commands) which refer to the same @var{filename}
+are output without closing and reopening the file.
 @item D
 @findex D (delete first line) command
 @cindex Delete first line from pattern space
-Delete text in the pattern space up to the first newline.
-If any text is left, restart cycle with the resultant
-pattern space (without reading a new line of input),
-otherwise start a normal new cycle.
+If pattern space contains no newline, start a normal new cycle as if
+the @code{d} command was issued.  Otherwise, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
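The restart behavior of @code{D} is what makes sliding-window scripts possible. As an illustrative sketch (the input data and the variable name @code{unique} are assumptions made for the example), the following keeps a two-line window and drops repeated adjacent lines, similar to @command{uniq}:

```shell
# Collapse runs of identical adjacent lines, similar to uniq(1).
# $!N  append the next input line, except on the last line
# P    print the first line unless both lines in the window are identical
# D    drop the first line and restart the cycle without reading input
unique=$(printf '%s\n' a a b b c | sed '$!N; /^\(.*\)\n\1$/!P; D')
printf '%s\n' "$unique"
```

Because @code{D} restarts the cycle with the remaining text, the second line of each window becomes the first line of the next one.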
 @item N
 …
 If there is no more input then @command{sed} exits without processing
 any more commands.
+When @option{-z} is used, a zero byte (the ASCII @samp{NUL} character) is
+added between the lines (instead of a newline).
+If there is no next input line, @value{SSED} by default prints the pattern
+space before exiting (unless @option{-n} is in effect), rather than exiting
+without printing.  This is a GNU extension which can be disabled with
+@option{--posix}.
+@xref{N_command_last_line,,N command on the last line}.
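Scripts that should not depend on how @code{N} behaves on the last line can guard it with the @code{$!} address. The following is a small sketch under that assumption; the variable name @code{pairs} is only illustrative.

```shell
# Join lines in pairs. '$!N' skips N on the last line, so the result
# does not depend on what N does when no next line exists.
pairs=$(seq 3 | sed '$!N; s/\n/ /')
printf '%s\n' "$pairs"
```

With three input lines, the first two are joined and the third is printed on its own.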
 @item P
 …
 If a parameter is specified, instead, the @code{e} command
-interprets it as a command and sends its output to the output stream
-(like @code{r} does).  The command can run across multiple
-lines, all but the last ending with a back-slash.
+interprets it as a command and sends its output to the output stream.
+The command can run across multiple lines, all but the last ending with
+a back-slash.
 In both cases, the results are undefined if the command to be
 executed contains a @sc{nul} character.
-@item L @var{n}
-@findex L (fLow paragraphs) command
-@cindex Reformat pattern space
-@cindex Reformatting paragraphs
-@cindex @value{SSEDEXT}, reformatting paragraphs
-@cindex @value{SSEDEXT}, @code{L} command
-This @value{SSED} extension fills and joins lines in pattern space
-to produce output lines of (at most) @var{n} characters, like
-@code{fmt} does; if @var{n} is omitted, the default as specified
-on the command line is used.  This command is considered a failed
-experiment and unless there is enough request (which seems unlikely)
-will be removed in future versions.
-@ignore
-Blank lines, spaces between words, and indentation are
-preserved in the output; successive input lines with different
-indentation are not joined; tabs are expanded to 8 columns.
-If the pattern space contains multiple lines, they are joined, but
-since the pattern space usually contains a single line, the behavior
-of a simple @code{L;d} script is the same as @samp{fmt -s} (i.e.,
-it does not join short lines to form longer ones).
-@var{n} specifies the desired line-wrap length; if omitted,
-the default as specified on the command line is used.
-@end ignore
+Note that, unlike the @code{r} command, the output of the command will
+be printed immediately; the @code{r} command instead delays the output
+to the end of the current cycle.
+@item F
+@findex F (File name) command
+@cindex Printing file name
+@cindex File name, printing
+Print out the file name of the current input file (with a trailing
+newline).
 @item Q [@var{exit-code}]
-This command only accepts a single address.
+This command accepts only one address.
 @findex Q (silent Quit) command
 …
 @example
 :eat
-$d       @i{Quit silently on the last line}
-N        @i{Read another line, silently}
-g        @i{Overwrite pattern space each time to save memory}
+$d       @i{@r{Quit silently on the last line}}
+N        @i{@r{Read another line, silently}}
+g        @i{@r{Overwrite pattern space each time to save memory}}
 b eat
 @end example
 …
 the first newline.  Everything said under the @code{w} command about
 file handling holds here too.
+@item z
+@findex z (Zap) command
+@cindex @value{SSEDEXT}, emptying pattern space
+@cindex Emptying pattern space
+This command empties the content of pattern space.  It is
+usually the same as @samp{s/.*//}, but is more efficient
+and works in the presence of invalid multibyte sequences
+in the input stream.  @sc{posix} mandates that such sequences
+are @emph{not} matched by @samp{.}, so that there is no portable
+way to clear @command{sed}'s buffers in the middle of the
+script in most multibyte locales (including UTF-8 locales).
 @end table
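The effect of the @code{z} command described above can be observed by following it with a command that only matches an empty pattern space. This is a minimal sketch; the placeholder text and the variable name @code{zapped} are arbitrary choices for the example.

```shell
# z empties the pattern space; the following s command then sees an
# empty pattern space and fills in a placeholder (the text is arbitrary).
zapped=$(printf '%s\n' abc def | sed 'z; s/^$/(empty)/')
printf '%s\n' "$zapped"
```

Each input line is emptied before the substitution runs, so every output line shows the placeholder.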
+@node Multiple commands syntax
+@section Multiple commands syntax
+@c POSIX says:
+@c   Editing commands other than {...}, a, b, c, i, r, t, w, :, and #
+@c   can be followed by a <semicolon>, optional <blank> characters, and
+@c   another editing command. However, when an s editing command is used
+@c   with the w flag, following it with another command in this manner
+@c   produces undefined results.
+There are several methods to specify multiple commands in a @command{sed}
+program.
+Using newlines is most natural when running a sed script from a file
+(using the @option{-f} option).
+On the command line, all @command{sed} commands may be separated by newlines.
+Alternatively, you may specify each command as an argument to an @option{-e}
+option:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 6 | sed '1d
+3d
+5d'
+2
+4
+6
+$ seq 6 | sed -e 1d -e 3d -e 5d
+2
+4
+6
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+A semicolon (@samp{;}) may be used to separate most simple commands:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 6 | sed '1d;3d;5d'
+2
+4
+6
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+The @code{@{},@code{@}},@code{b},@code{t},@code{T},@code{:} commands can
+be separated with a semicolon (this is a non-portable @value{SSED} extension).
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 4 | sed '@{1d;3d@}'
+2
+4
+$ seq 6 | sed '@{1d;3d@};5d'
+2
+4
+6
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Labels used in @code{b},@code{t},@code{T},@code{:} commands are read
+until a semicolon.  Leading and trailing whitespace is ignored.  In
+the examples below the label is @samp{x}.  The first example works
+with @value{SSED}.  The second is a portable equivalent.  For more
+information about branching and labels @pxref{Branching and flow
+control}.
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 3 | sed '/1/b x ; s/^/=/ ; :x ; 3d'
+1
+=2
+$ seq 3 | sed -e '/1/bx' -e 's/^/=/' -e ':x' -e '3d'
+1
+=2
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
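Labels and branches are most often combined into loops. As an illustrative sketch that follows the portable style above, each label, branch, and brace is given its own @option{-e} argument; the variable name @code{joined} is made up for the example.

```shell
# Join all input lines with commas, using a label and a branch.
# Each part that must end at a newline is given its own -e argument.
joined=$(seq 3 | sed -e ':a' -e '$!{' -e 'N' -e 'ba' -e '}' -e 's/\n/,/g')
echo "$joined"
```

The loop appends lines until the last one has been read, and only then performs the substitution on the accumulated pattern space.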
+@subsection Commands Requiring a newline
+The following commands cannot be separated by a semicolon and
+require a newline:
+@table @asis
+@item @code{a},@code{c},@code{i} (append/change/insert)
+All characters following @code{a},@code{c},@code{i} commands are taken
+as the text to append/change/insert.  Using a semicolon leads to
+undesirable results:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 2 | sed '1aHello ; 2d'
+1
+Hello ; 2d
+2
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Separate the commands using @option{-e} or a newline:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 2 | sed -e 1aHello -e 2d
+1
+Hello
+$ seq 2 | sed '1aHello
+2d'
+1
+Hello
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Note that specifying the text to add (@samp{Hello}) immediately
+after @code{a},@code{c},@code{i} is itself a @value{SSED} extension.
+A portable, POSIX-compliant alternative is:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 2 | sed '1a\
+Hello
+2d'
+1
+Hello
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item @code{#} (comment)
+All characters following @samp{#} until the next newline are ignored.
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 3 | sed '# this is a comment ; 2d'
+1
+2
+3
+$ seq 3 | sed '# this is a comment
+2d'
+1
+3
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item @code{r},@code{R},@code{w},@code{W} (reading and writing files)
+The @code{r},@code{R},@code{w},@code{W} commands parse the filename
+until end of the line.  If whitespace, comments or semicolons are found,
+they will be included in the filename, leading to unexpected results:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 2 | sed '1w hello.txt ; 2d'
+1
+2
+$ ls -log
+total 4
+-rw-rw-r-- 1 2 Jan 23 23:03 hello.txt ; 2d
+$ cat 'hello.txt ; 2d'
+1
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Note that @command{sed} silently ignores read/write errors in
+@code{r},@code{R},@code{w},@code{W} commands (such as missing files).
+In the following example, @command{sed} tries to read a file named
+@samp{@file{hello.txt ; N}}. The file is missing, and the error is silently
+ignored:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo x | sed '1rhello.txt ; N'
+x
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item @code{e} (command execution)
+Any characters following the @code{e} command until the end of the line
+will be sent to the shell.  If whitespace, comments or semicolons are found,
+they will be included in the shell command, leading to unexpected results:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo a | sed '1e touch foo#bar'
+a
+$ ls -1
+foo#bar
+$ echo a | sed '1e touch foo ; s/a/b/'
+sh: 1: s/a/b/: not found
+a
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@item @code{s///[we]} (substitute with @code{e} or @code{w} flags)
+In a substitution command, the @code{w} flag writes the substitution
+result to a file, and the @code{e} flag executes the substitution result
+as a shell command.  As with the @code{r/R/w/W/e} commands, these
+must be terminated with a newline.  If whitespace, comments or semicolons
+are found, they will be included in the shell command or filename, leading to
+unexpected results:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo a | sed 's/a/b/w1.txt#foo'
+b
+$ ls -1
+1.txt#foo
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@end table
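For the commands that read a file name up to the end of the line, the usual remedy is to end the command with a newline or a separate @option{-e} argument. The following sketch assumes a scratch directory created with @command{mktemp}; the names @code{workdir}, @code{shown} and @code{saved} are hypothetical.

```shell
# Work in a scratch directory so no stray files are left behind.
workdir=$(mktemp -d)
# The file name ends with the first -e argument, so '2d' is parsed as a
# separate command instead of becoming part of the name.
shown=$(seq 2 | sed -e "1w $workdir/hello.txt" -e 2d)
saved=$(cat "$workdir/hello.txt")
echo "printed: $shown, saved: $saved"
rm -r "$workdir"
```

Here line 1 is both written to the file and printed, while line 2 is deleted by the second command.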
+@node sed addresses
+@chapter Addresses: selecting lines
+@menu
+* Addresses overview::                Addresses overview
+* Numeric Addresses::                 selecting lines by numbers
+* Regexp Addresses::                  selecting lines by text matching
+* Range Addresses::                   selecting a range of lines
+* Zero Address::                      Using address @code{0}
+@end menu
+@node Addresses overview
+@section Addresses overview
+@cindex addresses, numeric
+@cindex numeric addresses
+Addresses determine on which line(s) the @command{sed} command will be
+executed. The following command replaces any first occurrence of @samp{hello}
+with @samp{world} only on line 144:
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed '144s/hello/world/' input.txt > output.txt
+@end example
+@codequoteundirected off
+@codequotebacktick off
+If no address is specified, the command is performed on all lines.
+The following command replaces @samp{hello} with @samp{world},
+targeting every line of the input file.
+However, note that it modifies only the first instance of @samp{hello}
+on each line.
+Use the @samp{g} modifier to affect every instance on each affected line.
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed 's/hello/world/' input.txt > output.txt
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex addresses, regular expression
+@cindex regular expression addresses
+Addresses can contain regular expressions to match lines based
+on content instead of line numbers. The following command replaces
+@samp{hello} with @samp{world} only on lines
+containing the string @samp{apple}:
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed '/apple/s/hello/world/' input.txt > output.txt
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex addresses, range
+@cindex range addresses
+An address range is specified with two addresses separated by a comma
+(@code{,}). Addresses can be numeric, regular expressions, or a mix of
+both.
+The following command replaces @samp{hello} with @samp{world}
+only on lines 4 to 17 (inclusive):
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed '4,17s/hello/world/' input.txt > output.txt
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex Excluding lines
+@cindex Selecting non-matching lines
+@cindex addresses, negating
+@cindex addresses, excluding
+Appending the @code{!} character to the end of an address
+specification (before the command letter) negates the sense of the
+match.  That is, if the @code{!} character follows an address or an
+address range, then only lines which do @emph{not} match the addresses
+will be selected. The following command replaces @samp{hello}
+with @samp{world} only on lines @emph{not} containing the string
+@samp{apple}:
+@example
+sed '/apple/!s/hello/world/' input.txt > output.txt
+@end example
+The following command replaces @samp{hello} with
+@samp{world} only on lines 1 to 3 and from line 18 to the last line of the
+input file (i.e. excluding lines 4 to 17):
+@example
+sed '4,17!s/hello/world/' input.txt > output.txt
+@end example
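The effect of negating a range is easiest to see on numbered input. A minimal sketch, with the variable name @code{outside} chosen only for illustration:

```shell
# Print only the lines outside the range 2-4.
outside=$(seq 5 | sed -n '2,4!p')
printf '%s\n' "$outside"
```

Only lines 1 and 5 are printed, since the @code{p} command applies to every line that the range does not select.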
+@node Numeric Addresses
+@section Selecting lines by numbers
+@cindex Addresses, in @command{sed} scripts
+@cindex Line selection
+@cindex Selecting lines to process
+Addresses in a @command{sed} script can be in any of the following forms:
+@table @code
+@item @var{number}
+@cindex Address, numeric
+@cindex Line, selecting by number
+Specifying a line number will match only that line in the input.
+(Note that @command{sed} counts lines continuously across all input files
+unless @option{-i} or @option{-s} options are specified.)
+@item $
+@cindex Address, last line
+@cindex Last line, selecting
+@cindex Line, selecting last
+This address matches the last line of the last file of input, or
+the last line of each file when the @option{-i} or @option{-s} options
+are specified.
+@item @var{first}~@var{step}
+@cindex GNU extensions, @samp{@var{n}~@var{m}} addresses
+This GNU extension matches every @var{step}th line
+starting with line @var{first}.
+In particular, lines will be selected when there exists
+a non-negative @var{n} such that the current line-number equals
+@var{first} + (@var{n} * @var{step}).
+Thus, one would use @code{1~2} to select the odd-numbered lines and
+@code{0~2} for even-numbered lines;
+to pick every third line starting with the second, @samp{2~3} would be used;
+to pick every fifth line starting with the tenth, use @samp{10~5};
+and @samp{50~0} is just an obscure way of saying @code{50}.
+The following commands demonstrate the step address usage:
+@example
+$ seq 10 | sed -n '0~4p'
+4
+8
+$ seq 10 | sed -n '1~3p'
+1
+4
+7
+10
+@end example
+@end table
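The @samp{2~3} case mentioned in the step address description can be checked in the same way. This is a small sketch relying on the GNU @code{@var{first}~@var{step}} extension; the variable name @code{every_third} is made up for the example.

```shell
# Select every third line, starting with line 2 (GNU extension).
every_third=$(seq 10 | sed -n '2~3p')
printf '%s\n' "$every_third"
```

The selected line numbers are 2, 5 and 8, that is 2 + (n * 3) for n = 0, 1, 2.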
+@node Regexp Addresses
+@section Selecting lines by text matching
+@value{SSED} supports the following regular expression addresses.
+The default regular expression is
+@ref{BRE syntax, , Basic Regular Expression (BRE)}.
+If @option{-E} or @option{-r} options are used, the regular expression should be
+in @ref{ERE syntax, , Extended Regular Expression (ERE)} syntax.
+@xref{BRE vs ERE}.
+@table @code
+@item /@var{regexp}/
+@cindex Address, as a regular expression
+@cindex Line, selecting by regular expression match
+This will select any line which matches the regular expression @var{regexp}.
+If @var{regexp} itself includes any @code{/} characters,
+each must be escaped by a backslash (@code{\}).
+The following command prints lines in @file{/etc/passwd}
+which end with @samp{bash}@footnote{
+There are of course many other ways to do the same,
+e.g.
+@example
+grep 'bash$' /etc/passwd
+awk -F: '$7 == "/bin/bash"' /etc/passwd
+@end example
+}:
+@example
+sed -n '/bash$/p' /etc/passwd
+@end example
+@cindex empty regular expression
+@cindex @value{SSEDEXT}, modifiers and the empty regular expression
+The empty regular expression @samp{//} repeats the last regular
+expression match (the same holds if the empty regular expression is
+passed to the @code{s} command).  Note that modifiers to regular expressions
+are evaluated when the regular expression is compiled, thus it is invalid to
+specify them together with the empty regular expression.
+@item \%@var{regexp}%
+(The @code{%} may be replaced by any other single character.)
+@cindex Slash character, in regular expressions
+This also matches the regular expression @var{regexp},
+but allows one to use a different delimiter than @code{/}.
+This is particularly useful if the @var{regexp} itself contains
+a lot of slashes, since it avoids the tedious escaping of every @code{/}.
+If @var{regexp} itself includes any delimiter characters,
+each must be escaped by a backslash (@code{\}).
+The following commands are equivalent. They print lines
+which start with @samp{/home/alice/documents/}:
+@example
+sed -n '/^\/home\/alice\/documents\//p'
+sed -n '\%^/home/alice/documents/%p'
+sed -n '\;^/home/alice/documents/;p'
+@end example
+@item /@var{regexp}/I
+@itemx \%@var{regexp}%I
+@cindex GNU extensions, @code{I} modifier
+@cindex case insensitive, regular expression
+The @code{I} modifier to regular-expression matching is a GNU
+extension which causes the @var{regexp} to be matched in
+a case-insensitive manner.
+In many other programming languages, a lower case @code{i} is used
+for case-insensitive regular expression matching. However, in @command{sed}
+the @code{i} is used for the insert command (@pxref{insert command}).
+Observe the difference between the following examples.
+In this example, @code{/b/I} is the address: regular expression with @code{I}
+modifier. @code{d} is the delete command:
+@example
+$ printf "%s\n" a b c | sed '/b/Id'
+a
+c
+@end example
+Here, @code{/b/} is the address: a regular expression.
+@code{i} is the insert command.
+@code{d} is the value to insert.
+A line with @samp{d} is then inserted above the matched line:
+@example
+$ printf "%s\n" a b c | sed '/b/id'
+a
+d
+b
+c
+@end example
+@item /@var{regexp}/M
+@itemx \%@var{regexp}%M
+@cindex @value{SSEDEXT}, @code{M} modifier
+The @code{M} modifier to regular-expression matching is a @value{SSED}
+extension which directs @value{SSED} to match the regular expression
+in @cite{multi-line} mode.  The modifier causes @code{^} and @code{$} to
+match respectively (in addition to the normal behavior) the empty string
+after a newline, and the empty string before a newline.  There are
+special character sequences
+@ifclear PERL
+(@code{\`} and @code{\'})
+@end ifclear
+which always match the beginning or the end of the buffer.
+In addition,
+the period character does not match a new-line character in
+multi-line mode.
+@end table
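The @code{M} modifier only matters when the pattern space holds more than one line, for example after an @code{N} command. The following sketch compares the same address with and without the modifier; the variable names @code{with_m} and @code{without_m} are illustrative only.

```shell
# N joins the two input lines, so the pattern space is "a<newline>b".
# With M, ^ and $ also match next to the embedded newline.
with_m=$(printf '%s\n' a b | sed -n 'N; /^b$/Mp')
# Without M, ^ and $ match only at the ends of the pattern space.
without_m=$(printf '%s\n' a b | sed -n 'N; /^b$/p')
printf 'with M: [%s]\nwithout M: [%s]\n' "$with_m" "$without_m"
```

With the modifier the two-line pattern space is printed; without it nothing matches and the output is empty.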
+@cindex regex addresses and pattern space
+@cindex regex addresses and input lines
+Regex addresses operate on the content of the current
+pattern space. If the pattern space is changed (for example with the @code{s///}
+command) the regular expression matching will operate on the changed text.
+In the following example, automatic printing is disabled with
+@option{-n}.  The @code{s/2/X/} command changes lines containing
+@samp{2} to @samp{X}. The command @code{/[0-9]/p} matches
+lines with digits and prints them.
+Because the second line is changed before the @code{/[0-9]/} regex,
+it will not match and will not be printed:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 3 | sed -n 's/2/X/ ; /[0-9]/p'
+1
+3
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@node Range Addresses
+@section Range Addresses
+@cindex Range of lines
+@cindex Several lines, selecting
+An address range can be specified by specifying two addresses
+separated by a comma (@code{,}).  An address range matches lines
+starting from where the first address matches, and continues
+until the second address matches (inclusively):
+@example
+$ seq 10 | sed -n '4,6p'
+4
+5
+6
+@end example
+If the second address is a @var{regexp}, then checking for the
+ending match will start with the line @emph{following} the
+line which matched the first address: a range will always
+span at least two lines (except of course if the input stream
+ends).
+@example
+$ seq 10 | sed -n '4,/[0-9]/p'
+4
+5
+@end example
+If the second address is a @var{number} less than (or equal to)
+the line matching the first address, then only the one line is
+matched:
+@example
+$ seq 10 | sed -n '4,1p'
+4
+@end example
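Both ends of a range may also be regular expressions, which is a common way to extract a delimited block. A minimal sketch, in which the marker lines @samp{START} and @samp{END} and the variable name @code{block} are arbitrary sample data:

```shell
# Print the block delimited by the START and END marker lines
# (the marker names are arbitrary sample data).
block=$(printf '%s\n' a START b END c | sed -n '/START/,/END/p')
printf '%s\n' "$block"
```

The range is inclusive, so both marker lines appear in the output along with the line between them.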
+@anchor{Zero Address Regex Range}
+@cindex Special addressing forms
+@cindex Range with start address of zero
+@cindex Zero, as range start address
+@cindex @var{addr1},+N
+@cindex @var{addr1},~N
+@cindex GNU extensions, special two-address forms
+@cindex GNU extensions, @code{0} address
+@cindex GNU extensions, 0,@var{addr2} addressing
+@cindex GNU extensions, @var{addr1},+@var{N} addressing
+@cindex GNU extensions, @var{addr1},~@var{N} addressing
+@value{SSED} also supports some special two-address forms; all these
+are GNU extensions:
+@table @code
+@item 0,/@var{regexp}/
+A line number of @code{0} can be used in an address specification like
+@code{0,/@var{regexp}/} so that @command{sed} will try to match
+@var{regexp} in the first input line too.  In other words,
+@code{0,/@var{regexp}/} is similar to @code{1,/@var{regexp}/},
+except that if @var{regexp} matches the very first line of input the
+@code{0,/@var{regexp}/} form will consider it to end the range, whereas
+the @code{1,/@var{regexp}/} form will match the beginning of its range and
+hence make the range span up to the @emph{second} occurrence of the
+regular expression.
+The following examples demonstrate the difference between starting
+with address 1 and 0:
+@example
+$ seq 10 | sed -n '1,/[0-9]/p'
+1
+2
+$ seq 10 | sed -n '0,/[0-9]/p'
+1
+@end example
+@item @var{addr1},+@var{N}
+Matches @var{addr1} and the @var{N} lines following @var{addr1}.
+@example
+$ seq 10 | sed -n '6,+2p'
+6
+7
+8
+@end example
+@var{addr1} can be a line number or a regular expression.
+@item @var{addr1},~@var{N}
+Matches @var{addr1} and the lines following @var{addr1}
+until the next line whose input line number is a multiple of @var{N}.
+The following command prints starting at line 6, until the next line which
+is a multiple of 4 (i.e. line 8):
+@example
+$ seq 10 | sed -n '6,~4p'
+6
+7
+8
+@end example
+@var{addr1} can be a line number or a regular expression.
+@end table
+@node Zero Address
+@section Zero Address
+@cindex Zero Address
+As a @value{SSED} extension, the @code{0} address can be used in two cases:
+@enumerate
+@item
+In a regex range address such as @code{0,/@var{regexp}/}
+(@pxref{Zero Address Regex Range}).
+@item
+With the @code{r} command, inserting a file before the first line
+(@pxref{Adding a header to multiple files}).
+@end enumerate
+Note that these are the only places where the @code{0} address makes
+sense; commands which are given the @code{0} address in any
+other way will give an error.
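A frequent use of the @code{0,/@var{regexp}/} form is to change only the first matching line of the input, even when that match is on line 1. The sketch below contrasts it with a range starting at line 1; the sample data and the variable names @code{first_only} and @code{from_one} are assumptions made for illustration.

```shell
# Replace "foo" only on the first matching line of the whole input.
first_only=$(printf '%s\n' foo foo foo | sed '0,/foo/ s/foo/bar/')
# With 1 as the start, the search for the range end begins at line 2,
# so two lines fall inside the range.
from_one=$(printf '%s\n' foo foo foo | sed '1,/foo/ s/foo/bar/')
printf '%s\n' "$first_only" "---" "$from_one"
```

The first command changes one line, while the second changes two, because its range cannot end on the line where it starts.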
+@node sed regular expressions
+@chapter Regular Expressions: selecting text
+@menu
+* Regular Expressions Overview:: Overview of Regular expression in @command{sed}
+* BRE vs ERE::               Basic (BRE) and extended (ERE) regular expression
+                             syntax
+* BRE syntax::               Overview of basic regular expression syntax
+* ERE syntax::               Overview of extended regular expression syntax
+* Character Classes and Bracket Expressions::
+* regexp extensions::        Additional regular expression commands
+* Back-references and Subexpressions:: Back-references and Subexpressions
+* Escapes::                  Specifying special characters
+* Locale Considerations::    Multibyte characters and locale considerations
+@end menu
+@node Regular Expressions Overview
+@section Overview of regular expression in @command{sed}
+@c NOTE: Keep examples in the 'overview' section
+@c neutral in regards to BRE/ERE - to ease understanding.
+To know how to use @command{sed}, people should understand regular
+expressions (@dfn{regexp} for short).  A regular expression
+is a pattern that is matched against a
+subject string from left to right.  Most characters are
+@dfn{ordinary}: they stand for
+themselves in a pattern, and match the corresponding characters.
+Regular expressions in @command{sed} are specified between two
+slashes.
+The following command prints lines containing the string @samp{hello}:
+@example
+sed -n '/hello/p'
+@end example
+The above example is equivalent to this @command{grep} command:
+@example
+grep 'hello'
+@end example
+The power of regular expressions comes from the ability to include
+alternatives and repetitions in the pattern.  These are encoded in the
+pattern by the use of @dfn{special characters}, which do not stand for
+themselves but instead are interpreted in some special way.
+The character @code{^} (caret) in a regular expression matches the
+beginning of the line. The character @code{.} (dot) matches any single
+character. The following @command{sed} command matches and prints
+lines which start with the letter @samp{b}, followed by any single character,
+followed by the letter @samp{d}:
+@example
+$ printf "%s\n" abode bad bed bit bid byte body | sed -n '/^b.d/p'
+bad
+bed
+bid
+body
+@end example
+The following sections explain the meaning and usage of special
+characters in regular expressions.
+@node BRE vs ERE
+@section Basic (BRE) and extended (ERE) regular expression
+Basic and extended regular expressions are two variations on the
+syntax of the specified pattern. Basic Regular Expression (BRE) syntax is the
+default in @command{sed} (and similarly in @command{grep}).
+Use the POSIX-specified @option{-E} option (@option{-r},
+@option{--regexp-extended}) to enable Extended Regular Expression (ERE) syntax.
+In @value{SSED}, the only difference between basic and extended regular
+expressions is in the behavior of a few special characters: @samp{?},
+@samp{+}, parentheses, braces (@samp{@{@}}), and @samp{|}.
+With basic (BRE) syntax, these characters do not have special meaning
+unless prefixed with a backslash (@samp{\}); while with extended (ERE) syntax
+it is reversed: these characters are special unless they are prefixed
+with backslash (@samp{\}).
+@multitable @columnfractions .28 .36 .35
+@headitem Desired pattern
+@tab Basic (BRE) Syntax
+@tab Extended (ERE) Syntax
+@item literal @samp{+} (plus sign)
+@tab
+@exampleindent 0
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ echo 'a+b=c' > foo
+$ sed -n '/a+b/p' foo
+a+b=c
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@tab
+@exampleindent 0
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ echo 'a+b=c' > foo
+$ sed -E -n '/a\+b/p' foo
+a+b=c
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@item One or more @samp{a} characters followed by @samp{b}
+(plus sign as special meta-character)
+@tab
+@exampleindent 0
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ echo aab > foo
+$ sed -n '/a\+b/p' foo
+aab
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@tab
+@exampleindent 0
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ echo aab > foo
+$ sed -E -n '/a+b/p' foo
+aab
+@end example
+@codequotebacktick off
+@codequoteundirected off
+@end multitable
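The same reversal applies to grouping and alternation. As a hedged sketch (the sample text and the variable names @code{bre} and @code{ere} are made up), the following two commands are intended to be equivalent in @value{SSED}:

```shell
# Bracket every occurrence of "cat" or "dog".
# BRE: grouping and alternation need backslashes (\| is a GNU extension).
bre=$(echo 'cat dog bird' | sed 's/\(cat\|dog\)/[\1]/g')
# ERE (-E): the same operators are written without backslashes.
ere=$(echo 'cat dog bird' | sed -E 's/(cat|dog)/[\1]/g')
printf '%s\n' "$bre" "$ere"
```

In both syntaxes the back-reference in the replacement is written as @code{\1}; only the pattern side differs.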
+@node BRE syntax
+@section Overview of basic regular expression syntax
+Here is a brief description
+of regular expression syntax as used in @command{sed}.
+@table @code
+@item @var{char}
+A single ordinary character matches itself.
+@item *
+@cindex GNU extensions, to basic regular expressions
+Matches a sequence of zero or more instances of matches for the
+preceding regular expression, which must be an ordinary character, a
+special character preceded by @code{\}, a @code{.}, a grouped regexp
+(see below), or a bracket expression.  As a GNU extension, a
+postfixed regular expression can also be followed by @code{*}; for
+example, @code{a**} is equivalent to @code{a*}.  POSIX
+1003.1-2001 says that @code{*} stands for itself when it appears at
+the start of a regular expression or subexpression, but many
+non-GNU implementations do not support this and portable
+scripts should instead use @code{\*} in these contexts.
+@item .
+Matches any character, including newline.
+@item ^
+Matches the null string at beginning of the pattern space, i.e. what
+appears after the circumflex must appear at the beginning of the
+pattern space.
+In most scripts, pattern space is initialized to the content of each
+line (@pxref{Execution Cycle, , How @code{sed} works}).  So, it is a
+useful simplification to think of @code{^#include} as matching only
+lines where @samp{#include} is the first thing on the line---if there is
+any preceding space, for example, the match fails.  This simplification is
+valid as long as the original content of pattern space is not modified,
+for example with an @code{s} command.
+@code{^} acts as a special character only at the beginning of the
+regular expression or subexpression (that is, after @code{\(} or
+@code{\|}).  Portable scripts should avoid @code{^} at the beginning of
+a subexpression, though, as POSIX allows implementations that
+treat @code{^} as an ordinary character in that context.
+@item $
+It is the same as @code{^}, but refers to end of pattern space.
+@code{$} also acts as a special character only at the end
+of the regular expression or subexpression (that is, before @code{\)}
+or @code{\|}), and its use at the end of a subexpression is not
+portable.
+@item [@var{list}]
+@itemx [^@var{list}]
+Matches any single character in @var{list}: for example,
+@code{[aeiou]} matches all vowels.  A list may include
+sequences like @code{@var{char1}-@var{char2}}, which
+matches any character between (inclusive) @var{char1}
+and @var{char2}.
+@xref{Character Classes and Bracket Expressions}.
+@item \+
+@cindex GNU extensions, to basic regular expressions
+As @code{*}, but matches one or more.  It is a GNU extension.
+@item \?
+@cindex GNU extensions, to basic regular expressions
+As @code{*}, but only matches zero or one.  It is a GNU extension.
+@item \@{@var{i}\@}
+As @code{*}, but matches exactly @var{i} sequences (@var{i} is a
+decimal integer; for portability, keep it between 0 and 255
+inclusive).
+@item \@{@var{i},@var{j}\@}
+Matches between @var{i} and @var{j}, inclusive, sequences.
+@item \@{@var{i},\@}
+Matches more than or equal to @var{i} sequences.
+@item \(@var{regexp}\)
+Groups the inner @var{regexp} as a whole; this is used to:
+@itemize @bullet
+@item
+@cindex GNU extensions, to basic regular expressions
+Apply postfix operators, like @code{\(abcd\)*}:
+this will search for zero or more whole sequences
+of @samp{abcd}, while @code{abcd*} would search
+for @samp{abc} followed by zero or more occurrences
+of @samp{d}.  Note that support for @code{\(abcd\)*} is
+required by POSIX 1003.1-2001, but many non-GNU
+implementations do not support it and hence it is not universally
+portable.
+@item
+Use back references (see below).
+@end itemize
+@item @var{regexp1}\|@var{regexp2}
+@cindex GNU extensions, to basic regular expressions
+Matches either @var{regexp1} or @var{regexp2}.  Use
+parentheses to use complex alternative regular expressions.
+The matching process tries each alternative in turn, from
+left to right, and the first one that succeeds is used.
+It is a GNU extension.
+@item @var{regexp1}@var{regexp2}
+Matches the concatenation of @var{regexp1} and @var{regexp2}.
+Concatenation binds more tightly than @code{\|}, @code{^}, and
+@code{$}, but less tightly than the other regular expression
+operators.
+@item \@var{digit}
+Matches the @var{digit}-th @code{\(@dots{}\)} parenthesized
+subexpression in the regular expression.  This is called a @dfn{back
+reference}.  Subexpressions are implicitly numbered by counting
+occurrences of @code{\(} left-to-right.
+@item \n
+Matches the newline character.
+@item \@var{char}
+Matches @var{char}, where @var{char} is one of @code{$},
+@code{*}, @code{.}, @code{[}, @code{\}, or @code{^}.
+Note that the only C-like
+backslash sequences that you can portably assume to be
+interpreted are @code{\n} and @code{\\}; in particular
+@code{\t} is not portable, and matches a @samp{t} under most
+implementations of @command{sed}, rather than a tab character.
+@end table
+@cindex Greedy regular expression matching
+Note that the regular expression matcher is greedy, i.e., matches
+are attempted from left to right and, if two or more matches are
+possible starting at the same character, it selects the longest.
+@noindent
+Examples:
+@table @samp
+@item abcdef
+Matches @samp{abcdef}.
+@item a*b
+Matches zero or more @samp{a}s followed by a single
+@samp{b}.  For example, @samp{b} or @samp{aaaaab}.
+@item a\?b
+Matches @samp{b} or @samp{ab}.
+@item a\+b\+
+Matches one or more @samp{a}s followed by one or more
+@samp{b}s: @samp{ab} is the shortest possible match, but
+other examples are @samp{aaaab} or @samp{abbbbb} or
+@samp{aaaaaabbbbbbb}.
+@item .*
+@itemx .\+
+These two both match all the characters in a string;
+however, the first matches every string (including the empty
+string), while the second matches only strings containing
+at least one character.
+@item ^main.*(.*)
+This matches a string starting with @samp{main},
+followed by an opening and closing
+parenthesis.  The @samp{n}, @samp{(} and @samp{)} need not
+be adjacent.
+@item ^#
+This matches a string beginning with @samp{#}.
+@item \\$
+This matches a string ending with a single backslash.  The
+regexp contains two backslashes for escaping.
+@item \$
+Instead, this matches a string consisting of a single dollar sign,
+because it is escaped.
+@item [a-zA-Z0-9]
+In the C locale, this matches any ASCII letters or digits.
+@item [^ @kbd{@key{TAB}}]\+
+(Here @kbd{@key{TAB}} stands for a single tab character.)
+This matches a string of one or more
+characters, none of which is a space or a tab.
+Usually this means a word.
+@item ^\(.*\)\n\1$
+This matches a string consisting of two equal substrings separated by
+a newline.
+@item .\@{9\@}A$
+This matches nine characters followed by an @samp{A} at the end of a line.
+@item ^.\@{15\@}A
+This matches the start of a string that contains 16 characters,
+the last of which is an @samp{A}.
+@end table
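+The greedy rule and the interval operators described above can be
+checked with a short shell sketch.  The variable names @code{greedy}
+and @code{interval} are illustrative only:

```shell
# Leftmost-longest matching: '.*' extends to the last 'X' it can reach,
# so the match is 'aXbX' rather than 'aX'.
greedy=$(echo 'aXbXc' | sed 's/a.*X/-/')
echo "$greedy"      # prints: -c

# Interval expression: 'a\{2,3\}' consumes three 'a's when three are available.
interval=$(echo 'aaaa' | sed 's/a\{2,3\}/-/')
echo "$interval"    # prints: -a
```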
+@node ERE syntax
+@section Overview of extended regular expression syntax
+@cindex Extended regular expressions, syntax
+The only difference between basic and extended regular expressions is in
+the behavior of a few characters: @samp{?}, @samp{+}, parentheses,
+braces (@samp{@{@}}), and @samp{|}.  While basic regular expressions
+require these to be escaped if you want them to behave as special
+characters, when using extended regular expressions you must escape
+them if you want them @emph{to match a literal character}.  @samp{|}
+is special here because @samp{\|} is a GNU extension -- standard
+basic regular expressions do not provide its functionality.
+@noindent
+Examples:
+@table @code
+@item abc?
+becomes @samp{abc\?} when using extended regular expressions.  It matches
+the literal string @samp{abc?}.
+@item c\+
+becomes @samp{c+} when using extended regular expressions.  It matches
+one or more @samp{c}s.
+@item a\@{3,\@}
+becomes @samp{a@{3,@}} when using extended regular expressions.  It matches
+three or more @samp{a}s.
+@item \(abc\)\@{2,3\@}
+becomes @samp{(abc)@{2,3@}} when using extended regular expressions.  It
+matches either @samp{abcabc} or @samp{abcabcabc}.
+@item \(abc*\)\1
+becomes @samp{(abc*)\1} when using extended regular expressions.
+Backreferences must still be escaped when using extended regular
+expressions.
+@item a\|b
+becomes @samp{a|b} when using extended regular expressions.  It matches
+@samp{a} or @samp{b}.
+@end table
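+As a rough check of the table above, the following sketch runs the
+same match once in basic and once in extended syntax.  It assumes a
+@command{sed} that accepts @option{-E}; the variable names are
+illustrative only:

```shell
input='abcabcabc'

# Grouping and intervals: escaped in BRE, bare in ERE.
bre=$(echo "$input" | sed 's/\(abc\)\{2,3\}/X/')
ere=$(echo "$input" | sed -E 's/(abc){2,3}/X/')
echo "$bre $ere"            # prints: X X

# A literal '+': bare in BRE, escaped in ERE.
bre_plus=$(echo 'a+b' | sed 's/a+b/X/')
ere_plus=$(echo 'a+b' | sed -E 's/a\+b/X/')
echo "$bre_plus $ere_plus"  # prints: X X
```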
+@node Character Classes and Bracket Expressions
+@section Character Classes and Bracket Expressions
+@c The 'character class' section is shamelessly copied from grep's manual.
+@cindex bracket expression
+@cindex character class
+A @dfn{bracket expression} is a list of characters enclosed by @samp{[} and
+@samp{]}.
+It matches any single character in that list;
+if the first character of the list is the caret @samp{^},
+then it matches any character @strong{not} in the list.
+For example, the following command replaces the strings
+@samp{gray} or @samp{grey} with @samp{blue}:
+@example
+sed  's/gr[ae]y/blue/'
+@end example
+@c TODO: fix 'ref' to look good in both HTML and PDF
+Bracket expressions can be used in both
+@ref{BRE syntax,,basic} and @ref{ERE syntax,,extended}
+regular expressions (that is, with or without the @option{-E}/@option{-r}
+options).
+@cindex range expression
+Within a bracket expression, a @dfn{range expression} consists of two
+characters separated by a hyphen.
+It matches any single character that
+sorts between the two characters, inclusive.
+In the default C locale, the sorting sequence is the native character
+order; for example, @samp{[a-d]} is equivalent to @samp{[abcd]}.
+Finally, certain named classes of characters are predefined within
+bracket expressions, as follows.
+These named classes must be used @emph{inside} brackets
+themselves. Correct usage:
+@example
+$ echo 1 | sed 's/[[:digit:]]/X/'
+X
+@end example
+Incorrect usage is rejected by newer @command{sed} versions.
+Older versions accepted it but treated it as a single bracket expression
+(which is equivalent to @samp{[dgit:]},
+that is, only the characters @var{d/g/i/t/:}):
+@example
+# current GNU sed versions - incorrect usage rejected
+$ echo 1 | sed 's/[:digit:]/X/'
+sed: character class syntax is [[:space:]], not [:space:]
+# older GNU sed versions
+$ echo 1 | sed 's/[:digit:]/X/'
+1
+@end example
+@cindex classes of characters
+@cindex character classes
+@cindex named character classes
+@table @samp
+@item [:alnum:]
+@opindex alnum @r{character class}
+@cindex alphanumeric characters
+Alphanumeric characters:
+@samp{[:alpha:]} and @samp{[:digit:]}; in the @samp{C} locale and ASCII
+character encoding, this is the same as @samp{[0-9A-Za-z]}.
+@item [:alpha:]
+@opindex alpha @r{character class}
+@cindex alphabetic characters
+Alphabetic characters:
+@samp{[:lower:]} and @samp{[:upper:]}; in the @samp{C} locale and ASCII
+character encoding, this is the same as @samp{[A-Za-z]}.
+@item [:blank:]
+@opindex blank @r{character class}
+@cindex blank characters
+Blank characters:
+space and tab.
+@item [:cntrl:]
+@opindex cntrl @r{character class}
+@cindex control characters
+Control characters.
+In ASCII, these characters have octal codes 000
+through 037, and 177 (DEL).
+In other character sets, these are
+the equivalent characters, if any.
+@item [:digit:]
+@opindex digit @r{character class}
+@cindex digit characters
+@cindex numeric characters
+Digits: @code{0 1 2 3 4 5 6 7 8 9}.
+@item [:graph:]
+@opindex graph @r{character class}
+@cindex graphic characters
+Graphical characters:
+@samp{[:alnum:]} and @samp{[:punct:]}.
+@item [:lower:]
+@opindex lower @r{character class}
+@cindex lower-case letters
+Lower-case letters; in the @samp{C} locale and ASCII character
+encoding, this is
+@code{a b c d e f g h i j k l m n o p q r s t u v w x y z}.
+@item [:print:]
+@opindex print @r{character class}
+@cindex printable characters
+Printable characters:
+@samp{[:alnum:]}, @samp{[:punct:]}, and space.
+@item [:punct:]
+@opindex punct @r{character class}
+@cindex punctuation characters
+Punctuation characters; in the @samp{C} locale and ASCII character
+encoding, this is
+@code{!@: " # $ % & ' ( ) * + , - .@: / : ; < = > ?@: @@ [ \ ] ^ _ ` @{ | @} ~}.
+@item [:space:]
+@opindex space @r{character class}
+@cindex space characters
+@cindex whitespace characters
+Space characters: in the @samp{C} locale, this is
+tab, newline, vertical tab, form feed, carriage return, and space.
+@item [:upper:]
+@opindex upper @r{character class}
+@cindex upper-case letters
+Upper-case letters: in the @samp{C} locale and ASCII character
+encoding, this is
+@code{A B C D E F G H I J K L M N O P Q R S T U V W X Y Z}.
+@item [:xdigit:]
+@opindex xdigit @r{character class}
+@cindex xdigit class
+@cindex hexadecimal digits
+Hexadecimal digits:
+@code{0 1 2 3 4 5 6 7 8 9 A B C D E F a b c d e f}.
+@end table
+Note that the brackets in these class names are
+part of the symbolic names, and must be included in addition to
+the brackets delimiting the bracket expression.
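+A minimal sketch of named classes used inside bracket expressions,
+including two classes combined in one list.  The variable names are
+illustrative, and the expected results assume ASCII input:

```shell
# One or more digits collapsed to 'N'.
digits=$(echo 'a1 b22' | sed 's/[[:digit:]][[:digit:]]*/N/g')
echo "$digits"   # prints: aN bN

# Two named classes inside a single bracket expression.
mixed=$(echo 'Ab1-cd' | sed 's/[[:upper:][:digit:]]/_/g')
echo "$mixed"    # prints: _b_-cd
```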
+Most meta-characters lose their special meaning inside bracket expressions:
+@table @samp
+@item ]
+ends the bracket expression if it's not the first list item.
+So, if you want to make the @samp{]} character a list item,
+you must put it first.
+@item -
+represents the range if it's not first or last in a list or the ending point
+of a range.
+@item ^
+represents the characters not in the list.
+If you want to make the @samp{^}
+character a list item, place it anywhere but first.
+@end table
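+The placement rules above can be combined in a single list.  In this
+sketch the closing bracket comes first, the circumflex is not first,
+and the hyphen is last, so all three are taken literally (the variable
+name is illustrative):

```shell
# ']' first, '^' not first, '-' last: each one is an ordinary list item.
literal=$(echo 'a]b^c-d' | sed 's/[]^-]/_/g')
echo "$literal"   # prints: a_b_c_d
```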
+TODO: incorporate this paragraph (copied verbatim from BRE section).
+@cindex @code{POSIXLY_CORRECT} behavior, bracket expressions
+The characters @code{$}, @code{*}, @code{.}, @code{[}, and @code{\}
+are normally not special within @var{list}.  For example, @code{[\*]}
+matches either @samp{\} or @samp{*}, because the @code{\} is not
+special here.  However, strings like @code{[.ch.]}, @code{[=a=]}, and
+@code{[:space:]} are special within @var{list} and represent collating
+symbols, equivalence classes, and character classes, respectively, and
+@code{[} is therefore special within @var{list} when it is followed by
+@code{.}, @code{=}, or @code{:}.  Also, when not in
+@env{POSIXLY_CORRECT} mode, special escapes like @code{\n} and
+@code{\t} are recognized within @var{list}.  @xref{Escapes}.
+@c ********
+@c TODO: improve explanation about collation classes and equivalence classes
+@c       perhaps dedicate a section to Locales ??
+@table @samp
+@item [.
+represents the open collating symbol.
+@item .]
+represents the close collating symbol.
+@item [=
+represents the open equivalence class.
+@item =]
+represents the close equivalence class.
+@item [:
+represents the open character class symbol, and should be followed by a
+valid character class name.
+@item :]
+represents the close character class symbol.
+@end table
+@node regexp extensions
+@section regular expression extensions
+The following sequences have special meaning inside regular expressions
+(used in @ref{Regexp Addresses,,addresses} and the @code{s} command).
+These can be used in both
+@ref{BRE syntax,,basic} and @ref{ERE syntax,,extended}
+regular expressions (that is, with or without the @option{-E}/@option{-r}
+options).
+@table @code
+@item \w
+Matches any ``word'' character.  A ``word'' character is any
+letter or digit or the underscore character.
+@example
+$ echo "abc %-= def." | sed 's/\w/X/g'
+XXX %-= XXX.
+@end example
+@item \W
+Matches any ``non-word'' character.
+@example
+$ echo "abc %-= def." | sed 's/\W/X/g'
+abcXXXXXdefX
+@end example
+@item \b
+Matches a word boundary; that is it matches if the character
+to the left is a ``word'' character and the character to the
+right is a ``non-word'' character, or vice-versa.
+@example
+$ echo "abc %-= def." | sed 's/\b/X/g'
+XabcX %-= XdefX.
+@end example
+@item \B
+Matches everywhere but on a word boundary; that is it matches
+if the character to the left and the character to the right
+are either both ``word'' characters or both ``non-word''
+characters.
+@example
+$ echo "abc %-= def." | sed 's/\B/X/g'
+aXbXc X%X-X=X dXeXf.X
+@end example
+@item \s
+Matches whitespace characters (spaces and tabs).
+Newlines embedded in the pattern/hold spaces will also match:
+@example
+$ echo "abc %-= def." | sed 's/\s/X/g'
+abcX%-=Xdef.
+@end example
+@item \S
+Matches non-whitespace characters.
+@example
+$ echo "abc %-= def." | sed 's/\S/X/g'
+XXX XXX XXXX
+@end example
+@item \<
+Matches the beginning of a word.
+@example
+$ echo "abc %-= def." | sed 's/\</X/g'
+Xabc %-= Xdef.
+@end example
+@item \>
+Matches the end of a word.
+@example
+$ echo "abc %-= def." | sed 's/\>/X/g'
+abcX %-= defX.
+@end example
+@item \`
+Matches only at the start of pattern space.  This is different
+from @code{^} in multi-line mode.
+Compare the following two examples:
+@example
+$ printf "a\nb\nc\n" | sed 'N;N;s/^/X/gm'
+Xa
+Xb
+Xc
+$ printf "a\nb\nc\n" | sed 'N;N;s/\`/X/gm'
+Xa
+b
+c
+@end example
+@item \'
+Matches only at the end of pattern space.  This is different
+from @code{$} in multi-line mode.
+@end table
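+The difference between the dollar sign and the end-of-buffer escape
+in multi-line mode can be sketched in the same way as the example for
+the start-of-buffer escape.  This assumes GNU @command{sed}, since both
+the escape and the @code{M} flag are GNU extensions; the variable names
+are illustrative:

```shell
# In multi-line mode, '$' matches at the end of every line in the buffer.
each_line=$(printf 'a\nb\nc\n' | sed 'N;N;s/$/X/gm')
printf '%s\n' "$each_line"

# \' matches only at the very end of the pattern space.
# Inside double quotes the shell turns \\ into a single backslash,
# so sed receives the script N;N;s/\'/X/gm
buffer_end=$(printf 'a\nb\nc\n' | sed "N;N;s/\\'/X/gm")
printf '%s\n' "$buffer_end"
```

+The first command appends @samp{X} to all three lines; the second
+appends it only to the last one.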
+@node Back-references and Subexpressions
+@section Back-references and Subexpressions
+@cindex subexpression
+@cindex back-reference
+@dfn{back-references} are regular expression commands which refer to a
+previous part of the matched regular expression.  Back-references are
+specified with backslash and a single digit (e.g. @samp{\1}).  The
+part of the regular expression they refer to is called a
+@dfn{subexpression}, and is designated with parentheses.
+Back-references and subexpressions are used in two cases: in the
+regular expression search pattern, and in the @var{replacement} part
+of the @command{s} command (@pxref{Regexp Addresses,,Regular
+Expression Addresses} and @ref{The "s" Command}).
+In a regular expression pattern, back-references are used to match
+the same content as a previously matched subexpression.  In the
+following example, the subexpression is @samp{.} - any single
+character (being surrounded by parentheses makes it a
+subexpression). The back-reference @samp{\1} asks to match the same
+content (same character) as the sub-expression.
+The command below matches words starting with any character,
+followed by the letter @samp{o}, followed by the same character as the
+first.
+@example
+$ sed -E -n '/^(.)o\1$/p' /usr/share/dict/words
+bob
+mom
+non
+pop
+sos
+tot
+wow
+@end example
+Multiple subexpressions are automatically numbered from
+left-to-right. This command searches for 6-letter
+palindromes (the first three letters are 3 subexpressions,
+followed by 3 back-references in reverse order):
+@example
+$ sed -E -n '/^(.)(.)(.)\3\2\1$/p' /usr/share/dict/words
+redder
+@end example
+In the @command{s} command, back-references can be
+used in the @var{replacement} part to refer back to subexpressions in
+the @var{regexp} part.
+The following example uses two subexpressions in the regular
+expression to match two space-separated words. The back-references in
+the @var{replacement} part print the words in a different order:
+@example
+$ echo "James Bond" | sed -E 's/(.*) (.*)/The name is \2, \1 \2./'
+The name is Bond, James Bond.
+@end example
+When used with alternation, if the group does not participate in the
+match then the back-reference makes the whole match fail.  For
+example, @samp{a(.)|b\1} will not match @samp{ba}.  When multiple
+regular expressions are given with @option{-e} or from a file
+(@samp{-f @var{file}}), back-references are local to each expression.
 @node Escapes
 @section @acronym{GNU} Extensions for Escapes in Regular Expressions
 @cindex @acronym{GNU} extensions, special escapes
+@section Escape Sequences - specifying special characters
+@cindex GNU extensions, special escapes
 Until this chapter, we have only encountered escapes of the form
 @samp{\^}, which tell @command{sed} not to interpret the circumflex
 …
 @cindex @code{POSIXLY_CORRECT} behavior, escapes
 This chapter introduces another kind of escape@footnote{All
 the escapes introduced here are @acronym{GNU}
+the escapes introduced here are GNU
 extensions, with the exception of @code{\n}.  In basic regular
 expression mode, setting @code{POSIXLY_CORRECT} disables them inside
 …
 @item \o@var{xxx}
-@ifset PERL
-@item \@var{xxx}
-@end ifset
 Produces or matches a character whose octal @sc{ascii} value is @var{xxx}.
-@ifset PERL
-The syntax without the @code{o} is active in Perl mode, while the one
-with the @code{o} is active in the normal or extended @sc{posix} regular
-expression modes.
-@end ifset
 @item \x@var{xx}
 …
 the existing ``word boundary'' meaning.
+Other escapes match a particular character class and are valid only in
+regular expressions:
+@subsection Escaping Precedence
+@value{SSED} processes escape sequences @emph{before} passing
+the text on to the regular-expression matching of the @command{s///} command
+and address matching. Thus the following two commands are equivalent
+(@samp{0x5e} is the hexadecimal @sc{ascii} value of the character @samp{^}):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo 'a^c' | sed 's/^/b/'
+ba^c
+$ echo 'a^c' | sed 's/\x5e/b/'
+ba^c
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+As are the following (@samp{0x5b},@samp{0x5d} are the hexadecimal
+@sc{ascii} values of @samp{[},@samp{]}, respectively):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo abc | sed 's/[a]/x/'
+xbc
+$ echo abc | sed 's/\x5ba\x5d/x/'
+xbc
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+However it is recommended to avoid such special characters
+due to unexpected edge-cases. For example, the following
+are not equivalent:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ echo 'a^c' | sed 's/\^/b/'
+abc
+$ echo 'a^c' | sed 's/\\\x5e/b/'
+a^c
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@c also: this fails in different places:
+@c   $ sed 's/[//'
+@c   sed: -e expression #1, char 5: unterminated `s' command
+@c   $ sed 's/\x5b//'
+@c   sed: -e expression #1, char 8: Invalid regular expression
+@c
+@c which is OK but confusing to explain why (the first
+@c fails in compile.c:snarf_char_class while the second
+@c is passed to the regex engine and then fails).
+@node Locale Considerations
+@section Multibyte characters and Locale Considerations
+@value{SSED} processes valid multibyte characters in multibyte locales
+(e.g. @code{UTF-8}).  @footnote{Some regexp edge-cases depend on the
+operating system and libc implementation. The examples shown are known
+to work as expected on GNU/Linux systems using glibc.}
+@noindent The following example uses the Greek letter Capital Sigma
+(@value{ucsigma},
+Unicode code point @code{0x03A3}). In a @code{UTF-8} locale,
+@command{sed} correctly processes the Sigma as one character despite
+it being 2 octets (bytes):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ locale | grep LANG
+LANG=en_US.UTF-8
+$ printf 'a\u03A3b'
+a@value{ucsigma}b
+$ printf 'a\u03A3b' | sed 's/./X/g'
+XXX
+$ printf 'a\u03A3b' | od -tx1 -An
+ 61 ce a3 62
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent
+To force @command{sed} to process octets separately, use the @code{C} locale
+(also known as the @code{POSIX} locale):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ printf 'a\u03A3b' | LC_ALL=C sed 's/./X/g'
+XXXX
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@subsection Invalid multibyte characters
+@command{sed}'s regular expressions @emph{do not} match
+invalid multibyte sequences in a multibyte locale.
+@noindent
+In the following examples, the octet @code{0xCE} is
+an incomplete multibyte character (shown here as @value{unicodeFFFD}).
+The regular expression @samp{.} does not match it:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf 'a\xCEb\n'
+a@value{unicodeFFFD}b
+$ printf 'a\xCEb\n' | sed 's/./X/g'
+X@value{unicodeFFFD}X
+$ printf 'a\xCEc\n' | sed 's/./X/g' | od -tx1c -An
+  58  ce  58  0a
+   X      X   \n
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent Similarly, the 'catch-all' regular expression @samp{.*} does not
+match the entire line:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf 'a\xCEc\n' | sed 's/.*//' | od -tx1c -An
+  ce  63  0a
+       c  \n
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent
+@value{SSED} offers the special @command{z} command to clear the
+current pattern space regardless of invalid multibyte characters
+(i.e. it works like @code{s/.*//} but also removes invalid multibyte
+characters):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf 'a\xCEc\n' | sed 'z' | od -tx1c -An
+  0a
+  \n
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent Alternatively, force the @code{C} locale to process
+each octet separately (every octet is a valid character in the @code{C}
+locale):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf 'a\xCEc\n' | LC_ALL=C sed 's/.*//' | od -tx1c -An
+  0a
+  \n
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@command{sed}'s inability to process invalid multibyte characters
+can be used to detect such invalid sequences in a file.
+In the following examples, the @code{\xCE\xCE} is an invalid
+multibyte sequence, while @code{\xCE\xA3} is a valid multibyte sequence
+(of the Greek Sigma character).
+@noindent
+The following @command{sed} program removes all valid
+characters using @code{s/.//g}.  Any content left in the pattern space
+(the invalid characters) is added to the hold space using the
+@code{H} command. On the last line (@code{$}), the hold space is retrieved
+(@code{x}), newlines are removed (@code{s/\n//g}), and any remaining
+octets are printed unambiguously (@code{l}).  Thus, any invalid
+multibyte sequences are printed as octal values:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf 'ab\nc\n\xCE\xCEde\n\xCE\xA3f\n' > invalid.txt
+$ cat invalid.txt
+ab
+c
+@value{unicodeFFFD}@value{unicodeFFFD}de
+@value{ucsigma}f
+$ sed -n 's/.//g ; H ; $@{x;s/\n//g;l@}' invalid.txt
+\316\316$
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent With a few more commands, @command{sed} can print
+the exact line number corresponding to each invalid character (line 3).
+These characters can then be removed by forcing the @code{C} locale
+and using octal escape sequences:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ sed -n 's/.//g;=;l' invalid.txt | paste - -  | awk '$2!="$"'
+3       \316\316$
+$ LC_ALL=C sed '3s/\o316\o316//' invalid.txt > fixed.txt
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@subsection Upper/Lower case conversion
+@value{SSED}'s substitute command (@code{s}) supports upper/lower
+case conversions using @code{\U},@code{\L} codes.
+These conversions support multibyte characters:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ printf 'ABC\u03a3\n'
+ABC@value{ucsigma}
+$ printf 'ABC\u03a3\n' | sed 's/.*/\L&/'
+abc@value{lcsigma}
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@noindent
+@xref{The "s" Command}.
+@subsection Multibyte regexp character classes
+@c TODO: fix following paragraphs (copied verbatim from 'bracket
+@c expression' section).
+In other locales, the sorting sequence is not specified, and
+@samp{[a-d]} might be equivalent to @samp{[abcd]} or to
+@samp{[aBbCcDd]}, or it might fail to match any character, or the set of
+characters that it matches might even be erratic.
+To obtain the traditional interpretation
+of bracket expressions, you can use the @samp{C} locale by setting the
+@env{LC_ALL} environment variable to the value @samp{C}.
+@example
+# TODO: is there any real-world system/locale where 'A'
+#       is replaced by '-' ?
+$ echo A | sed 's/[a-z]/-/'
+A
+@end example
+Their interpretation depends on the @env{LC_CTYPE} locale;
+for example, @samp{[[:alnum:]]} means the character class of numbers and letters
+in the current locale.
+TODO: show example of collation
+@codequoteundirected on
+@codequotebacktick on
+@example
+# TODO: this works on glibc systems, not on musl-libc/freebsd/macosx.
+$ printf 'cliché\n' | LC_ALL=fr_FR.utf8 sed 's/[[=e=]]/X/g'
+clichX
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@node advanced sed
+@chapter Advanced @command{sed}: cycles and buffers
+@menu
+* Execution Cycle::          How @command{sed} works
+* Hold and Pattern Buffers::
+* Multiline techniques::     Using D,G,H,N,P to process multiple lines
+* Branching and flow control::
+@end menu
+@node Execution Cycle
+@section How @command{sed} Works
+@cindex Buffer spaces, pattern and hold
+@cindex Spaces, pattern and hold
+@cindex Pattern space, definition
+@cindex Hold space, definition
+@command{sed} maintains two data buffers: the active @emph{pattern} space,
+and the auxiliary @emph{hold} space. Both are initially empty.
+@command{sed} operates by performing the following cycle on each
+line of input: first, @command{sed} reads one line from the input
+stream, removes any trailing newline, and places it in the pattern space.
+Then commands are executed; each command can have an address associated
+to it: addresses are a kind of condition code, and a command is only
+executed if the condition is verified before the command is to be
+executed.
+When the end of the script is reached, unless the @option{-n} option
+is in use, the contents of pattern space are printed out to the output
+stream, adding back the trailing newline if it was removed.@footnote{Actually,
+if @command{sed} prints a line without the terminating newline, it will
+nevertheless print the missing newline as soon as more text is sent to
+the same output stream, which gives the ``least expected surprise''
+even though it does not make commands like @samp{sed -n p} exactly
+identical to @command{cat}.} Then the next cycle starts for the next
+input line.
+Unless special commands (like @samp{D}) are used, the pattern space is
+deleted between two cycles. The hold space, on the other hand, keeps
+its data between cycles (see commands @samp{h}, @samp{H}, @samp{x},
+@samp{g}, @samp{G} to move data between both buffers).
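+Because the hold space survives from one cycle to the next, it can
+accumulate input across lines.  The following sketch shows two small
+cases; it assumes the @command{seq} utility is available, and the
+variable names are illustrative:

```shell
# Append every line to the hold space; on the last line, swap it back,
# turn the embedded newlines into commas, and print.
# The leading comma comes from the newline that H adds before the first line.
joined=$(seq 3 | sed -n 'H;${x;s/\n/,/g;p}')
echo "$joined"     # prints: ,1,2,3

# Reverse the input: append the saved text after each new line (G),
# save the result (h), and print only once the last line is reached.
reversed=$(seq 3 | sed -n '1!G;h;$p')
printf '%s\n' "$reversed"
```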
+@node Hold and Pattern Buffers
+@section Hold and Pattern Buffers
+TODO
+@node Multiline techniques
+@section Multiline techniques - using D,G,H,N,P to process multiple lines
+Multiple lines can be processed as one buffer using the
+@code{D},@code{G},@code{H},@code{N},@code{P} commands. They are similar to
+their lowercase counterparts (@code{d},@code{g},
+@code{h},@code{n},@code{p}), except that these commands append or
+subtract data while respecting embedded newlines - allowing adding and
+removing lines from the pattern and hold spaces.
+They operate as follows:
 @table @code
+@item \w
+Matches any ``word'' character.  A ``word'' character is any
+letter or digit or the underscore character.
+@item \W
+Matches any ``non-word'' character.
+@item \b
+Matches a word boundary; that is it matches if the character
+to the left is a ``word'' character and the character to the
+right is a ``non-word'' character, or vice-versa.
+@item \B
+Matches everywhere but on a word boundary; that is it matches
+if the character to the left and the character to the right
+are either both ``word'' characters or both ``non-word''
+characters.
+@item \`
+Matches only at the start of pattern space.  This is different
+from @code{^} in multi-line mode.
+@item \'
+Matches only at the end of pattern space.  This is different
+from @code{$} in multi-line mode.
+@ifset PERL
+@item \G
+Match only at the start of pattern space or, when doing a global
+substitution using the @code{s///g} command and option, at
+the end-of-match position of the prior match.  For example,
+@samp{s/\Ga/Z/g} will change an initial run of @code{a}s to
+a run of @code{Z}s
+@end ifset
+@item D
+@emph{deletes} line from the pattern space until the first newline,
+and restarts the cycle.
+@item G
+@emph{appends} line from the hold space to the pattern space, with a
+newline before it.
+@item H
+@emph{appends} line from the pattern space to the hold space, with a
+newline before it.
+@item N
+@emph{appends} line from the input file to the pattern space.
+@item P
+@emph{prints} line from the pattern space until the first newline.
 @end table
+The following example illustrates the operation of @code{N} and
+@code{D} commands:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 6 | sed -n 'N;l;D'
+1\n2$
+2\n3$
+3\n4$
+4\n5$
+5\n6$
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@enumerate
+@item
+@command{sed} starts by reading the first line into the pattern space
+(i.e. @samp{1}).
+@item
+At the beginning of every cycle, the @code{N}
+command appends a newline and the next line to the pattern space
+(i.e. @samp{1}, @samp{\n}, @samp{2} in the first cycle).
+@item
+The @code{l} command prints the content of the pattern space
+unambiguously.
+@item
+The @code{D} command then removes the content of pattern
+space up to the first newline (leaving @samp{2} at the end of
+the first cycle).
+@item
+At the next cycle the @code{N} command appends a
+newline and the next input line to the pattern space
+(e.g. @samp{2}, @samp{\n}, @samp{3}).
+@end enumerate
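+When the cycle is not restarted with @code{D}, each @code{N} consumes
+a fresh pair of lines.  A minimal sketch that joins consecutive lines
+in pairs (it assumes @command{seq} is available and an even number of
+input lines; the variable name is illustrative):

```shell
# N appends a newline and the next input line to the pattern space;
# the s command then replaces that embedded newline with a space.
pairs=$(seq 6 | sed 'N;s/\n/ /')
printf '%s\n' "$pairs"
```

+With six input lines this yields three output lines: @samp{1 2},
+@samp{3 4}, and @samp{5 6}.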
+@cindex processing paragraphs
+@cindex paragraphs, processing
+A common technique to process blocks of text such as paragraphs
+(instead of line-by-line) is using the following construct:
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed '/./@{H;$!d@} ; x ; s/REGEXP/REPLACEMENT/'
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@enumerate
+@item
+The first expression, @code{/./@{H;$!d@}} operates on all non-empty lines,
+and adds the current line (in the pattern space) to the hold space.
+On all lines except the last, the pattern space is deleted and the cycle is
+restarted.
+@item
+The other expressions @code{x} and @code{s} are executed only on empty
+lines (i.e. paragraph separators). The @code{x} command fetches the
+accumulated lines from the hold space back to the pattern space. The
+@code{s///} command then operates on all the text in the paragraph
+(including the embedded newlines).
+@end enumerate
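+The same construct can act as a paragraph-wise filter.  The sketch
+below is an illustration rather than part of the original text: the
+sample input and the @samp{b1} pattern are made up, and the final
+command is changed from @code{s} to @code{/b1/!d}, which deletes every
+paragraph that does not match:

```shell
# Hypothetical sample input: three paragraphs separated by empty lines.
input=$(printf 'a1\na2\n\nb1\nb2\n\nc1\n')

# Accumulate each paragraph in the hold space; at a paragraph boundary
# (or on the last line) swap it in and delete it unless it contains "b1".
out=$(printf '%s\n' "$input" | sed '/./{H;$!d} ; x ; /b1/!d')

printf '%s\n' "$out"
```

+Only the second paragraph is printed, preceded by the empty line that
+@code{H} adds in front of the first accumulated line.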
+The following example demonstrates this technique:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ cat input.txt
+a a a aa aaa
+aaaa aaaa aa
+aaaa aaa aaa
+
+bbbb bbb bbb
+bb bb bbb bb
+bbbbbbbb bbb
+
+ccc ccc cccc
+cccc ccccc c
+cc cc cc cc
+$ sed '/./@{H;$!d@} ; x ; s/^/\nSTART-->/ ; s/$/\n<--END/' input.txt
+
+START-->
+a a a aa aaa
+aaaa aaaa aa
+aaaa aaa aaa
+<--END
+
+START-->
+bbbb bbb bbb
+bb bb bbb bb
+bbbbbbbb bbb
+<--END
+
+START-->
+ccc ccc cccc
+cccc ccccc c
+cc cc cc cc
+<--END
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+For more annotated examples, @pxref{Text search across multiple lines}
+and @ref{Line length adjustment}.
+@node Branching and flow control
+@section Branching and Flow Control
+The branching commands @code{b}, @code{t}, and @code{T} enable
+changing the flow of @command{sed} programs.
+By default, @command{sed} reads an input line into the pattern buffer,
+then continues to process all commands in order.
+Commands without addresses affect all lines.
+Commands with addresses affect only matching lines.
+@xref{Execution Cycle} and @ref{Addresses overview}.
+@command{sed} does not support a typical @code{if/then} construct.
+Instead, some commands can be used as conditionals or to change the
+default flow control:
+@table @code
+@item d
+delete (clear) the current pattern space,
+and restart the program cycle without processing the rest of the commands
+and without printing the pattern space.
+@item D
+delete the contents of the pattern space @emph{up to the first newline},
+and restart the program cycle without processing the rest of
+the commands and without printing the pattern space.
+@item [addr]X
+@itemx [addr]@{ X ; X ; X @}
+@itemx /regexp/X
+@itemx /regexp/@{ X ; X ; X @}
+Addresses and regular expressions can be used as an @code{if/then}
+conditional: If @var{[addr]} matches the current pattern space,
+execute the command(s).
+For example, the command @code{/^#/d} means:
+@emph{if} the current pattern space matches the regular expression @code{^#}
+(a line starting with a hash), @emph{then} execute the @code{d} command:
+delete the line without printing it, and restart the program cycle
+immediately.
+@item b
+branch unconditionally (that is: always jump to a label, skipping
+or repeating other commands, without restarting a new cycle). Combined
+with an address, the branch can be conditionally executed on matched
+lines.
+@item t
+branch conditionally (that is: jump to a label) @emph{only if} a
+@code{s///} command has succeeded since the last input line was read
+or another conditional branch was taken.
+@item T
+similar but opposite to the @code{t} command: branch only if
+there have been @emph{no} successful substitutions since the last
+input line was read.
+@end table
+The following two @command{sed} programs are equivalent.  The first
+(contrived) example uses the @code{b} command to skip the @code{s///}
+command on lines containing @samp{1}.  The second example uses an
+address with negation (@samp{!})  to perform substitution only on
+desired lines.  The @code{y///} command is still executed on all
+lines:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ printf '%s\n' a1 a2 a3 | sed -E '/1/bx ; s/a/z/ ; :x ; y/123/456/'
+a4
+z5
+z6
+$ printf '%s\n' a1 a2 a3 | sed -E '/1/!s/a/z/ ; y/123/456/'
+a4
+z5
+z6
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
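+The @code{t} command is often used to repeat a substitution until it
+stops matching.  As a minimal sketch (the @code{group_digits} function
+name is hypothetical, and the @samp{:a ; ... ; ta} one-line form assumes
+@value{SSED}), the following inserts thousands separators one comma per
+pass:

```shell
group_digits() {
    # Each pass inserts one comma before the last ungrouped run of three
    # digits; 't' jumps back to ':a' only if that 's' command succeeded.
    sed -E ':a ; s/([0-9])([0-9]{3})($|,)/\1,\2\3/ ; ta'
}

grouped=$(echo 1234567 | group_digits)
echo "$grouped"
```

+When the substitution finally fails, @code{t} does not branch, the end
+of the script is reached, and the pattern space is printed as usual.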
+@subsection Branching and Cycles
+@cindex labels
+@cindex omitting labels
+@cindex cycle, restarting
+@cindex restarting a cycle
+The @code{b}, @code{t} and @code{T} commands can be followed by a label
+(typically a single letter). Labels are defined with a colon followed by
+one or more letters (e.g. @samp{:x}). If the label is omitted the
+branch commands restart the cycle.  Note the difference between
+branching to a label and restarting the cycle: when a cycle is
+restarted, @command{sed} first prints the current content of the
+pattern space, then reads the next input line into the pattern space;
+jumping to a label (even if it is at the beginning of the program)
+does not print the pattern space and does not read the next input line.
+The following program is a no-op. The @code{b} command (the only command
+in the program) does not have a label, and thus simply restarts the cycle.
+On each cycle, the pattern space is printed and the next input line is read:
+@example
+@group
+$ seq 3 | sed b
+1
+2
+3
+@end group
+@end example
+@cindex infinite loop, branching
+@cindex branching, infinite loop
+The following example is an infinite loop: it does not terminate and
+does not print anything. The @code{b} command jumps to the @samp{x}
+label, and a new cycle is never started:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 3 | sed ':x ; bx'
+# The above command requires GNU sed (which supports additional
+# commands following a label, without a newline). A portable equivalent:
+#     sed -e ':x' -e bx
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@cindex branching and n, N
+@cindex n, and branching
+@cindex N, and branching
+Branching is often complemented with the @code{n} or @code{N} commands:
+both commands read the next input line into the pattern space without waiting
+for the cycle to restart. Before reading the next input line, @code{n}
+prints the current pattern space then empties it, while @code{N}
+appends a newline and the next input line to the pattern space.
+Consider the following two examples:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ seq 3 | sed ':x ; n ; bx'
+1
+2
+3
+
+$ seq 3 | sed ':x ; N ; bx'
+1
+2
+3
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@itemize
+@item
+Neither example loops forever, even though neither starts a new cycle.
+@item
+In the first example, the @code{n} command first prints the content
+of the pattern space, empties the pattern space, then reads the next
+input line.
+@item
+In the second example, the @code{N} command appends the next input
+line to the pattern space (with a newline).  Lines are accumulated in
+the pattern space until there are no more input lines to read, then
+the @code{N} command terminates the @command{sed} program. When the
+program terminates, the end-of-cycle actions are performed, and the
+entire pattern space is printed.
+@item
+The second example requires @value{SSED},
+because it uses the non-POSIX-standard behavior of @code{N}.
+See the ``@code{N} command on the last line'' paragraph
+in @ref{Reporting Bugs}.
+@item
+To further examine the difference between the two examples,
+try the following commands:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+printf '%s\n' aa bb cc dd | sed ':x ; n ; = ; bx'
+printf '%s\n' aa bb cc dd | sed ':x ; N ; = ; bx'
+printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx'
+printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx'
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@end itemize
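+For reference, the last pair of commands can be run as follows.  This
+sketch assumes @value{SSED} in its default (non-POSIX) mode; the
+variable names are only illustrative:

```shell
# 'n' prints each line before replacing it with the next one, so the
# pattern space never contains a newline and 's' never matches.
lower_n=$(printf '%s\n' aa bb cc dd | sed ':x ; n ; s/\n/***/ ; bx')

# 'N' keeps appending lines, so every embedded newline is replaced
# and a single joined line is printed when the input runs out.
upper_n=$(printf '%s\n' aa bb cc dd | sed ':x ; N ; s/\n/***/ ; bx')

printf '%s\n---\n%s\n' "$lower_n" "$upper_n"
```

+The first command prints the four input lines unchanged; the second
+prints the single line @samp{aa***bb***cc***dd}.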
+@subsection Branching example: joining lines
+@cindex joining lines with branching
+@cindex branching, joining lines
+@cindex quoted-printable lines, joining
+@cindex joining quoted-printable lines
+@cindex t, joining lines with
+@cindex b, joining lines with
+@cindex b, versus t
+@cindex t, versus b
+As a real-world example of using branching, consider the case of
+@uref{https://en.wikipedia.org/wiki/Quoted-printable,quoted-printable} files,
+typically used to encode email messages.
+In these files long lines are split and marked with a @dfn{soft line break}
+consisting of a single @samp{=} character at the end of the line:
+@example
+@group
+$ cat jaques.txt
+All the wor=
+ld's a stag=
+e,
+And all the=
+ men and wo=
+men merely =
+players:
+They have t=
+heir exits =
+and their e=
+ntrances;
+And one man=
+ in his tim=
+e plays man=
+y parts.
+@end group
+@end example
+The following program uses an address match @samp{/=$/} as a
+conditional: If the current pattern space ends with a @samp{=}, it
+reads the next input line using @code{N}, replaces all @samp{=}
+characters which are followed by a newline, and unconditionally
+branches (@code{b}) to the beginning of the program without restarting
+a new cycle. If the pattern space does not end with @samp{=}, the
+default action is performed: the pattern space is printed and a new
+cycle is started:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ sed ':x ; /=$/ @{ N ; s/=\n//g ; bx @}' jaques.txt
+All the world's a stage,
+And all the men and women merely players:
+They have their exits and their entrances;
+And one man in his time plays many parts.
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Here's an alternative program with a slightly different approach: On
+all lines except the last, @code{N} appends the line to the pattern
+space.  A substitution command then removes soft line breaks
+(@samp{=} at the end of a line, i.e. followed by a newline) by replacing
+them with an empty string.
+@emph{If} the substitution was successful (meaning the pattern space contained
+a line which should be joined), the conditional branch command @code{t} jumps
+to the beginning of the program without completing or restarting the cycle.
+If the substitution failed (meaning there were no soft line breaks),
+the @code{t} command will @emph{not} branch. Then, @code{P} will
+print the pattern space content until the first newline, and @code{D}
+will delete the pattern space content until the first newline.
+(To learn more about the @code{N}, @code{P} and @code{D} commands,
+@pxref{Multiline techniques}).
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ sed ':x ; $!N ; s/=\n// ; tx ; P ; D' jaques.txt
+All the world's a stage,
+And all the men and women merely players:
+They have their exits and their entrances;
+And one man in his time plays many parts.
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+For more line-joining examples, @pxref{Joining lines}.
 @node Examples
 …
 @menu
+Useful one-liners:
+* Joining lines::
 Some exotic examples:
 * Centering lines::
 …
 * Print bash environment::
 * Reverse chars of lines::
+* Text search across multiple lines::
+* Line length adjustment::
+* Adding a header to multiple files::
 Emulating standard utilities:
 …
 @end menu
+@node Joining lines
+@section Joining lines
+This section uses @code{N}, @code{D} and @code{P} commands to process
+multiple lines, and the @code{b} and @code{t} commands for branching.
+@xref{Multiline techniques} and @ref{Branching and flow control}.
+Join specific lines (e.g. if lines 2 and 3 need to be joined):
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ cat lines.txt
+hello
+hel
+lo
+hello
+$ sed '2@{N;s/\n//;@}' lines.txt
+hello
+hello
+hello
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Join backslash-continued lines:
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ cat 1.txt
+this \
+is \
+a \
+long \
+line
+and another \
+line
+$ sed -e ':x /\\$/ @{ N; s/\\\n//g ; bx @}'  1.txt
+this is a long line
+and another line
+# Note: the above requires GNU sed.
+#       Non-GNU seds need newlines after ':' and 'b'.
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Join lines that start with whitespace (e.g. SMTP headers):
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ cat 2.txt
+Subject: Hello
+    World
+Content-Type: multipart/alternative;
+    boundary=94eb2c190cc6370f06054535da6a
+Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
+Authentication-Results: mx.gnu.org;
+       dkim=pass header.i=@@gnu.org;
+       spf=pass
+Message-ID: <abcdef@@gnu.org>
+From: John Doe <jdoe@@gnu.org>
+To: Jane Smith <jsmith@@gnu.org>
+$ sed -E ':a ; $!N ; s/\n\s+/ / ; ta ; P ; D' 2.txt
+Subject: Hello World
+Content-Type: multipart/alternative; boundary=94eb2c190cc6370f06054535da6a
+Date: Tue, 3 Jan 2017 19:41:16 +0000 (GMT)
+Authentication-Results: mx.gnu.org; dkim=pass header.i=@@gnu.org; spf=pass
+Message-ID: <abcdef@@gnu.org>
+From: John Doe <jdoe@@gnu.org>
+To: Jane Smith <jsmith@@gnu.org>
+# A portable (non-GNU) variation:
+#   sed -e :a -e '$!N;s/\n  */ /;ta' -e 'P;D'
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
 @node Centering lines
 @section Centering Lines
 …
 @group
 # del leading and trailing spaces
 y/@kbd{tab}/ /
+# delete leading and trailing spaces
+y/@kbd{@key{TAB}}/ /
 s/^ *//
 s/ *$//
 …
 @group
 # replace all leading 9s by _ (any other character except digits, could
+# replace all trailing 9s by _ (any other character except digits, could
 # be used)
 :d
 …
 # incr last digit only.  The first line adds a most-significant
 # digit of 1 if we have to add a digit.
+#
-# The @code{tn} commands are not necessary, but make the thing
-# faster
 @end group
 …
 seen a script converting the output of @command{date} into a @command{bc}
 program!
 The main body of this is the @command{sed} script, which remaps the name
 from lower to upper (or vice-versa) and even checks out
+from lower to upper (or vice-versa) and even checks out
 if the remapped name is the same as the original name.
 Note how the script is parameterized using shell
 …
 @group
 #! /bin/sh
 # rename files to lower/upper case...
+# rename files to lower/upper case...
+#
 # usage:
 #    move-to-lower *
 #    move-to-upper *
+# usage:
+#    move-to-lower *
+#    move-to-upper *
 # or
 #    move-to-lower -R .
 …
 help()
 @{
         cat << eof
+        cat << eof
 Usage: $0 [-n] [-r] [-h] files...
 @end group
 …
 while :
 do
     case "$1" in
+    case "$1" in
         -n) apply_cmd='cat' ;;
         -R) finder='find "$@@" -type f';;
 …
 esac
 @end group
 eval $finder | sed -n '
 …
 @group
 # check if converted file name is equal to original file name,
 # if it is, do not print nothing
+# if it is, do not print anything
 /^.*\/\(.*\)\n\1/b
+@end group
+@group
+# escape special characters for the shell
+s/["$`\\]/\\&/g
 @end group
 …
 @c end---------------------------------------------
+@node Text search across multiple lines
+@section Text search across multiple lines
+This section uses @code{N} and @code{D} commands to search for
+consecutive words spanning multiple lines. @xref{Multiline techniques}.
+These examples deal with finding doubled occurrences of words in a document.
+Finding doubled words in a single line is easy using GNU @command{grep}
+and similarly with @value{SSED}:
+@c NOTE: in all examples, 'the@ the' is used to prevent
+@c 'make syntax-check' from complaining about double words.
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ cat two-cities-dup1.txt
+It was the best of times,
+it was the worst of times,
+it was the@ the age of wisdom,
+it was the age of foolishness,
+$ grep -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
+it was the@ the age of wisdom,
+$ grep -n -E '\b(\w+)\s+\1\b' two-cities-dup1.txt
+3:it was the@ the age of wisdom,
+$ sed -En '/\b(\w+)\s+\1\b/p' two-cities-dup1.txt
+it was the@ the age of wisdom,
+$ sed -En '/\b(\w+)\s+\1\b/@{=;p@}' two-cities-dup1.txt
+3
+it was the@ the age of wisdom,
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@itemize @bullet
+@item
+The regular expression @samp{\b\w+\s+} searches for word-boundary (@samp{\b}),
+followed by one-or-more word-characters (@samp{\w+}), followed by whitespace
+(@samp{\s+}). @xref{regexp extensions}.
+@item
+Adding parentheses around the @samp{(\w+)} expression creates a subexpression.
+The regular expression pattern @samp{(PATTERN)\s+\1} defines a subexpression
+(in the parentheses) followed by a back-reference, separated by whitespace.
+A successful match means the @var{PATTERN} was repeated twice in succession.
+@xref{Back-references and Subexpressions}.
+@item
+The word-boundary expression (@samp{\b}) at both ends ensures partial
+words are not matched (e.g. @samp{the then} is not a desired match).
+@c Thanks to Jim for pointing this out in
+@c https://lists.gnu.org/archive/html/sed-devel/2016-12/msg00041.html
+@item
+The @option{-E} option enables extended regular expression syntax, alleviating
+the need to add backslashes before the parentheses. @xref{ERE syntax}.
+@end itemize
+When the doubled word spans two lines, the above regular expression
+will not find it, as @command{grep} and @command{sed} operate line-by-line.
+By using @command{N} and @command{D} commands, @command{sed} can apply
+regular expressions on multiple lines (that is, multiple lines are stored
+in the pattern space, and the regular expression works on it):
+@c NOTE: use 'the@*the' instead of a real new line to prevent
+@c 'make syntax-check' to complain about doubled-words.
+@codequoteundirected on
+@codequotebacktick on
+@example
+$ cat two-cities-dup2.txt
+It was the best of times, it was the
+worst of times, it was the@*the age of wisdom,
+it was the age of foolishness,
+$ sed -En '@{N; /\b(\w+)\s+\1\b/@{=;p@} ; D@}'  two-cities-dup2.txt
+3
+worst of times, it was the@*the age of wisdom,
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@itemize @bullet
+@item
+The @command{N} command appends the next line to the pattern space
+(thus ensuring it contains two consecutive lines in every cycle).
+@item
+The regular expression uses @samp{\s+} for word separator which matches
+both spaces and newlines.
+@item
+If the regular expression matches, the line number is printed with
+@command{=} and the entire pattern space is printed with @command{p}.
+No lines are printed by default due to the @option{-n} option.
+@item
+The @command{D} command removes the first line from the pattern space
+(up until the first newline), readying it for the next cycle.
+@end itemize
+See the GNU @command{coreutils} manual for an alternative solution using
+@command{tr -s} and @command{uniq} at
+@c NOTE: cheating and keeping the URL line shorter than 80 characters
+@c by using 'gnu.org' and '/s/'.
+@url{https://gnu.org/s/coreutils/manual/html_node/Squeezing-and-deleting.html}.
+@node Line length adjustment
+@section Line length adjustment
+This section uses @code{N} and @code{P} commands to read and write
+lines, and the @code{b} command for branching.
+@xref{Multiline techniques} and @ref{Branching and flow control}.
+This (somewhat contrived) example deals with formatting and wrapping
+lines of text of the following input file:
+@example
+@group
+$ cat two-cities-mix.txt
+It was the best of times, it was
+the worst of times, it
+was the age of
+wisdom,
+it
+was
+the age
+of foolishness,
+@end group
+@end example
+@exdent The following sed program wraps lines at 40 characters:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ cat wrap40.sed
+# outer loop
+:x
+# Append a newline followed by the next input line to the pattern buffer
+N
+# Replace all newlines in the pattern buffer with spaces
+s/\n/ /g
+# Inner loop
+:y
+# Add a newline after the first 40 characters
+s/(.@{40,40@})/\1\n/
+# If there is a newline in the pattern buffer
+# (i.e. the previous substitution added a newline)
+/\n/ @{
+    # There are newlines in the pattern buffer -
+    # print the content until the first newline.
+    P
+    # Remove the printed characters and the first newline
+    s/.*\n//
+    # branch to label 'y' - repeat inner loop
+    by
+@}
+# No newlines in the pattern buffer - Branch to label 'x' (outer loop)
+# and read the next input line
+bx
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@exdent The wrapped output:
+@codequoteundirected on
+@codequotebacktick on
+@example
+@group
+$ sed -E -f wrap40.sed two-cities-mix.txt
+It was the best of times, it was the wor
+st of times, it was the age of wisdom, i
+t was the age of foolishness,
+@end group
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@node Adding a header to multiple files
+@section Adding a header to multiple files
+@value{SSED} can be used to safely modify multiple files at once.
+@exdent Add a single line to the beginning of source code files:
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed -i '1i/* Copyright (C) FOO BAR */' *.c
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@exdent Adding a few lines is possible using @samp{\n} in the text:
+@codequoteundirected on
+@codequotebacktick on
+@example
+sed -i '1i/*\n * Copyright (C) FOO BAR\n * Created by Jane Doe\n */' *.c
+@end example
+@codequoteundirected off
+@codequotebacktick off
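+One caveat worth checking before relying on @code{1i}: an empty file has
+no first line, so no cycle runs and nothing is inserted.  The sketch
+below (assuming @value{SSED}, @command{mktemp} and scratch files whose
+names are made up for the illustration) shows both cases:

```shell
# Work in a scratch directory so that no real source files are touched.
dir=$(mktemp -d)
printf 'int main(void) { return 0; }\n' > "$dir/a.c"
: > "$dir/empty.c"    # zero-length file

# GNU one-line form of 'i': the text follows the command directly.
sed -i '1i/* Copyright (C) FOO BAR */' "$dir"/*.c

first_line=$(head -n 1 "$dir/a.c")
if [ -s "$dir/empty.c" ]; then empty_changed=yes; else empty_changed=no; fi

echo "a.c now starts with: $first_line"
echo "empty.c modified: $empty_changed"
rm -rf "$dir"
```

+The header is added to @file{a.c}, while @file{empty.c} remains a
+zero-length file.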
+To add multiple lines from another file, use @code{0rFILE}.
+A typical use case is adding a license notice header to all files:
+@codequoteundirected on
+@codequotebacktick on
+@example
+## Create the header file:
+$ cat<<'EOF'>LIC.TXT
+/*
+    Copyright (C) 1989-2021 FOO BAR
+    This program is free software; you can redistribute it and/or modify
+    it under the terms of the GNU General Public License as published by
+    the Free Software Foundation; either version 3, or (at your option)
+    any later version.
+    This program is distributed in the hope that it will be useful,
+    but WITHOUT ANY WARRANTY; without even the implied warranty of
+    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+    GNU General Public License for more details.
+    You should have received a copy of the GNU General Public License
+    along with this program; If not, see <https://www.gnu.org/licenses/>.
+*/
+EOF
+## Add the file at the beginning of all source code files:
+$ sed -i '0rLIC.TXT' *.cpp *.h
+@end example
+@codequoteundirected off
+@codequotebacktick off
+With script files (e.g. @file{.sh}, @file{.py}, @file{.pl} files)
+the license notice typically appears @emph{after} the first line (the
+``shebang'' @samp{#!} line). The @code{1rFILE} command will add @file{FILE}
+@emph{after} the first line:
+@codequoteundirected on
+@codequotebacktick on
+@example
+## Create the header file:
+$ cat<<'EOF'>LIC.TXT
+##
+## Copyright (C) 1989-2021 FOO BAR
+##
+## This program is free software; you can redistribute it and/or modify
+## it under the terms of the GNU General Public License as published by
+## the Free Software Foundation; either version 3, or (at your option)
+## any later version.
+##
+## This program is distributed in the hope that it will be useful,
+## but WITHOUT ANY WARRANTY; without even the implied warranty of
+## MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+## GNU General Public License for more details.
+##
+## You should have received a copy of the GNU General Public License
+## along with this program; If not, see <https://www.gnu.org/licenses/>.
+##
+##
+EOF
+## Add the file after the first line of all script files:
+$ sed -i '1rLIC.TXT' *.py *.sh
+@end example
+@codequoteundirected off
+@codequotebacktick off
+The above @command{sed} commands can be combined with @command{find}
+to locate files in all subdirectories, @command{xargs} to run additional
+commands on selected files and @command{grep} to filter out files that already
+contain a copyright notice:
+@codequoteundirected on
+@codequotebacktick on
+@example
+find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) \
+    | xargs grep -Li copyright \
+    | xargs -r sed -i '0rLIC.TXT'
+@end example
+@codequoteundirected off
+@codequotebacktick off
+@exdent Or a slightly safer version (handling file names with spaces and newlines):
+@codequoteundirected on
+@codequotebacktick on
+@example
+find \( -iname '*.cpp' -o -iname '*.c' -o -iname '*.h' \) -print0 \
+    | xargs -0 grep -Z -Li copyright \
+    | xargs -0 -r sed -i '0rLIC.TXT'
+@end example
+@codequoteundirected off
+@codequotebacktick off
+Note: using the @code{0} address with @code{r} command requires @value{SSED}
+version 4.9 or later. @xref{Zero Address}.
 @node tac
 @section Reverse Lines of Files
 …
 is a @command{tac} workalike.
+Note that on implementations other than @acronym{GNU} @command{sed}
+@ifset PERL
+and @value{SSED}
+@end ifset
+Note that on implementations other than GNU @command{sed}
 this script might easily overflow internal buffers.
 …
 This script replaces @samp{cat -n}; in fact it formats its output
 exactly like @acronym{GNU} @command{cat} does.
+exactly like GNU @command{cat} does.
 Of course this is completely useless and for two reasons:  first,
 …
 @group
 # Convert words to a's
 s/[ @kbd{tab}][ @kbd{tab}]*/ /g
+s/[ @kbd{@key{TAB}}][ @kbd{@key{TAB}}]*/ /g
 s/^/ /
 s/ [^ ][^ ]*/a /g
 …
 @c end---------------------------------------------
 As you can see, we mantain a 2-line window using @code{P} and @code{D}.
+As you can see, we maintain a 2-line window using @code{P} and @code{D}.
 This technique is often used in advanced @command{sed} scripts.
 …
 fastest.  Note that loops are completely done with @code{n} and
 @code{b}, without relying on @command{sed} to restart the
 the script automatically at the end of a line.
+script automatically at the end of a line.
 @c start-------------------------------------------
 …
 # get next
+n
 # got chars? print it again, etc...
+# got chars? print it again, etc...
 /./bx
 @end group
 …
 @chapter @value{SSED}'s Limitations and Non-limitations
 @cindex @acronym{GNU} extensions, unlimited line length
+@cindex GNU extensions, unlimited line length
 @cindex Portability, line length limitations
 For those who want to write portable @command{sed} scripts,
 …
 the size of the buffer that can be processed by certain patterns.
-@ifset PERL
-There are some size limitations in the regular expression
-matcher but it is hoped that they will never in practice
-be relevant.  The maximum length of a compiled pattern
-is 65539 (sic) bytes.  All values in repeating quantifiers
-must be less than 65536.  The maximum nesting depth of
-all parenthesized subpatterns, including capturing and
-non-capturing subpatterns@footnote{The
-distinction is meaningful when referring to Perl-style
-regular expressions.}, assertions, and other types of
-subpattern, is 200.
-Also, @value{SSED} recognizes the @sc{posix} syntax
-@code{[.@var{ch}.]} and @code{[=@var{ch}=]}
-where @var{ch} is a ``collating element'', but these
-are not supported, and an error is given if they are
-encountered.
-Here are a few distinctions between the real Perl-style
-regular expressions and those that @option{-R} recognizes.
-@enumerate
-@item
-Lookahead assertions do not allow repeat quantifiers after them
-Perl permits them, but they do not mean what you
-might think. For example, @samp{(?!a)@{3@}} does not assert that the
-next three characters are not @samp{a}. It just asserts three times that the
-next character is not @samp{a} --- a waste of time and nothing else.
-@item
-Capturing subpatterns that occur inside  negative  lookahead
-head  assertions  are  counted,  but  their  entries are counted
-as empty in the second half of an @code{s} command.
-Perl sets its numerical variables from any such patterns
-that are matched before the assertion fails to match
-something (thereby succeeding), but only if the negative
-lookahead assertion contains just one branch.
-@item
-The following Perl escape sequences are not supported:
-@samp{\l}, @samp{\u}, @samp{\L}, @samp{\U}, @samp{\E},
-@samp{\Q}. In fact these are implemented by Perl's general
-string-handling and are not part of its pattern matching engine.
-@item
-The Perl @samp{\G} assertion is not supported as it is not
-relevant to single pattern matches.
-@item
-Fairly obviously, @value{SSED} does not support the @samp{(?@{code@})}
-and @samp{(?p@{code@})} constructions. However, there is some experimental
-support for recursive patterns using the non-Perl item @samp{(?R)}.
-@item
-There are at the time of writing some oddities in Perl
-5.005_02 concerned with the settings of captured strings
-when part of a pattern is repeated. For example, matching
-@samp{aba} against the pattern @samp{/^(a(b)?)+$/} sets
-@samp{$2}@footnote{@samp{$2} would be @samp{\2} in @value{SSED}.}
-to the value @samp{b}, but matching @samp{aabbaa}
-against @samp{/^(aa(bb)?)+$/} leaves @samp{$2}
-unset.  However, if the pattern is changed to
-@samp{/^(aa(b(b))?)+$/} then @samp{$2} (and @samp{$3}) are set.
-In Perl 5.004 @samp{$2} is set in both cases, and that is also
-true of @value{SSED}.
-@item
-Another as yet unresolved discrepancy is that in Perl
-5.005_02 the pattern @samp{/^(a)?(?(1)a|b)+$/} matches
-the string @samp{a}, whereas in @value{SSED} it does not.
-However, in both Perl and @value{SSED} @samp{/^(a)?a/} matched
-against @samp{a} leaves $1 unset.
-@end enumerate
-@end ifset
 @node Other Resources
 @chapter Other Resources for Learning About @command{sed}
+For up-to-date information about @value{SSED} please
+visit @uref{https://www.gnu.org/software/sed/}.
+Send general questions and suggestions to @email{sed-devel@@gnu.org}.
+Visit the mailing list archives for past discussions at
+@uref{https://lists.gnu.org/archive/html/sed-devel/}.
 @cindex Additional reading about @command{sed}
+In addition to several books that have been written about @command{sed}
+(either specifically or as chapters in books which discuss
+shell programming), one can find out more about @command{sed}
+(including suggestions of a few books) from the FAQ
+for the @code{sed-users} mailing list, available from any of:
+@display
+ @uref{http://www.student.northpark.edu/pemente/sed/sedfaq.html}
+ @uref{http://sed.sf.net/grabbag/tutorials/sedfaq.html}
+@end display
+Also of interest are
+@uref{http://www.student.northpark.edu/pemente/sed/index.htm}
+and @uref{http://sed.sf.net/grabbag},
+which include @command{sed} tutorials and other @command{sed}-related goodies.
+The @code{sed-users} mailing list itself maintained by Sven Guckes.
+To subscribe, visit @uref{http://groups.yahoo.com} and search
+for the @code{sed-users} mailing list.
+The following resources provide information about @command{sed}
+(both @value{SSED} and other variations). Note that these are not
+maintained by @value{SSED} developers.
+@itemize @bullet
+@item
+sed @code{$HOME}: @uref{http://sed.sf.net}
+@item
+sed FAQ: @uref{http://sed.sf.net/sedfaq.html}
+@item
+seder's grabbag: @uref{http://sed.sf.net/grabbag}
+@item
+The @code{sed-users} mailing list maintained by Sven Guckes:
+@uref{http://groups.yahoo.com/group/sed-users/}
+(note this is @emph{not} the @value{SSED} mailing list).
+@end itemize
 @node Reporting Bugs
 …
 @cindex Bugs, reporting
+Email bug reports to @email{bonzini@@gnu.org}.
+Be sure to include the word ``sed'' somewhere in the @code{Subject:} field.
+Email bug reports to @email{bug-sed@@gnu.org}.
 Also, please include the output of @samp{sed --version} in the body
 of your report if at all possible.
 …
 @example
 @i{while building frobme-1.3.4}
 $ configure
+@i{@i{@r{while building frobme-1.3.4}}}
+$ configure
 @error{} sed: file sedscr line 1: Unknown option to 's'
 @end example
 …
 @table @asis
+@anchor{N_command_last_line}
 @item @code{N} command on the last line
 @cindex Portability, @code{N} command on the last line
 …
 the @command{-n} command switch has been specified.  This choice is
 by design.
+Default behavior (GNU extension, non-POSIX conforming):
+@example
+$ seq 3 | sed N
+1
+2
+3
+@end example
+@noindent
+To force POSIX-conforming behavior:
+@example
+$ seq 3 | sed --posix N
+1
+2
+@end example
 For example, the behavior of
 …
 /foo/@{ N;N;N;N;N;N;N;N;N; @}
 @end example
 @cindex @code{POSIXLY_CORRECT} behavior, @code{N} command
 In any case, the simplest workaround is to use @code{$d;N} in
 …
 @item Regex syntax clashes (problems with backslashes)
 @cindex @acronym{GNU} extensions, to basic regular expressions
+@cindex GNU extensions, to basic regular expressions
 @cindex Non-bugs, regex syntax clashes
 @command{sed} uses the @sc{posix} basic regular expression syntax.  According to
 …
 @code{\>}, @code{\b}, @code{\B}, @code{\w}, and @code{\W}.
 As in all @acronym{GNU} programs that use @sc{posix} basic regular
+As in all GNU programs that use @sc{posix} basic regular
 expressions, @command{sed} interprets these escape sequences as special
 characters.  So, @code{x\+} matches one or more occurrences of @samp{x}.
 …
 spurious backslashes if they are to be used with modern implementations
 of @command{sed}, like
+@ifset PERL
+@value{SSED} or
+@end ifset
+@acronym{GNU} @command{sed}.
+GNU @command{sed}.
 On the other hand, some scripts use s|abc\|def||g to remove occurrences
 …
 @command{sed} 4.0.x, newer versions interpret this as removing the
 string @code{abc|def}.  This is again undefined behavior according to
 @acronym{POSIX}, and this interpretation is arguably more robust: older
+POSIX, and this interpretation is arguably more robust: older
 @command{sed}s, for example, required that the regex matcher parsed
 @code{\/} as @code{/} in the common case of escaping a slash, which is
 …
 because the regex matcher is only partially under our control.
 @cindex @acronym{GNU} extensions, special escapes
+@cindex GNU extensions, special escapes
 In addition, this version of @command{sed} supports several escape characters
 (some of which are multi-character) to insert non-printable characters
 …
 (@pxref{Invoking sed, , Invocation}) lets you clobber
 protected files.  This is not a bug, but rather a consequence
 of how the Unix filesystem works.
+of how the Unix file system works.
 The permissions on a file say what can happen to the data
 …
 modifying the contents of the directory, so the operation depends on
 the permissions of the directory, not of the file.  For this same
 reason, @command{sed} does not let you use @option{-i} on a writeable file
 in a read-only directory (but unbelievably nobody reports that as a
 bug@dots{}).
+reason, @command{sed} does not let you use @option{-i} on a writable file
+in a read-only directory, and will break hard or symbolic links when
+@option{-i} is used on such a file.
 @item @code{0a} does not work (gives an error)
+@cindex @code{0} address
+@cindex GNU extensions, @code{0} address
+@cindex Non-bugs, @code{0} address
 There is no line 0.  0 is a special address that is only used to treat
 addresses like @code{0,/@var{RE}/} as active when the script starts: if
 you write @code{1,/abc/d} and the first line includes the word @samp{abc},
+you write @code{1,/abc/d} and the first line includes the string @samp{abc},
 then that match would be ignored because address ranges must span at least
 two lines (barring the end of the file); but what you probably wanted is
 …
 @ifclear PERL
 @item @code{[a-z]} is case insensitive
+@cindex Non-bugs, localization-related
 You are encountering problems with locales.  POSIX mandates that @code{[a-z]}
 uses the current locale's collation order -- in C parlance, that means using
 @code{strcoll(3)} instead of @code{strcmp(3)}.  Some locales have a
+case-insensitive collation order, others don't: one of those that have
+problems is Estonian.
+case-insensitive collation order, others don't.
 Another problem is that @code{[a-z]} tries to use collation symbols.
 This only happens if you are on the @acronym{GNU} system, using
 @acronym{GNU} libc's regular expression matcher instead of compiling the
 one supplied with @acronym{GNU} sed.  In a Danish locale, for example,
+This only happens if you are on the GNU system, using
+GNU libc's regular expression matcher instead of compiling the
+one supplied with GNU sed.  In a Danish locale, for example,
 the regular expression @code{^[a-z]$} matches the string @samp{aa},
 because this is a single collating symbol that comes after @samp{a}
 …
 To work around these problems, which may cause bugs in shell scripts, set
 the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}.
+@item @code{s/.*//} does not clear pattern space
+@cindex Non-bugs, localization-related
+@cindex @value{SSEDEXT}, emptying pattern space
+@cindex Emptying pattern space
+This happens if your input stream includes invalid multibyte
+sequences.  @sc{posix} mandates that such sequences
+are @emph{not} matched by @samp{.}, so that @samp{s/.*//} will not clear
+pattern space as you would expect.  In fact, there is no way to clear
+sed's buffers in the middle of the script in most multibyte locales
+(including UTF-8 locales).  For this reason, @value{SSED} provides a `z'
+command (for `zap') as an extension.
+To work around these problems, which may cause bugs in shell scripts, set
+the @env{LC_COLLATE} and @env{LC_CTYPE} environment variables to @samp{C}.
 @end ifclear
 @end table
+@node Extended regexps
+@appendix Extended regular expressions
+@cindex Extended regular expressions, syntax
+The only difference between basic and extended regular expressions is in
+the behavior of a few characters: @samp{?}, @samp{+}, parentheses,
+and braces (@samp{@{@}}).  While basic regular expressions require
+these to be escaped if you want them to behave as special characters,
+when using extended regular expressions you must escape them if
+you want them @emph{to match a literal character}.
+@noindent
+Examples:
+@table @code
+@item abc?
+becomes @samp{abc\?} when using extended regular expressions.  It matches
+the literal string @samp{abc?}.
+@item c\+
+becomes @samp{c+} when using extended regular expressions.  It matches
+one or more @samp{c}s.
+@item a\@{3,\@}
+becomes @samp{a@{3,@}} when using extended regular expressions.  It matches
+three or more @samp{a}s.
+@item \(abc\)\@{2,3\@}
+becomes @samp{(abc)@{2,3@}} when using extended regular expressions.  It
+matches either @samp{abcabc} or @samp{abcabcabc}.
+@item \(abc*\)\1
+becomes @samp{(abc*)\1} when using extended regular expressions.
+Backreferences must still be escaped when using extended regular
+expressions.
+@end table
+@ifset PERL
+@node Perl regexps
+@appendix Perl-style regular expressions
+@cindex Perl-style regular expressions, syntax
+@emph{This part is taken from the @file{pcre.txt} file distributed together
+with the free @sc{pcre} regular expression matcher; it was written by Philip Hazel.}
+Perl introduced several extensions to regular expressions, some
+of them incompatible with the syntax of regular expressions
+accepted by Emacs and other @acronym{GNU} tools (whose matcher was
+based on the Emacs matcher).  @value{SSED} implements
+both kinds of extensions.
+@iftex
+Summarizing, we have:
+@itemize @bullet
+@item
+A backslash can introduce several special sequences
+@item
+The circumflex, dollar sign, and period characters behave specially
+with regard to new lines
+@item
+Strange uses of square brackets are parsed differently
+@item
+You can toggle modifiers in the middle of a regular expression
+@item
+You can specify that a subpattern does not count when numbering backreferences
+@item
+@cindex Greedy regular expression matching
+You can specify greedy or non-greedy matching
+@item
+You can have more than ten back references
+@item
+You can do complex look aheads and look behinds (in the spirit of
+@code{\b}, but with subpatterns).
+@item
+You can often improve performance by preventing @command{sed} from wasting
+time on backtracking
+@item
+You can have if/then/else branches
+@item
+You can do recursive matches, for example to look for unbalanced parentheses
+@item
+You can have comments and non-significant whitespace, because things can
+get complex...
+@end itemize
+Most of these extensions are introduced by the special @code{(?}
+sequence, which gives special meanings to parenthesized groups.
+@end iftex
+@menu
+Other extensions can be roughly subdivided into two categories.
+On one hand, Perl introduces several more escaped sequences
+(that is, sequences introduced by a backslash).  On the other
+hand, it specifies that if a question mark follows an open
+parenthesis it should give a special meaning to the parenthesized
+group.
+* Backslash::                       Introduces special sequences
+* Circumflex/dollar sign/period::   Behave specially with regard to new lines
+* Square brackets::                 Are a bit different in strange cases
+* Options setting::                 Toggle modifiers in the middle of a regexp
+* Non-capturing subpatterns::       Are not counted when backreferencing
+* Repetition::                      Allows for non-greedy matching
+* Backreferences::                  Allows for more than 10 back references
+* Assertions::                      Allows for complex look ahead matches
+* Non-backtracking subpatterns::    Often gives more performance
+* Conditional subpatterns::         Allows if/then/else branches
+* Recursive patterns::              For example to match parentheses
+* Comments::                        Because things can get complex...
+@end menu
+@node Backslash
+@appendixsec Backslash
+@cindex Perl-style regular expressions, escaped sequences
+There are a few differences in the handling of backslashed
+sequences in Perl mode.
+First of all, there are no @code{\o} and @code{\d} sequences.
+@sc{ascii} values for characters can be specified in octal
+with a @code{\@var{xxx}} sequence, where @var{xxx} is a
+sequence of up to three octal digits.  If the first digit
+is a zero, the treatment of the sequence is straightforward;
+just note that if the character that follows the escaped digit
+is itself an octal digit, you have to supply three octal digits
+for @var{xxx}.  For example @code{\07} is a @sc{bel} character
+rather than a @sc{nul} and a literal @code{7} (this sequence is
+instead represented by @code{\0007}).
+@cindex Perl-style regular expressions, backreferences
+The handling of a backslash followed by a digit other than 0
+is complicated.  Outside a character class, @command{sed} reads it
+and any following digits as a decimal number. If the number
+is less than 10, or if there have been at least that many
+previous capturing left parentheses in the expression, the
+entire sequence is taken as a back reference. A description
+of how this works is given later, following the discussion
+of parenthesized subpatterns.
+Inside a character class, or if the decimal number is
+greater than 9 and there have not been that many capturing
+subpatterns, @command{sed} re-reads up to three octal digits following
+the backslash, and generates a single byte from the
+least significant 8 bits of the value. Any subsequent digits
+stand for themselves.  For example:
+@example
+     \040  @i{is another way of writing a space}
+     \40   @i{is the same, provided there are fewer than 40}
+           @i{previous capturing subpatterns}
+     \7    @i{is always a back reference}
+     \011  @i{is always a tab}
+     \11   @i{might be a back reference, or another way of}
+           @i{writing a tab}
+     \0113 @i{is a tab followed by the character @samp{3}}
+     \113  @i{is the character with octal code 113 (since there}
+           @i{can be no more than 99 back references)}
+     \377  @i{is a byte consisting entirely of 1 bits (@sc{ascii} 255)}
+     \81   @i{is either a back reference, or a binary zero}
+           @i{followed by the two characters @samp{81}}
+@end example
+Note that octal values of 100 or greater must not be introduced
+by a leading zero, because no more than three octal
+digits are ever read.
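The octal rules above can be tried out in a short sketch. It uses Python's `re` module as a stand-in for the Perl-mode matcher, which is an assumption: only escapes that Python reads the same way are shown, and the variable names are illustrative.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
space = re.fullmatch(r"\040", " ")          # leading zero, three octal digits: a space
tab = re.fullmatch(r"\011", "\t")           # always a tab
tab_then_3 = re.fullmatch(r"\0113", "\t3")  # only three digits are read; '3' is literal
letter_k = re.fullmatch(r"\113", "K")       # octal 113 is decimal 75, the letter K
bell = re.fullmatch(r"\07", "\a")           # BEL, not NUL followed by a literal 7
nul_then_7 = re.fullmatch(r"\07", "\x00" + "7")  # so this subject does not match
```

Escapes such as `\40` or `\11`, whose meaning depends on the number of capturing groups, are left out because Python resolves them differently.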
+All the sequences that define a single byte value can be
+used both inside and outside character classes. In addition,
+inside a character class, the sequence @code{\b} is interpreted
+as the backspace character (hex 08). Outside a character
+class it has a different meaning (see below).
+In addition, there are four additional escapes specifying
+generic character classes (like @code{\w} and @code{\W} do):
+@cindex Perl-style regular expressions, character classes
+@table @samp
+@item \d
+Matches any decimal digit
+@item \D
+Matches any character that is not a decimal digit
+@end table
+In Perl mode, these character type sequences can appear both inside and
+outside character classes. Instead, in @sc{posix} mode these sequences
+(as well as @code{\w} and @code{\W}) are treated as two literal characters
+(a backslash and a letter) inside square brackets.
+Escaped sequences specifying assertions are also different in
+Perl mode.  An assertion specifies a condition that has to be met
+at a particular point in a match, without consuming any
+characters from the subject string. The use of subpatterns
+for more complicated assertions is described below.  The
+backslashed assertions are
+@cindex Perl-style regular expressions, assertions
+@table @samp
+@item \b
+Asserts that the point is at a word boundary.
+A word boundary is a position in the subject string where
+the current character and the previous character do not both
+match @code{\w} or @code{\W} (i.e. one matches @code{\w} and
+the other matches @code{\W}), or the start or end of the string
+if the first or last character matches @code{\w}, respectively.
+@item \B
+Asserts that the point is not at a word boundary.
+@item \A
+Asserts the matcher is at the start of pattern space (independent
+of multiline mode).
+@item \Z
+Asserts the matcher is at the end of pattern space,
+or at a newline before the end of pattern space (independent of
+multiline mode)
+@item \z
+Asserts the matcher is at the end of pattern space (independent
+of multiline mode)
+@end table
+These assertions may not appear in character classes (but
+note that @code{\b} has a different meaning, namely the
+backspace character, inside a character class).
+Note that Perl mode does not directly support assertions
+for the beginning and the end of word; the @acronym{GNU} extensions
+@code{\<} and @code{\>} achieve this purpose in @sc{posix} mode
+instead.
+The @code{\A}, @code{\Z}, and @code{\z} assertions differ
+from the traditional circumflex and dollar sign (described below)
+in that they only ever match at the very start and end of the
+subject string, whatever options are set; in particular @code{\A}
+and @code{\z} are the same as the @acronym{GNU} extensions
+@code{\`} and @code{\'} that are active in @sc{posix} mode.
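A minimal sketch of the backslashed assertions follows. Python's `re` module is assumed as a stand-in for the Perl-mode matcher. Python spells the end-of-string assertions differently, so only `\b`, `\B` and `\A` are shown, and the sample text is made up.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
text = "one\ntwo words"

# \b holds only where a word character meets a non-word character.
boundary_starts = [m.start() for m in re.finditer(r"\bwo", text)]

# \B is the opposite: the 'wo' inside "two" is not at a word boundary.
inner_starts = [m.start() for m in re.finditer(r"\Bwo", text)]

# The circumflex follows the multiline setting, \A never does.
caret_hits = re.findall(r"^\w+", text, re.MULTILINE)
anchor_hits = re.findall(r"\A\w+", text, re.MULTILINE)
```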
+@node Circumflex/dollar sign/period
+@appendixsec Circumflex, dollar sign, period
+@cindex Perl-style regular expressions, newlines
+Outside a character class, in the default matching mode, the
+circumflex character is an assertion which is true only if
+the current matching point is at the start of the subject
+string.  Inside a character class, the circumflex has an entirely
+different meaning (see below).
+The circumflex need not be the first character of the pattern if
+a number of alternatives are involved, but it should be the
+first thing in each alternative in which it appears if the
+pattern is ever to match that branch. If all possible alternatives
+start with a circumflex, that is, if the pattern is
+constrained to match only at the start of the subject, it is
+said to be an @dfn{anchored} pattern. (There are also other
+constructs that can cause a pattern to be anchored.)
+A dollar sign is an assertion which is true only if the
+current matching point is at the end of the subject string,
+or immediately before a newline character that is the last
+character in the string (by default).  A dollar sign need not be the
+last character of the pattern if a number of alternatives
+are involved, but it should be the last item in any branch
+in which it appears.  A dollar sign has no special meaning in a
+character class.
+@cindex Perl-style regular expressions, multiline
+The meanings of the circumflex and dollar sign characters are
+changed if the @code{M} modifier option is used. When this is
+the case, they match immediately after and immediately
+before an internal @code{\n} character, respectively, in addition
+to matching at the start and end of the subject string.  For
+example, the pattern @code{/^abc$/} matches the subject string
+@samp{def\nabc} in multiline mode, but not otherwise.  Consequently,
+patterns that are anchored in single line mode
+because all branches start with @code{^} are not anchored in
+multiline mode.
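The multiline example from the text can be checked with a small sketch. It assumes that Python's `re.MULTILINE` flag plays the role of the `M` modifier.

```python
import re

# Assumption: re.MULTILINE corresponds to the M modifier described above.
subject = "def\nabc"

single_line = re.search(r"^abc$", subject)               # ^ only matches at offset 0
multi_line = re.search(r"^abc$", subject, re.MULTILINE)  # ^ also matches after "\n"
```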
+@cindex Perl-style regular expressions, multiline
+Note that the sequences @code{\A}, @code{\Z}, and @code{\z}
+can be used to match the start and end of the subject in both
+modes, and if all branches of a pattern start with @code{\A}
+it is always anchored, whether the @code{M} modifier is set or not.
+@cindex Perl-style regular expressions, single line
+Outside a character class, a dot in the pattern matches any
+one character in the subject, including a non-printing character,
+but not (by default) newline.  If the @code{S} modifier is used,
+dots match newlines as well.  Actually, the handling of
+dot is entirely independent of the handling of circumflex
+and dollar sign, the only relationship being that they both
+involve newline characters. Dot has no special meaning in a
+character class.
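As a hedged illustration of the dot rules, the sketch below assumes that Python's `re.DOTALL` flag corresponds to the `S` modifier.

```python
import re

# Assumption: re.DOTALL corresponds to the S modifier described above.
default_dot = re.fullmatch(r"a.b", "a\nb")            # dot does not match newline
dotall_dot = re.fullmatch(r"a.b", "a\nb", re.DOTALL)  # with S, it does

# Inside a character class the dot is an ordinary character.
class_dot_other = re.fullmatch(r"[.]", "x")
class_dot_literal = re.fullmatch(r"[.]", ".")
```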
+@node Square brackets
+@appendixsec Square brackets
+@cindex Perl-style regular expressions, character classes
+An opening square bracket introduces a character class, terminated
+by a closing square bracket.  A closing square bracket on its own
+is not special.  If a closing square bracket is required as a
+member of the class, it should be the first data character in
+the class (after an initial circumflex, if present) or escaped with a backslash.
+A character class matches a single character in the subject;
+the character must be in the set of characters defined by
+the class, unless the first character in the class is a circumflex,
+in which case the subject character must not be in
+the set defined by the class. If a circumflex is actually
+required as a member of the class, ensure it is not the
+first character, or escape it with a backslash.
+For example, the character class @code{[aeiou]} matches any lower
+case vowel, while @code{[^aeiou]} matches any character that is not
+a lower case vowel. Note that a circumflex is just a convenient
+notation for specifying the characters which are in
+the class by enumerating those that are not. It is not an
+assertion: it still consumes a character from the subject
+string, and fails if the current pointer is at the end of
+the string.
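The point that a negated class still consumes a character can be seen in a short sketch. Python's `re` module is assumed as a stand-in for the Perl-mode matcher.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
vowel = re.fullmatch(r"[aeiou]", "e")
consonant = re.fullmatch(r"[^aeiou]", "k")

# A negated class is not an assertion: with nothing left to consume after
# the 'x', the match fails.
at_end = re.search(r"x[^aeiou]", "x")

# A circumflex that is not the first character is an ordinary class member.
literal_caret = re.fullmatch(r"[a^]", "^")
```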
+@cindex Perl-style regular expressions, case-insensitive
+When caseless matching is set, any letters in a class
+represent both their upper case and lower case versions, so
+for example, a caseless @code{[aeiou]} matches uppercase
+and lowercase @samp{A}s, and a caseless @code{[^aeiou]}
+does not match @samp{A}, whereas a case-sensitive version would.
+@cindex Perl-style regular expressions, single line
+@cindex Perl-style regular expressions, multiline
+The newline character is never treated in any special way in
+character classes, whatever the setting of the @code{S} and
+@code{M} options (modifiers) is.  A class such as @code{[^a]} will
+always match a newline.
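The caseless and newline behavior of classes can be sketched as follows. The sketch assumes that Python's `re.IGNORECASE` flag corresponds to the `I` modifier.

```python
import re

# Assumption: re.IGNORECASE corresponds to the I modifier described above.
caseless_class = re.fullmatch(r"[aeiou]", "A", re.IGNORECASE)     # matches
caseless_negated = re.fullmatch(r"[^aeiou]", "A", re.IGNORECASE)  # no match
caseful_negated = re.fullmatch(r"[^aeiou]", "A")                  # matches

# Newline gets no special treatment inside a class.
newline_in_negated = re.fullmatch(r"[^a]", "\n")
```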
+The minus (hyphen) character can be used to specify a range
+of characters in a character class.  For example, @code{[d-m]}
+matches any letter between d and m, inclusive.  If a minus
+character is required in a class, it must be escaped with a
+backslash or appear in a position where it cannot be interpreted
+as indicating a range, typically as the first or last
+character in the class.
+It is not possible to have the literal character @code{]} as the
+end character of a range.  A pattern such as @code{[W-]46]} is
+interpreted as a class of two characters (@code{W} and @code{-})
+followed by a literal string @code{46]}, so it would match
+@samp{W46]} or @samp{-46]}. However, if the @code{]} is escaped
+with a backslash it is interpreted as the end of range, so
+@code{[W-\]46]} is interpreted as a single class containing a
+range followed by two separate characters. The octal or
+hexadecimal representation of @code{]} can also be used to end a range.
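The two bracket patterns discussed above parse the same way in Python's `re` module, which is used here as an assumed stand-in for the Perl-mode matcher.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
two_char_class = re.compile(r"[W-]46]")  # class {W, -} followed by literal "46]"
escaped_range = re.compile(r"[W-\]46]")  # range W..] plus the characters 4 and 6

literal_tail_w = two_char_class.fullmatch("W46]")
literal_tail_dash = two_char_class.fullmatch("-46]")

in_range = escaped_range.fullmatch("Z")         # Z lies between W and ]
extra_member = escaped_range.fullmatch("4")
not_a_sequence = escaped_range.fullmatch("W46]")  # a class matches one character
dash_excluded = escaped_range.fullmatch("-")
```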
+Ranges operate in @sc{ascii} collating sequence. They can also be
+used for characters specified numerically, for example
+@code{[\000-\037]}. If a range that includes letters is used when
+caseless matching is set, it matches the letters in either
+case. For example, a caseless @code{[W-c]} is equivalent to
+@code{[][\^_`wxyzabc]}, matched caselessly, and if character
+tables for the French locale are in use, @code{[\xc8-\xcb]}
+matches accented E characters in both cases.
+Unlike in @sc{posix} mode, the character types @code{\d},
+@code{\D}, @code{\s}, @code{\S}, @code{\w}, and @code{\W}
+may also appear in a character class, and add the characters
+that they match to the class. For example, @code{[\dABCDEF]} matches any
+hexadecimal digit.  A circumflex can conveniently be used
+with the upper case character types to specify a more restricted
+set of characters than the matching lower case type.
+For example, the class @code{[^\W_]} matches any letter or digit,
+but not underscore.
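A small sketch of character types inside classes follows, again assuming Python's `re` module as a stand-in for the Perl-mode matcher.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
hex_digit = re.compile(r"[\dABCDEF]")  # \d adds the digits to the class
letter_or_digit = re.compile(r"[^\W_]")  # word characters minus underscore

hex_results = [bool(hex_digit.fullmatch(c)) for c in "7CG"]
word_results = [bool(letter_or_digit.fullmatch(c)) for c in "a5_-"]
```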
+All non-alphanumeric characters other than @code{\}, @code{-},
+@code{^} (at the start) and the terminating @code{]}
+are non-special in character classes, but it does no harm
+if they are escaped.
+Perl 5.6 supports the @sc{posix} notation for character classes, which
+uses names enclosed by @code{[:} and @code{:]} within the enclosing
+square brackets, and @value{SSED} supports this notation as well.
+For example,
+@example
+     [01[:alpha:]%]
+@end example
+@noindent
+matches @samp{0}, @samp{1}, any alphabetic character, or @samp{%}.
+The supported class names are
+@table @code
+@item alnum
+Matches letters and digits
+@item alpha
+Matches letters
+@item ascii
+Matches character codes 0 - 127
+@item cntrl
+Matches control characters
+@item digit
+Matches decimal digits (same as \d)
+@item graph
+Matches printing characters, excluding space
+@item lower
+Matches lower case letters
+@item print
+Matches printing characters, including space
+@item punct
+Matches printing characters, excluding letters and digits
+@item space
+Matches white space (same as \s)
+@item upper
+Matches upper case letters
+@item word
+Matches ``word'' characters (same as \w)
+@item xdigit
+Matches hexadecimal digits
+@end table
+The names @code{ascii} and @code{word} are extensions valid only in
+Perl mode.  Another Perl extension is negation, which is
+indicated by a circumflex character after the colon. For example,
+@example
+     [12[:^digit:]]
+@end example
+@noindent
+matches @samp{1}, @samp{2}, or any non-digit.
+@node Options setting
+@appendixsec Options setting
+@cindex Perl-style regular expressions, toggling options
+@cindex Perl-style regular expressions, case-insensitive
+@cindex Perl-style regular expressions, multiline
+@cindex Perl-style regular expressions, single line
+@cindex Perl-style regular expressions, extended
+The settings of the @code{I}, @code{M}, @code{S}, @code{X}
+modifiers can be changed from within the pattern by
+a sequence of Perl option letters enclosed between @code{(?}
+and @code{)}. The option letters must be lowercase.
+For example, @code{(?im)} sets caseless, multiline matching. It is
+also possible to unset these options by preceding the letter
+with a hyphen; you can also have combined settings and unsettings:
+@code{(?im-sx)} sets caseless and multiline matching,
+while unsetting single line matching (for dots) and extended
+whitespace interpretation.  If a letter appears both before
+and after the hyphen, the option is unset.
+The scope of these option changes depends on where in the
+pattern the setting occurs. For settings that are outside
+any subpattern (defined below), the effect is the same as if
+the options were set or unset at the start of matching. The
+following patterns all behave in exactly the same way:
+@example
+     (?i)abc
+     a(?i)bc
+     ab(?i)c
+     abc(?i)
+@end example
+which in turn is the same as specifying the pattern abc with
+the @code{I} modifier.  In other words, ``top level'' settings
+apply to the whole pattern (unless there are other
+changes inside subpatterns). If there is more than one setting
+of the same option at top level, the rightmost setting
+is used.
+If an option change occurs inside a subpattern, the effect
+is different.  This is a change of behaviour in Perl 5.005.
+An option change inside a subpattern affects only that part
+of the subpattern @emph{that follows} it, so
+@example
+     (a(?i)b)c
+@end example
+@noindent
+matches @samp{abc} and @samp{aBc} and no other strings (assuming
+case-sensitive matching is used).  By this means, options can
+be made to have different settings in different parts of the
+pattern.  Any changes made in one alternative do carry on
+into subsequent branches within the same subpattern.  For
+example,
+@example
+     (a(?i)b|c)
+@end example
+@noindent
+matches @samp{ab}, @samp{aB}, @samp{c}, and @samp{C},
+even though when matching @samp{C} the first branch is
+abandoned before the option setting.
+This is because the effects of option settings happen at
+compile time. There would be some very weird behaviour otherwise.
+@ignore
+There are two PCRE-specific options PCRE_UNGREEDY and PCRE_EXTRA
+that can be changed in the same way as the Perl-compatible options by
+using the characters U and X respectively.  The (?X) flag
+setting is special in that it must always occur earlier in
+the pattern than any of the additional features it turns on,
+even when it is at top level. It is best put at the start.
+@end ignore
+@node Non-capturing subpatterns
+@appendixsec Non-capturing subpatterns
+@cindex Perl-style regular expressions, non-capturing subpatterns
+Marking part of a pattern as a subpattern does two things.
+On one hand, it localizes a set of alternatives; on the other
+hand, it sets up the subpattern as a capturing subpattern (as
+defined above).  The subpattern can be backreferenced and
+referenced in the right side of @code{s} commands.
+For example, if the string @samp{the red king} is matched against
+the pattern
+@example
+     the ((red|white) (king|queen))
+@end example
+@noindent
+the captured substrings are @samp{red king}, @samp{red},
+and @samp{king}, and are numbered 1, 2, and 3.
+The fact that plain parentheses fulfil two functions is not
+always helpful.  There are often times when a grouping
+subpattern is required without a capturing requirement.  If an
+opening parenthesis is followed by @code{?:}, the subpattern does
+not do any capturing, and is not counted when computing the
+number of any subsequent capturing subpatterns. For example,
+if the string @samp{the white queen} is matched against the pattern
+@example
+     the ((?:red|white) (king|queen))
+@end example
+@noindent
+the captured substrings are @samp{white queen} and @samp{queen},
+and are numbered 1 and 2. The maximum number of captured
+substrings is 99, while the maximum number of all subpatterns,
+both capturing and non-capturing, is 200.
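The numbering of captured substrings in the two patterns above can be checked with a sketch. Python's `re` module is assumed as a stand-in for the Perl-mode matcher.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
capturing = re.fullmatch(r"the ((red|white) (king|queen))", "the red king")
non_capturing = re.fullmatch(r"the ((?:red|white) (king|queen))", "the white queen")

capturing_groups = capturing.groups()          # groups 1, 2, 3
non_capturing_groups = non_capturing.groups()  # (?:...) is skipped when numbering
```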
+As a convenient shorthand, if any option settings are
+required at the start of a non-capturing subpattern, the
+option letters may appear between the @code{?} and the
+@code{:}.  Thus the two patterns
+@example
+   (?i:saturday|sunday)
+   (?:(?i)saturday|sunday)
+@end example
+@noindent
+match exactly the same set of strings.  Because alternative
+branches are tried from left to right, and options are not
+reset until the end of the subpattern is reached, an option
+setting in one branch does affect subsequent branches, so
+the above patterns match @samp{SUNDAY} as well as @samp{Saturday}.
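The shorthand form can be sketched with Python's `re` module, used here as an assumed stand-in for the Perl-mode matcher. Only the `(?i:...)` spelling is shown, because recent Python versions reject an option setting that appears in the middle of a pattern.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
weekend = re.compile(r"(?i:saturday|sunday)")

# The option applies to every branch of the group, not just the first.
first_branch = weekend.fullmatch("Saturday")
second_branch = weekend.fullmatch("SUNDAY")
other_day = weekend.fullmatch("Monday")
```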
+@node Repetition
+@appendixsec Repetition
+@cindex Perl-style regular expressions, repetitions
+Repetition is specified by quantifiers, which can follow any
+of the following items:
+@itemize @bullet
+@item
+a single character, possibly escaped
+@item
+the @code{.} special character
+@item
+a character class
+@item
+a back reference (see next section)
+@item
+a parenthesized subpattern (unless it is an assertion; @pxref{Assertions})
+@end itemize
+The general repetition quantifier specifies a minimum and
+maximum number of permitted matches, by giving the two
+numbers in curly brackets (braces), separated by a comma.
+The numbers must be less than 65536, and the first must be
+less than or equal to the second. For example:
+@example
+     z@{2,4@}
+@end example
+@noindent
+matches @samp{zz}, @samp{zzz}, or @samp{zzzz}. A closing brace on its own
+is not a special character. If the second number is omitted,
+but the comma is present, there is no upper limit; if the
+second number and the comma are both omitted, the quantifier
+specifies an exact number of required matches. Thus
+@example
+     [aeiou]@{3,@}
+@end example
+@noindent
+matches at least 3 successive vowels, but may match many
+more, while
+@example
+     \d@{8@}
+@end example
+@noindent
+matches exactly 8 digits.  An opening curly bracket that
+appears in a position where a quantifier is not allowed, or
+one that does not match the syntax of a quantifier, is taken
+as a literal character. For example, @{,6@} is not a quantifier,
+but a literal string of four characters.@footnote{It
+raises an error if @option{-R} is not used.}
+The quantifier @samp{@{0@}} is permitted, causing the expression to
+behave as if the previous item and the quantifier were not
+present.
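The general quantifier forms above behave as follows in a short sketch. It assumes Python's `re` module as a stand-in for the Perl-mode matcher. The `{,6}` case is left out because Python treats it as a quantifier.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
two_to_four = re.compile(r"z{2,4}")
three_or_more = re.compile(r"[aeiou]{3,}")
exactly_eight = re.compile(r"\d{8}")
zero_times = re.compile(r"ab{0}c")  # behaves as if "b{0}" were absent

z_results = [bool(two_to_four.fullmatch("z" * n)) for n in range(1, 6)]
vowel_results = [bool(three_or_more.fullmatch(s)) for s in ("ae", "aei", "aeiou")]
digit_results = [bool(exactly_eight.fullmatch(s)) for s in ("1234567", "12345678")]
zero_results = [bool(zero_times.fullmatch(s)) for s in ("ac", "abc")]
```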
+For convenience (and historical compatibility) the three
+most common quantifiers have single-character abbreviations:
+@table @code
+@item *
+is equivalent to @{0,@}
+@item +
+is equivalent to @{1,@}
+@item ?
+is equivalent to @{0,1@}
+@end table
+It is possible to construct infinite loops by following a
+subpattern that can match no characters with a quantifier
+that has no upper limit, for example:
+@example
+     (a?)*
+@end example
+Earlier versions of Perl used to give an error at
+compile time for such patterns. However, because there are
+cases where this can be useful, such patterns are now
+accepted, but if any repetition of the subpattern does in
+fact match no characters, the loop is forcibly broken.
+@cindex Greedy regular expression matching
+@cindex Perl-style regular expressions, stingy repetitions
+By default, the quantifiers are @dfn{greedy} like in @sc{posix}
+mode, that is, they match as much as possible (up to the maximum
+number of permitted times), without causing the rest of the
+pattern to fail. The classic example of where this gives problems
+is in trying to match comments in C programs. These appear between
+the sequences @code{/*} and @code{*/} and within the sequence, individual
+@code{*} and @code{/} characters may appear. An attempt to match C
+comments by applying the pattern
+@example
+     /\*.*\*/
+@end example
+@noindent
+to the string
+@example
+     /* first comment */ not comment /* second comment */
+@end example
+@noindent
+fails, because it matches the entire string owing to the
+greediness of the @code{.*} item.
+However, if a quantifier is followed by a question mark, it
+ceases to be greedy, and instead matches the minimum number
+of times possible, so the pattern @code{/\*.*?\*/}
+does the right thing with the C comments. The meaning of the
+various quantifiers is not otherwise changed, just the preferred
+number of matches.  Do not confuse this use of question
+mark with its use as a quantifier in its own right.
+Because it has two uses, it can sometimes appear doubled, as in
+@example
+     \d??\d
+@end example
+which matches one digit by preference, but can match two if
+that is the only way the rest of the pattern matches.
+Note that greediness does not matter when specifying addresses,
+but can be nevertheless used to improve performance.
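The difference between greedy and non-greedy matching can be shown with the C comment example. The sketch assumes Python's `re` module as a stand-in for the Perl-mode matcher, and the subject string is illustrative.

```python
import re

# Assumption: Python's re module is a stand-in for the Perl-mode matcher.
subject = "/* first comment */ not comment /* second comment */"

greedy = re.findall(r"/\*.*\*/", subject)  # .* runs to the last "*/"
lazy = re.findall(r"/\*.*?\*/", subject)   # .*? stops at the first "*/"

# A doubled question mark: an optional digit that prefers to be absent.
one_digit = re.match(r"\d??\d", "12").group()
two_digits = re.fullmatch(r"\d??\d", "12").group()  # forced to take both
```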
+@ignore
+   If the PCRE_UNGREEDY option is set (an option which is not
+   available in Perl), the quantifiers are not greedy by
+   default, but individual ones can be made greedy by following
+   them with a question mark. In other words, it inverts the
+   default behaviour.
+@end ignore
+When a parenthesized subpattern is quantified with a minimum
+repeat count that is greater than 1 or with a limited maximum,
+more store is required for the compiled pattern, in
+proportion to the size of the minimum or maximum.
+@cindex Perl-style regular expressions, single line
+If a pattern starts with @code{.*} or @code{.@{0,@}} and the
+@code{S} modifier is used, the pattern is implicitly anchored,
+because whatever follows will be tried against every character
+position in the subject string, so there is no point in
+retrying the overall match at any position after the first.
+PCRE treats such a pattern as though it were preceded by @code{\A}.
+When a capturing subpattern is repeated, the value captured
+is the substring that matched the final iteration. For example,
+after
+@example
+     (tweedle[dume]@{3@}\s*)+
+@end example
+@noindent
+has matched @samp{tweedledum tweedledee} the value of the
+captured substring is @samp{tweedledee}.  However, if there are
+nested capturing subpatterns, the corresponding captured
+values may have been set in previous iterations. For example,
+after
+@example
+     /(a|(b))+/
+@end example
+matches @samp{aba}, the value of the second captured substring is
+@samp{b}.
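The rule for repeated capturing subpatterns can be checked with a short experiment. This sketch assumes Python's `re` module, whose handling of these two patterns happens to agree with the description above; other engines may differ, so treat it as an illustration rather than a specification.

```python
import re

# The outer group is repeated twice; group 1 keeps only the last iteration.
m = re.match(r"(tweedle[dume]{3}\s*)+", "tweedledum tweedledee")
last_iteration = m.group(1)

# The nested group (b) takes part only in the second iteration, yet its
# value is still available after the third iteration has matched "a".
n = re.match(r"(a|(b))+", "aba")
outer, inner = n.group(1), n.group(2)

print(last_iteration, outer, inner)
```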
+@node Backreferences
+@appendixsec Backreferences
+@cindex Perl-style regular expressions, backreferences
+Outside a character class, a backslash followed by a digit
+greater than 0 (and possibly further digits) is a back
+reference to a capturing subpattern earlier (i.e.  to its
+left) in the pattern, provided there have been that many
+previous capturing left parentheses.
+However, if the decimal number following the backslash is
+less than 10, it is always taken as a back reference, and
+causes an error only if there are not that many capturing
+left parentheses in the entire pattern. In other words, the
+parentheses that are referenced need not be to the left of
+the reference for numbers less than 10. @xref{Backslash},
+for further details of the handling of digits following a backslash.
+A back reference matches whatever actually matched the capturing
+subpattern in the current subject string, rather than
+anything matching the subpattern itself. So the pattern
+@example
+     (sens|respons)e and \1ibility
+@end example
+@noindent
+matches @samp{sense and sensibility} and @samp{response and responsibility},
+but not @samp{sense and responsibility}. If caseful
+matching is in force at the time of the back reference, the
+case of letters is relevant. For example,
+@example
+     ((?i)blah)\s+\1
+@end example
+@noindent
+matches @samp{blah blah} and @samp{Blah Blah}, but not
+@samp{BLAH blah}, even though the original capturing
+subpattern is matched caselessly.
+There may be more than one back reference to the same subpattern.
+Also, if a subpattern has not actually been used in a
+particular match, any back references to it always fail. For
+example, the pattern
+@example
+     (a|(bc))\2
+@end example
+@noindent
+always fails if it starts to match @samp{a} rather than
+@samp{bc}.  Because there may be up to 99 back references, all
+digits following the backslash are taken as part of a potential
+back reference number; this is different from what happens
+in @sc{posix} mode. If the pattern continues with a digit
+character, some delimiter must be used to terminate the back
+reference.  If the @code{X} modifier option is set, this can be
+whitespace.  Otherwise an empty comment can be used, or the
+following character can be expressed in hexadecimal or octal.
+A back reference that occurs inside the parentheses to which
+it refers fails when the subpattern is first used, so, for
+example, @code{(a\1)} never matches.  However, such references
+can be useful inside repeated subpatterns. For example, the
+pattern
+@example
+     (a|b\1)+
+@end example
+@noindent
+matches any number of @samp{a}s and also @samp{aba}, @samp{ababbaa},
+etc. At each iteration of the subpattern, the back reference matches
+the character string corresponding to the previous iteration.  In
+order for this to work, the pattern must be such that the first
+iteration does not need to match the back reference.  This can be
+done using alternation, as in the example above, or by a
+quantifier with a minimum of zero.
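Two of the back reference rules above are easy to verify by hand: a back reference matches the text that was captured, not the subpattern, and a reference to a group that took no part in the match fails. The sketch below assumes Python's `re` module as the engine; the helper name `hits` is hypothetical.

```python
import re

def hits(pattern, subject):
    """Return True if the whole subject matches the pattern."""
    return re.fullmatch(pattern, subject) is not None

austen = r"(sens|respons)e and \1ibility"
same_stem_1 = hits(austen, "sense and sensibility")
same_stem_2 = hits(austen, "response and responsibility")
mixed_stems = hits(austen, "sense and responsibility")

# Group 2 is set only when the "bc" alternative is taken.
unset_group = hits(r"(a|(bc))\2", "a")
set_group = hits(r"(a|(bc))\2", "bcbc")

print(same_stem_1, same_stem_2, mixed_stems, unset_group, set_group)
```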
+@node Assertions
+@appendixsec Assertions
+@cindex Perl-style regular expressions, assertions
+@cindex Perl-style regular expressions, asserting subpatterns
+An assertion is a test on the characters following or
+preceding the current matching point that does not actually
+consume any characters. The simple assertions coded as @code{\b},
+@code{\B}, @code{\A}, @code{\Z}, @code{\z}, @code{^} and @code{$}
+are described above. More complicated assertions are coded as
+subpatterns.  There are two kinds: those that look ahead of the
+current position in the subject string, and those that look behind it.
+@cindex Perl-style regular expressions, lookahead subpatterns
+An assertion subpattern is matched in the normal way, except
+that it does not cause the current matching position to be
+changed. Lookahead assertions start with @code{(?=} for positive
+assertions and @code{(?!} for negative assertions. For example,
+@example
+     \w+(?=;)
+@end example
+@noindent
+matches a word followed by a semicolon, but does not include
+the semicolon in the match, and
+@example
+     foo(?!bar)
+@end example
+@noindent
+matches any occurrence of @samp{foo} that is not followed by
+@samp{bar}.
+Note that the apparently similar pattern
+@example
+     (?!foo)bar
+@end example
+@noindent
+@cindex Perl-style regular expressions, lookbehind subpatterns
+finds any occurrence of @samp{bar} even if it is preceded by
+@samp{foo}, because the assertion @code{(?!foo)} is always true
+when the next three characters are @samp{bar}. A lookbehind
+assertion is needed to achieve this effect.
+Lookbehind assertions start with @code{(?<=} for positive
+assertions and @code{(?<!} for negative assertions. So,
+@example
+     (?<!foo)bar
+@end example
+achieves the required effect of finding an occurrence of
+@samp{bar} that is not preceded by @samp{foo}. The contents of a
+lookbehind assertion are restricted
+such that all the strings it matches must have a fixed
+length.  However, if there are several alternatives, they do
+not all have to have the same fixed length.  This is an extension
+compared with Perl 5.005, which requires all branches to match
+the same length of string. Thus
+@example
+     (?<=dogs|cats|)
+@end example
+@noindent
+is permitted, but the apparently equivalent regular expression
+@example
+     (?<!dogs?|cats?)
+@end example
+@noindent
+causes an error at compile time. Branches that match different
+length strings are permitted only at the top level of
+a lookbehind assertion: an assertion such as
+@example
+     (?<=ab(c|de))
+@end example
+@noindent
+is not permitted, because its single top-level branch can
+match two different lengths, but it is acceptable if rewritten
+to use two top-level branches:
+@example
+     (?<=abc|abde)
+@end example
+All this is required because lookbehind assertions simply
+move the current position back by the alternative's fixed
+width and then try to match.  If there are
+insufficient characters before the current position, the
+match is deemed to fail.  Lookbehinds, in conjunction with
+non-backtracking subpatterns, can be particularly useful for
+matching at the ends of strings; an example is given at the end
+of the section on non-backtracking subpatterns.
+Several assertions (of any sort) may occur in succession.
+For example,
+@example
+     (?<=\d@{3@})(?<!999)foo
+@end example
+@noindent
+matches @samp{foo} preceded by three digits that are not @samp{999}.
+Notice that each of the assertions is applied independently
+at the same point in the subject string. First there is a
+check that the previous three characters are all digits, and
+then there is a check that the same three characters are not
+@samp{999}.  This pattern does not match @samp{foo} preceded by six
+characters, the first three of which are digits and the last three
+of which are not @samp{999}.  For example, it doesn't match
+@samp{123abcfoo}. A pattern to do that is
+@example
+     (?<=\d@{3@}...)(?<!999)foo
+@end example
+@noindent
+This time the first assertion looks at the preceding six
+characters, checking that the first three are digits, and
+then the second assertion checks that the preceding three
+characters are not @samp{999}.  Actually, assertions can be
+nested in any combination, so one can write this as
+@example
+     (?<=\d@{3@}(?!999)...)foo
+@end example
+or
+@example
+     (?<=\d@{3@}...(?<!999))foo
+@end example
+@noindent
+both of which might be considered more readable.
+Assertion subpatterns are not capturing subpatterns, and may
+not be repeated, because it makes no sense to assert the
+same thing several times. If any kind of assertion contains
+capturing subpatterns within it, these are counted for the
+purposes of numbering the capturing subpatterns in the whole
+pattern.  However, substring capturing is carried out only
+for positive assertions, because it does not make sense for
+negative assertions.
+Assertions count towards the maximum of 200 parenthesized
+subpatterns.
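The lookahead and lookbehind examples from this section can be exercised as follows. This is a sketch that assumes Python's `re` module, which accepts the same syntax for these particular patterns but is stricter about lookbehind lengths than the text above, so only single-length lookbehinds are used. The helper `found` is a hypothetical name.

```python
import re

def found(pattern, subject):
    """Return True if the pattern matches anywhere in the subject."""
    return re.search(pattern, subject) is not None

# Positive lookahead: the semicolon is required but not consumed.
word = re.search(r"\w+(?=;)", "int x;").group()

# Negative lookahead: the first "foo" is rejected, the second accepted.
foo_at = re.search(r"foo(?!bar)", "foobar foobaz").start()

# (?!foo)bar looks ahead, so it does not exclude a preceding "foo" ...
lookahead_bar = found(r"(?!foo)bar", "foobar")
# ... whereas the lookbehind form does.
lookbehind_bar = found(r"(?<!foo)bar", "foobar")

# Two assertions applied at the same point in the subject.
three_digits = r"(?<=\d{3})(?<!999)foo"
six_chars = r"(?<=\d{3}...)(?<!999)foo"

print(word, foo_at, lookahead_bar, lookbehind_bar)
print(found(three_digits, "123foo"), found(six_chars, "123abcfoo"))
```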
+@node Non-backtracking subpatterns
+@appendixsec Non-backtracking subpatterns
+@cindex Perl-style regular expressions, non-backtracking subpatterns
+With both maximizing and minimizing repetition, failure of
+what follows normally causes the repeated item to be evaluated
+again to see if a different number of repeats allows the
+rest of the pattern to match. Sometimes it is useful to
+prevent this, either to change the nature of the match, or
+to cause it to fail earlier than it otherwise might, when the
+author of the pattern knows there is no point in carrying
+on.
+Consider, for example, the pattern @code{\d+foo} when applied to
+the subject line
+@example
+     123456bar
+@end example
+After matching all 6 digits and then failing to match @samp{foo},
+the normal action of the matcher is to try again with only 5
+digits matching the @code{\d+} item, and then with 4, and so on,
+before ultimately failing. Non-backtracking subpatterns
+provide the means for specifying that once a portion of the
+pattern has matched, it is not to be re-evaluated in this way,
+so the matcher would give up immediately on failing to match
+@samp{foo} the first time.  The notation is another kind of special
+parenthesis, starting with @code{(?>} as in this example:
+@example
+     (?>\d+)foo
+@end example
+This kind of parenthesis ``locks up'' the part of the pattern
+it contains once it has matched, and a failure further into
+the pattern is prevented from backtracking into it.
+Backtracking past it to previous items, however, works as
+normal.
+Non-backtracking subpatterns are not capturing subpatterns.  Simple
+cases such as the above example can be thought of as a maximizing
+repeat that must swallow everything it can.  So,
+while both @code{\d+} and @code{\d+?} are prepared to adjust the number of
+digits they match in order to make the rest of the pattern
+match, @code{(?>\d+)} can only match an entire sequence of digits.
+This construction can of course contain arbitrarily complicated
+subpatterns, and it can be nested.
+@cindex Perl-style regular expressions, lookbehind subpatterns
+Non-backtracking subpatterns can be used in conjunction with lookbehind
+assertions to specify efficient matching at the end
+of the subject string. Consider a simple pattern such as
+@example
+     abcd$
+@end example
+@noindent
+when applied to a long string which does not match.  Because
+matching proceeds from left to right, @command{sed} will look for
+each @samp{a} in the subject and then see if what follows matches
+the rest of the pattern. If the pattern is specified as
+@example
+     ^.*abcd$
+@end example
+@noindent
+the initial @code{.*} matches the entire string at first, but when
+this fails (because there is no following @samp{a}), it backtracks
+to match all but the last character, then all but the
+last two characters, and so on. Once again the search for
+@samp{a} covers the entire string, from right to left, so we are
+no better off. However, if the pattern is written as
+@example
+     ^(?>.*)(?<=abcd)
+@end example
+there can be no backtracking for the @code{.*} item; it can match
+only the entire string. The subsequent lookbehind assertion
+does a single test on the last four characters. If it fails,
+the match fails immediately. For long strings, this approach
+makes a significant difference to the processing time.
+When a pattern contains an unlimited repeat inside a subpattern
+that can itself be repeated an unlimited number of
+times, the use of a non-backtracking subpattern is the only way to
+avoid some failing matches taking a very long time
+indeed.@footnote{Actually, the matcher embedded in @value{SSED}
+    tries to do something for this in the simplest cases,
+    like @code{([^b]*b)*}.  These cases are actually quite
+    common: they happen for example in a regular expression
+    like @code{\/\*([^*]*\*)*\/} which matches C comments.}
+The pattern
+@example
+     (\D+|<\d+>)*[!?]
+@end example
+([^0-9<]+<(\d+>)?)*[!?]
+@noindent
+matches an unlimited number of substrings that either consist
+of non-digits, or digits enclosed in angle brackets, followed by
+an exclamation or question mark. When it matches, it runs quickly.
+However, if it is applied to
+@example
+     aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
+@end example
+@noindent
+it takes a long time before reporting failure.  This is
+because the string can be divided between the two repeats in
+a large number of ways, and all have to be tried.@footnote{The
+example used @code{[!?]} rather than a single character at the end,
+because both @value{SSED} and Perl have an optimization that allows
+for fast failure when a single character is used. They
+remember the last single character that is required for a
+match, and fail early if it is not present in the string.}
+If the pattern is changed to
+@example
+     ((?>\D+)|<\d+>)*[!?]
+@end example
+sequences of non-digits cannot be broken, and failure happens
+quickly.
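How a non-backtracking subpattern changes the outcome of a match can be seen in a small case where ordinary backtracking would succeed. The sketch below assumes Python's `re` module, which accepts the `(?>...)` syntax only from Python 3.11 onwards; the version guard and the helper `matches` are illustrative choices, not part of the documented matcher.

```python
import re
import sys

# Atomic (non-backtracking) groups were added to Python's re in 3.11.
ATOMIC_SUPPORTED = sys.version_info >= (3, 11)

def matches(pattern, subject):
    """Return True if the pattern matches anywhere in the subject."""
    return re.search(pattern, subject) is not None

# Ordinary repeat: \d+ first takes "12345", then gives back the "5".
plain = matches(r"\d+5", "12345")

if ATOMIC_SUPPORTED:
    # Atomic repeat: the digits stay locked up, so no "5" is left over.
    atomic = matches(r"(?>\d+)5", "12345")
    # End-of-string test from the text: one lookbehind check at the end.
    ends_with_abcd = matches(r"^(?>.*)(?<=abcd)", "xxxxabcd")
    not_at_end = matches(r"^(?>.*)(?<=abcd)", "abcdxxxx")
    print(plain, atomic, ends_with_abcd, not_at_end)
else:
    print(plain, "atomic groups need Python 3.11 or later")
```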
+@node Conditional subpatterns
+@appendixsec Conditional subpatterns
+@cindex Perl-style regular expressions, conditional subpatterns
+It is possible to cause the matching process to obey a subpattern
+conditionally or to choose between two alternative
+subpatterns, depending on the result of an assertion, or
+whether a previous capturing subpattern matched or not. The
+two possible forms of conditional subpattern are
+@example
+     (?(@var{condition})@var{yes-pattern})
+     (?(@var{condition})@var{yes-pattern}|@var{no-pattern})
+@end example
+If the condition is satisfied, the yes-pattern is used; otherwise
+the no-pattern (if present) is used. If there are more than two
+alternatives in the subpattern, a compile-time error occurs.
+There are two kinds of condition. If the text between the
+parentheses consists of a sequence of digits, the condition
+is satisfied if the capturing subpattern of that number has
+previously matched.  The number must be greater than zero.
+Consider the following pattern, which contains non-significant
+white space to make it more readable (assume the @code{X} modifier)
+and to divide it into three parts for ease of discussion:
+@example
+     ( \( )?   [^()]+   (?(1) \) )
+@end example
+The first part matches an optional opening parenthesis, and
+if that character is present, sets it as the first captured
+substring. The second part matches one or more characters
+that are not parentheses. The third part is a conditional
+subpattern that tests whether the first set of parentheses
+matched or not.  If they did, that is, if the subject started
+with an opening parenthesis, the condition is true, and so
+the yes-pattern is executed and a closing parenthesis is
+required. Otherwise, since no-pattern is not present, the
+subpattern matches nothing.  In other words, this pattern
+matches a sequence of non-parentheses, optionally enclosed
+in parentheses.
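The optional-parenthesis pattern above can be tested directly. This sketch assumes Python's `re` module, which supports conditions on a group number (it is used here without the non-significant white space); the name `optional_parens` is illustrative.

```python
import re

# Group 1 is the optional "("; the conditional demands ")" only if it matched.
optional_parens = re.compile(r"(\()?[^()]+(?(1)\))")

def whole(subject):
    """Return True if the entire subject matches the pattern."""
    return optional_parens.fullmatch(subject) is not None

print(whole("(abc)"), whole("abc"), whole("(abc"), whole("abc)"))
```

Only the first two subjects match as a whole: an unbalanced parenthesis on either side is rejected.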
+@cindex Perl-style regular expressions, lookahead subpatterns
+If the condition is not a sequence of digits, it must be an
+assertion.  This may be a positive or negative lookahead or
+lookbehind assertion. Consider this pattern, again containing
+non-significant white space, and with the two alternatives
+on the second and third lines:
+@example
+     (?(?=...[a-z])
+        \d\d-[a-z]@{3@}-\d\d |
+        \d\d-\d\d-\d\d )
+@end example
+The condition is a positive lookahead assertion that matches
+a letter that is three characters away from the current point.
+If a letter is found, the subject is matched against the first
+alternative @samp{@var{dd}-@var{aaa}-@var{dd}} (where @var{aaa} are
+letters and @var{dd} are digits); otherwise it is matched against
+the second alternative, @samp{@var{dd}-@var{dd}-@var{dd}}.
+@node Recursive patterns
+@appendixsec Recursive patterns
+@cindex Perl-style regular expressions, recursive patterns
+@cindex Perl-style regular expressions, recursion
+Consider the problem of matching a string in parentheses,
+allowing for unlimited nested parentheses. Without the use
+of recursion, the best that can be done is to use a pattern
+that matches up to some fixed depth of nesting. It is not
+possible to handle an arbitrary nesting depth. Perl 5.6 has
+provided an experimental facility that allows regular
+expressions to recurse (amongst other things). It does this
+by interpolating Perl code in the expression at run time,
+and the code can refer to the expression itself. A Perl pattern
+to solve the parentheses problem can be created like
+this:
+@example
+     $re = qr@{\( (?: (?>[^()]+) | (?p@{$re@}) )* \)@}x;
+@end example
+The @code{(?p@{...@})} item interpolates Perl code at run time,
+and in this case refers recursively to the pattern in which it
+appears. Obviously, @command{sed} cannot support the interpolation of
+Perl code.  Instead, the special item @code{(?R)} is provided for
+the specific case of recursion. This pattern solves the
+parentheses problem (assume the @code{X} modifier option is used
+so that white space is ignored):
+@example
+     \( ( (?>[^()]+) | (?R) )* \)
+@end example
+First it matches an opening parenthesis. Then it matches any
+number of substrings which can either be a sequence of
+non-parentheses, or a recursive match of the pattern itself
+(i.e. a correctly parenthesized substring). Finally there is
+a closing parenthesis.
+This particular example pattern contains nested unlimited
+repeats, and so the use of a non-backtracking subpattern for
+matching strings of non-parentheses is important when applying
+the pattern to strings that do not match. For example, when
+it is applied to
+@example
+     (aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa()
+@end example
+it yields a ``no match'' response quickly. However, if a
+non-backtracking subpattern is not used, the match runs
+for a very long time indeed because there are so many different
+ways the @code{+} and @code{*} repeats can carve up the subject,
+and all have to be tested before failure can be reported.
+The values set for any capturing subpatterns are those from
+the outermost level of the recursion at which the subpattern
+value is set. If the pattern above is matched against
+@example
+     (ab(cd)ef)
+@end example
+@noindent
+the value for the capturing parentheses is @samp{ef}, which is
+the last value taken on at the top level.
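What the recursive pattern does can be spelled out as an ordinary recursive function. Python's `re` module has no `(?R)` item, so the sketch below is a hand-written matcher that mirrors the structure of the pattern (opening parenthesis, any mix of non-parentheses and nested groups, closing parenthesis). It is not how any regex engine implements recursion, and the function name `match_group` is hypothetical.

```python
def match_group(s, i=0):
    """Return the index just past the balanced group starting at s[i],
    or None if no balanced group starts there."""
    if i >= len(s) or s[i] != "(":
        return None
    i += 1                          # the leading \(
    while i < len(s) and s[i] != ")":
        if s[i] == "(":
            i = match_group(s, i)   # the (?R) step: recurse on a nested group
            if i is None:
                return None
        else:
            i += 1                  # one character of [^()]+
    # The trailing \) must be present.
    return i + 1 if i < len(s) else None

print(match_group("(ab(cd)ef)"))
print(match_group("(aaaaaaaa()"))
```

Because each character is visited once, the unbalanced subject is rejected without the repeated re-partitioning that makes the backtracking version slow.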
+@node Comments
+@appendixsec Comments
+@cindex Perl-style regular expressions, comments
+The sequence @code{(?#} marks the start of a comment which continues
+up to the next closing parenthesis. Nested parentheses
+are not permitted. The characters that make up a comment
+play no part in the pattern matching at all.
+@cindex Perl-style regular expressions, extended
+If the @code{X} modifier option is used, an unescaped @code{#} character
+outside a character class introduces a comment that continues
+up to the next newline character in the pattern.
+@end ifset
+@page
+@node GNU Free Documentation License
+@appendix GNU Free Documentation License
+@include fdl.texi

trunk/src/sed/doc/sed.x

-              r599
+              r3613
+.SH NAME
 sed \- a Stream EDitor
+.SH SYNOPSIS
+[NAME]
+sed \- stream editor for filtering and transforming text
+[SYNOPSIS]
 .nf
 sed [-V] [--version] [--help] [-n] [--quiet] [--silent]
     [-l N] [--line-length=N] [-u] [--unbuffered]
     [-r] [--regexp-extended]
+    [-E] [-r] [--regexp-extended]
     [-e script] [--expression=script]
     [-f script-file] [--file=script-file]
 …
 .RI # comment
 The comment extends until the next newline (or the end of a
 .B -e
+.B \-e
 script fragment).
 .TP
 …
 which has each embedded newline preceded by a backslash.
 .TP
+q
+q [\fIexit-code\fR]
 Immediately quit the \*(sd script without processing
 any more input,
+except that if auto-print is not disabled
 the current pattern space will be printed.
 .TP
+Q
+any more input, except that if auto-print is not disabled
+the current pattern space will be printed.  The exit code
+argument is a GNU extension.
+.TP
+Q [\fIexit-code\fR]
 Immediately quit the \*(sd script without processing
 any more input.
+any more input.  This is a GNU extension.
 .TP
 .RI r\  filename
 …
 Append a line read from
 .IR filename .
+Each invocation of the command reads a line from the file.
+This is a GNU extension.
 .SS
 Commands which accept address ranges
 …
 is omitted, branch to end of script.
 .TP
-.RI t\  label
-If a s/// has done a successful substitution since the
-last input line was read and since the last t or T
-command, then branch to
-.IR label ;
-if
-.I label
-is omitted, branch to end of script.
-.TP
-.RI T\  label
-If no s/// has done a successful substitution since the
-last input line was read and since the last t or T
-command, then branch to
-.IR label ;
-if
-.I label
-is omitted, branch to end of script.
-.TP
 c \e
 .TP
 …
 .TP
+D
+Delete up to the first embedded newline in the pattern space.
+Start next cycle, but skip reading from the input
+if there is still data in the pattern space.
+If pattern space contains no newline, start a normal new cycle as if
+the d command was issued.  Otherwise, delete text in the pattern
+space up to the first newline, and restart cycle with the resultant
+pattern space, without reading a new line of input.
 .TP
 h H
 …
 Copy/append hold space to pattern space.
 .TP
+x
-Exchange the contents of the hold and pattern spaces.
-.TP
+l
 List out the current line in a ``visually unambiguous'' form.
+.TP
+.RI l\  width
+List out the current line in a ``visually unambiguous'' form,
+breaking it at
+.I width
+characters.  This is a GNU extension.
 .TP
 n N
 …
 .IR regexp .
 .TP
+.RI t\  label
+If a s/// has done a successful substitution since the
+last input line was read and since the last t or T
+command, then branch to
+.IR label ;
+if
+.I label
+is omitted, branch to end of script.
+.TP
+.RI T\  label
+If no s/// has done a successful substitution since the
+last input line was read and since the last t or T
+command, then branch to
+.IR label ;
+if
+.I label
+is omitted, branch to end of script.  This is a GNU
+extension.
+.TP
 .RI w\  filename
 Write the current pattern space to
 …
 Write the first line of the current pattern space to
 .IR filename .
+This is a GNU extension.
+.TP
+x
+Exchange the contents of the hold and pattern spaces.
 .TP
 .RI y/ source / dest /
 …
 .I number
 Match only the specified line
+.IR number .
+.IR number
+(which increments cumulatively across files, unless the
+.B \-s
+option is specified on the command line).
 .TP
 .IR first ~ step
 …
 line starting with line
 .IR first .
 For example, ``sed -n 1~2p'' will print all the odd-numbered lines in
+For example, ``sed \-n 1~2p'' will print all the odd-numbered lines in
 the input stream, and the address 2~5 will match every fifth line,
+starting with the second. (This is an extension.)
+starting with the second.
+.I first
+can be zero; in this case, \*(sd operates as if it were equal to
+.IR step .
+(This is an extension.)
 .TP
+$
 …
 Match lines matching the regular expression
 .IR regexp .
+Matching is performed on the current pattern space, which
+can be modified with commands such as ``s///''.
 .TP
 .BI \fR\e\fPc regexp c
 …
 .RI 1, addr2
 form will still be at the beginning of its range.
+This works only when
+.I addr2
+is a regular expression.
 .TP
 .IR addr1 ,+ N
 …
 .BR \et ,
 and other sequences.
+The \fI-E\fP option switches to using extended regular expressions instead;
+it has been supported for years by GNU sed, and is now
+included in POSIX.
 [SEE ALSO]
 …
 .PP
 E-mail bug reports to
+.BR bonzini@gnu.org .
+Be sure to include the word ``sed'' somewhere in the ``Subject:'' field.
+Also, please include the output of ``sed --version'' in the body
+.BR bug-sed@gnu.org .
+Also, please include the output of ``sed \-\-version'' in the body
 of your report if at all possible.

trunk/src/sed/doc/stamp-vti

-              r599
+              r3613
 @set UPDATED 30 January 2006
 @set UPDATED-MONTH January 2006
 @set EDITION 4.1.5
 @set VERSION 4.1.5
+@set UPDATED 1 January 2022
+@set UPDATED-MONTH January 2022
+@set EDITION 4.9
+@set VERSION 4.9

trunk/src/sed/doc/version.texi

-              r599
+              r3613
 @set UPDATED 30 January 2006
 @set UPDATED-MONTH January 2006
 @set EDITION 4.1.5
 @set VERSION 4.1.5
+@set UPDATED 1 January 2022
+@set UPDATED-MONTH January 2022
+@set EDITION 4.9
+@set VERSION 4.9
