BUGS

Last change on this file was 3083, checked in by bird, 18 years ago
sed 4.1.5
File size: 5.3 KB

* ABOUT BUGS

Before reporting a bug, please check the list of known bugs
and the list of oft-reported non-bugs (below).

Bugs and comments may be sent to bonzini@gnu.org; please
include in the Subject: header the first line of the output of
``sed --version''.

Please do not send a bug report like this:

  [while building frobme-1.3.4]
  $ configure
  sed: file sedscr line 1: Unknown option to 's'

If sed doesn't configure your favorite package, take a few extra
minutes to identify the specific problem and make a stand-alone test
case.

A stand-alone test case includes all the data necessary to perform the
test, and the specific invocation of sed that causes the problem. The
smaller a stand-alone test case is, the better. A test case should
not involve something as far removed from sed as ``try to configure
frobme-1.3.4''. Yes, that is in principle enough information to look
for the bug, but that is not a very practical prospect.
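
For illustration, a stand-alone report might look like the sketch
below. The input line and the substitution are invented for this
example and do not correspond to a real bug; the point is only that
the data, the exact invocation, and the observed output fit in a few
lines.

```shell
# Hypothetical minimal test case: input and command are made up.
input='hello world'
actual=$(printf '%s\n' "$input" | sed 's/o/0/')

# Everything a maintainer needs: version, input, command, result.
echo "version:  $(sed --version 2>/dev/null | head -n 1)"
echo "input:    $input"
echo "command:  sed 's/o/0/'"
echo "actual:   $actual"
```

A report in this shape can be pasted into a mail and rerun as is,
without the package that first exposed the problem.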



* NON-BUGS

`N' command on the last line

Most versions of sed exit without printing anything when the `N'
command is issued on the last line of a file. GNU sed instead
prints the pattern space before exiting, unless of course the `-n'
command-line switch has been specified. More information on the
reason behind this choice can be found in the Info manual.
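
A small sketch of the difference, assuming GNU sed is the `sed' on
the PATH and POSIXLY_CORRECT is not set. The three-line input is
arbitrary. The second command guards `N' with `$!', a common way to
get the same output from seds that discard the last line.

```shell
# GNU sed prints the pending pattern space when `N' finds no next line.
gnu_out=$(printf 'a\nb\nc\n' | sed N)

# Portable form: skip `N' on the last line, so the line is printed
# by the normal end-of-script output instead.
portable_out=$(printf 'a\nb\nc\n' | sed '$!N')

printf '%s\n' "$gnu_out"
printf '%s\n' "$portable_out"
```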


regex syntax clashes (problems with backslashes)

sed uses the POSIX basic regular expression syntax. According to
the standard, the meaning of some escape sequences is undefined in
this syntax; notable in the case of GNU sed are `\|', `\+', `\?',
`\`', `\'', `\<', `\>', `\b', `\B', `\w', and `\W'.

As in all GNU programs that use POSIX basic regular expressions, sed
interprets these escape sequences as meta-characters. So, `x\+'
matches one or more occurrences of `x'. `abc\|def' matches either
`abc' or `def'.
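
The following sketch shows both meta-characters at work, assuming
GNU sed. The sample strings are arbitrary. The last command uses
`-E' (extended regular expressions), where `+' needs no backslash;
that option is an addition of this example, not something the text
above relies on.

```shell
# `\+' means "one or more" and `\|' means "or" in GNU basic regexes.
plus_out=$(printf 'xxxy\n' | sed 's/x\+/Z/')
alt_out=$(printf 'abc def ghi\n' | sed 's/abc\|def/X/g')

# Same substitution as the first one, written as an extended regex.
ere_out=$(printf 'xxxy\n' | sed -E 's/x+/Z/')

printf '%s\n' "$plus_out" "$alt_out" "$ere_out"
```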

This syntax may cause problems when running scripts written for other
seds. Some sed programs have been written with the assumption that
`\|' and `\+' match the literal characters `|' and `+'. Such scripts
must be modified by removing the spurious backslashes if they are to
be used with recent versions of sed (not only GNU sed).

On the other hand, some scripts use `s|abc\|def||g' to remove occurrences
of _either_ `abc' or `def'. While this worked until sed 4.0.x, newer
versions interpret this as removing the string `abc|def'. This is
again undefined behavior according to POSIX, but this interpretation
is arguably more robust: the older one, for example, required that
the regex matcher parse `\/' as `/' in the common case of escaping
a slash, which is again undefined behavior. The new behavior avoids
this, which is good because the regex matcher is only partially
under our control.

In addition, GNU sed supports several escape sequences (some of
which take further characters as arguments) to insert non-printable
characters in scripts (`\a', `\c', `\d', `\o', `\r', `\t', `\v',
`\x'). These can cause similar problems with scripts written for
other seds.
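
As a sketch of one such escape, assuming GNU sed: `\t' in the
replacement inserts a tab. The second command is a possible
workaround for other seds, which lets printf(1) produce the tab
instead; the variable name `tab' is an arbitrary choice.

```shell
# GNU sed expands `\t' to a tab character.
tab_out=$(printf 'a b\n' | sed 's/ /\t/')

# Workaround sketch: generate the tab outside sed and splice it in.
tab=$(printf '\t')
portable_out=$(printf 'a b\n' | sed "s/ /$tab/")

printf '%s\n' "$tab_out" "$portable_out"
```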


-i clobbers read-only files

In short, `sed d -i' will let one delete the contents of
a read-only file, and in general the `-i' option will let
one clobber protected files. This is not a bug, but rather a
consequence of how the Unix filesystem works.

The permissions on a file say what can happen to the data
in that file, while the permissions on a directory say what can
happen to the list of files in that directory. `sed -i'
will never open for writing a file that is already on disk;
rather, it works on a temporary file that is finally renamed
to the original name. If you rename or delete files, you are actually
modifying the contents of the directory, so the operation depends on
the permissions of the directory, not of the file. For this same
reason, sed will not let one use `-i' on a writeable file in a
read-only directory (but unbelievably nobody reports that as a
bug...).
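
The effect can be reproduced with the sketch below, assuming GNU sed
(BSD seds expect an explicit, possibly empty, suffix after `-i').
The scratch directory and the file name are arbitrary choices made
for this example.

```shell
# Scratch directory and file name are hypothetical.
dir=$(mktemp -d)
file="$dir/protected.txt"
printf 'keep me\n' > "$file"
chmod 444 "$file"          # the file itself is read-only

# The directory is still writable, so the rename done by -i succeeds
# and the read-only file ends up empty.
sed -i d "$file"
size=$(( $(wc -c < "$file") ))
echo "bytes left: $size"

rm -rf "$dir"
```

Making the directory read-only instead (for a non-root user) should
make the same command fail, even if the file is writeable.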


`0a' does not work (gives an error)

There is no line 0. 0 is a special address that is only used to treat
addresses like `0,/RE/' as active when the script starts. If you
write `1,/abc/d' and the first line includes the word `abc', then
that match is ignored, because address ranges must span at least
two lines (barring the end of the file). What you probably wanted is
to delete every line up to the first one including `abc', and this
is obtained with `0,/abc/d'.
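
A short sketch of the two forms, assuming GNU sed (the `0,/RE/'
address is a GNU extension). The four-line input is made up so that
`abc' appears on the first line and again on the third.

```shell
input=$(printf 'abc\nx\nabc\ny')

# The range opens on line 1, so the end regex is first tried on
# line 2; lines 1 through 3 are deleted.
from_one=$(printf '%s\n' "$input" | sed '1,/abc/d')

# The range is already active at the start, so the end regex is
# tried on line 1; only that line is deleted.
from_zero=$(printf '%s\n' "$input" | sed '0,/abc/d')

printf '1,/abc/d gives:\n%s\n' "$from_one"
printf '0,/abc/d gives:\n%s\n' "$from_zero"
```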


`[a-z]' is case insensitive

You are encountering problems with locales. POSIX mandates that `[a-z]'
uses the current locale's collation order -- in C parlance, that means
strcoll(3) instead of strcmp(3). Some locales have a case-insensitive
strcoll, others don't: one of those that have problems is Estonian.

Another problem is that `[a-z]' tries to use collation symbols. This
only happens if you are on the GNU system, using GNU libc's regular
expression matcher instead of compiling the one supplied with GNU sed.
In a Danish locale, for example, the regular expression `^[a-z]$'
matches the string `aa', because `aa' is a single collating symbol that
comes after `a' and before `b'; `ll' behaves similarly in Spanish
locales, or `ij' in Dutch locales.

To work around these problems, which may cause bugs in shell scripts,
set the LC_ALL environment variable to `C', or set the locale on a
more fine-grained basis with the other LC_* environment variables.
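
The workaround can be applied to a single command, as in this
sketch. The one-letter inputs are arbitrary; with LC_ALL set to `C',
the bracket expression covers only the ASCII lowercase letters, so
an uppercase letter and a two-letter string are both rejected.

```shell
# LC_ALL=C is set for the sed process only, not for the whole script.
upper=$(printf 'B\n'  | LC_ALL=C sed -n '/^[a-z]$/p')
lower=$(printf 'b\n'  | LC_ALL=C sed -n '/^[a-z]$/p')
double=$(printf 'aa\n' | LC_ALL=C sed -n '/^[a-z]$/p')

echo "B matched:  ${upper:-no}"
echo "b matched:  ${lower:-no}"
echo "aa matched: ${double:-no}"
```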
