Coreutils - rejected feature requests
Some of the hardest work on coreutils is knowing what to reject and providing appropriate justification to the contributors.
The contributions below, while all good ideas, were not included for various reasons detailed in the linked mailing list discussions.
cat
- cat --timestamp. awk or perl is good enough for this
- cat -n alternate formats. Manipulation with existing tools supports this better
- cat --show-ends to highlight trailing whitespace. grep --color was deemed better/sufficient
- cat --header to output filenames for each file. tail -n+1 does this already
- cat -S to squeeze lines just containing blank chars. Existing tools like `sed 's/^ *$//' | cat -s` were thought sufficient (see the example after this list)
- cat -d,--direct to use direct I/O. dd has the 'nocache' or 'direct' options and there are general nocache wrappers
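A rough sketch of the workarounds cited above; the file names are placeholders:

```
# per-file headers: tail prints "==> name <==" banners when given several files
tail -n +1 file1.txt file2.txt

# squeeze lines containing only spaces: blank them with sed, then let cat -s
# collapse the resulting runs of empty lines
sed 's/^ *$//' file1.txt | cat -s
```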
chmod
- chmod maintains ctime when permissions unchanged. The proposed patch was deemed inefficient
- chmod -d to set perms on just directories. The 'X' mode, or `find` with `chmod`, was deemed sufficient (see the sketch after this list)
- chmod +S to set setgid on just directories. `find` in combination with `chmod` was deemed sufficient
- chmod -D to set perms on just directories. The 'X' mode, or `find` with `chmod` was deemed sufficient
- chmod --parents. Doing this with a simple script or with find was deemed sufficient
- chmod --umask. The existing chmod options were deemed sufficient
- chmod b10111. Binary conversion can be done easily in bash or ksh
- disallow chmod to create world writable files. This couldn't be general, and even so could be easily bypassed
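A minimal sketch of the directory-only alternatives mentioned above ('X' mode, and find combined with chmod); the directory name is a placeholder:

```
# 'X' sets the execute bit only on directories and on files already executable
chmod -R u=rwX,go=rX project/

# or restrict the change to directories explicitly
find project/ -type d -exec chmod 755 {} +
```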
cp
- cp,mv --to. Though better than --target, it wasn't warranted to have two ways to do it
- unicode symbols for cp,mv --verbose. Rejected for same reasons as UTF-8 arrows in ls output
- distinguished symbols for cp,mv --verbose. It was deemed there was enough info from the context
- cp,mv --progress. Existing tools cater for this already (see the sketch after this list)
- cp --reflink-range=src_offset,src_length,dst_offset. This was not deemed warranted
- cp --quiet. Suppressing ENOENT errors can be done by filtering existing files first
- cp --resume. Use rsync
- cp --parallel. While this helps copy speed in some situations, the fix is probably best handled at a lower level
- cp --preserve=all should copy ext2 extended attributes. Would need file system agnostic interface (like copyfile())
- cp,mv --bwlimit to throttle data transfer rates. This is better suited to higher level tools, and is available in rsync
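For the --progress and --bwlimit requests, rsync (not a coreutils tool) covers both; a sketch, assuming a reasonably recent rsync:

```
# overall progress for the whole transfer (rsync >= 3.1)
rsync -a --info=progress2 src/ dst/

# throttle the transfer rate (size suffixes need a newer rsync)
rsync -a --bwlimit=1M src/ dst/
```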
cut
- cut -d '[:blank:]'. The existing tr -s '[:blank:]' ' ' | cut -d ' ' was deemed sufficient (see the example after this list)
- cut -d 'string'. sed 's/string/\x00/g' | cut -d '' was deemed sufficient
- cut --output-delimiter short option. One can already do cut --ou
- cut --csv. A separate util was deemed best for this complicated task
- cut -C. There was no need for this alias for --complement (--co)
- cut -f2,1 to reorder fields. Using awk or join is deemed sufficient
- cut --separator to specify the “line” delimiter. Pre/Post-processing with tr was deemed sufficient
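A worked example of the tr | cut pipeline referenced above, collapsing runs of blanks so cut's single-character delimiter suffices:

```
# squeeze tabs and spaces to one space, then cut on the space
printf 'alpha \t beta\tgamma\n' | tr -s '[:blank:]' ' ' | cut -d ' ' -f 2
# -> beta
```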
date
- date +%f to flush output. `stdbuf -oL date ...` was deemed sufficient
- date should parse 'DAY MONTH, YEAR' format. The format was deemed erroneous
- date +%J to support astronomical julian date. This was not thought common enough to support
- date -v to provide BSD syntax relative date adjustments. The existing GNU relative date syntax was thought sufficient (see the examples after this list)
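A couple of examples of the existing GNU relative date syntax:

```
date -d 'now + 3 days' '+%F'      # the date 3 days from now
date -d 'last friday' '+%A %F'    # the previous Friday
```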
dd
- I/O throughput limitation for dd. pv and rsync et al. were deemed better for this
- dd --limit-speed. It was thought best to leave this to tools like pv or trickle (see the sketch after this list)
- dd conv=noerror should apply to writes as well as reads. shred is best used for this use case
- dd iflag=seekable oflag=seekable to verify lseek(2) support. The feature was not deemed useful/complete enough
- dd conv=offload to offload copying to various backends. This was thought too specialized to support explicitly
- dd conv=truncpost. To support filtering files in place. Due to error handling this was not thought useful enough
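A sketch of rate-limiting a dd copy with pv, as suggested above; the device and file names are placeholders:

```
# cap throughput at roughly 1 MB/s by inserting pv in the pipeline
dd if=/dev/sdX bs=64K status=none | pv -L 1M > disk.img
```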
df
- df,du -g. Specifying “Gigabyte” output format is neither standard nor required
- df autoscale. df -h was thought good enough
- df -g. Separate options for various output units is best avoided
- df --without-header. --header options are only really useful for data consumers
- df --dereference to process symlink targets. df was changed to reference symlink targets unconditionally
du
- du --format to allow sorting. sort -h handles this
- accurate du results for OCFS2 reflinked files. It was thought too complicated and specific to add to du
- du --exclude-dirs to exclude directories themselves from the usage count. find .. | du was thought sufficient
- du --sort to sort by disk usage. This is already supported directly by du -h | sort -h
join
- Auto detect output format for join. We need to consider further whether this is useful
- join more than two files. It would add complexity while not being scalable (chaining join works, as shown after this list)
- join more than one field. There wasn't much interest in this
- comm,join --parallel to use multiple cores. It was thought best to split the data for multiple processes
- join -t '\t' to use a TAB delimiter. Using the shell to specify the TAB char like join -t $'\t', was thought ubiquitous enough
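Joining more than two files can be done by chaining invocations, assuming the files are sorted on the join field:

```
# join three sorted files on their first field
join file1 file2 | join - file3
```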
ls
- Change/revert the default quoting style - please see the ls quotes page for details.
- UTF-8 arrows in ls. See this l script as an alternative
- df/ls --blocksize={decimal,binary}. Though more correct, it was deemed overkill
- ls --sort=class. Sorting by type indicator was deemed of marginal benefit
- ls --octal to output octal permissions. Using stat or find is deemed sufficient (see the sketch after this list)
- ls --group-numbers=locale to output thousands separators. BLOCK_SIZE or numfmt were deemed sufficient
- Suppress trailing slash with ls -F /. The result was deemed too inconsistent
- ls --just=$filetype to limit file type listed. Filtering classify tags like `ls --color -lF | sed -n 's#/$##p'` was thought sufficient
- ls -z,--zero to NUL terminate entries. Existing tools like find(1) were thought sufficient
- ls --sort=inode. It was thought find ... | sort was more appropriate for this low level functionality
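For octal permissions, the stat and find alternatives mentioned above look roughly like this (GNU versions assumed):

```
# octal mode, owner:group and name for each entry in the current directory
stat -c '%a %U:%G %n' *

# find prints the same information and can recurse
find . -maxdepth 1 -printf '%m %u:%g %p\n'
```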
mv
- mv -p (create target dir). It was thought more functional to just `mkdir -p` first (see the sketch after this list)
- mv --symbolic-link. It was thought that mv and ln --relative separately give more control
- mv --swap. To swap two files, a shell script (perhaps provided by coreutils) would be best
- mv --safe to only remove source on completion. `cp ... && rm` was deemed sufficient
- mv --parents to recreate a hierarchy. Using cp -l --parents is deemed sufficient
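A sketch of the mkdir -p approach mentioned above; the path names are placeholders:

```
# create the destination directory first, then move into it
dest=backups/2024/logs
mkdir -p "$dest" && mv access.log "$dest/"
```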
rm
- rm --parents. Deleting the opposite way up the tree was deemed too dangerous
- rm -d. rmdir is equivalent and less confusing
- rm --no-preserve-root. Adding protective prompts would not significantly improve security
- rm -rf . to delete current directory. It was thought existing support for rm -rf "$PWD" suffices
- rm -rf . to delete all files inside current directory. `rm -rf * .[!.] .??*` and `find . -delete` were deemed sufficient (see the sketch after this list)
- rm -s to behave in a “smarter” fashion. `rm -I` or `find | xargs rm` were deemed sufficient
- rm --exclude to exclude file names. Existing tools like find(1) were thought sufficient
- rm should use remove(), leaving unlink() to the unlink command. We can't change such standardized functionality
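One spelling of the find-based alternative for emptying the current directory, assuming GNU find:

```
# delete everything below . (including dot files) but keep . itself
find . -mindepth 1 -delete
```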
shred
- shred --recursive. Deemed better to explicitly select (with find for example)
- shred -r. shred is of limited use with files anyway
sort
- sort --by-length. A “sort by line length” example was added to the info docs
- min,max commands. (the sort --range={} alternative seems useful though)
- sort -V auto ignores white-space. One can do that more generally with -b
- sort -I to sort IP addresses. It's debatable whether this is warranted
- SORT_BUFFER_SIZE=1234 sort. env vars can be useful when shared by many commands, but are best avoided
- sort to use /var/tmp by default. It was thought best to keep using /tmp as its tmp files are stateless
- sort fixed width fields. This is already supported with: sort -t$'\n' -k1.5,1.9 ...
- sort --header to exclude leading lines from the sort. sed, head, etc. were deemed sufficient (see the sketch after this list)
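A sketch of excluding a leading header line with head and tail; the file name is a placeholder:

```
# keep the first line in place and sort the rest
{ head -n 1 data.txt; tail -n +2 data.txt | sort; } > sorted.txt
```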
stat
- stat --list-fstypes. The internally supported file system IDs were thought best not exposed
- stat --files0-from=FILE. This is only needed for commands needing to process all arguments in a single invocation
- stat --digest-type=WORD. It was thought better to use the existing checksum utils and join the file names etc. separately
- stat --quoting-style=WORD. Adjustments to --format='%N' were thought more appropriate
*sum
- md5sum --threads. The UNIX toolkit already handles processing files in parallel (see the sketch after this list)
- md5sum --base32. There was little interest in this Internet Archive specific functionality
- configurable md5sum buffer size. It was thought better to use NFS parameters to minimize network latency, or the stdbuf utility to control the buffering more generally
- sha1sum --raw | base64. `openssl dgst -sha1 -binary $file | openssl enc -base64` was deemed available enough
- *sum --ignore-dirs. The use cases were seen as too limited
- md5sum --pipe to output checksum to file and data to stdout. tee suffices for this
- *sum --color to colorize the checksum to ease comparisons. This was thought more flexible to perform with separate tools
- *sum --no-filename to only output the checksum. Postprocessing the output was deemed sufficient
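For the --threads requests, a sketch of how the existing toolkit parallelises the work, assuming GNU xargs and nproc:

```
# hash files in parallel, a batch of 16 files per md5sum process
find . -type f -print0 | xargs -0 -n 16 -P "$(nproc)" md5sum
```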
touch
- touch -R. `find . -exec touch -am {} +` is more general
- touch --mode. Not deemed beneficial enough
- touch --verbose. This could not be implemented robustly. Also xargs --verbose or (set -x; touch *) are sufficient
- touch --create to only create files. `test -e file || touch file` was deemed sufficient
uniq
- uniq --unsorted. This would add a lot of complexity that's already contained within sort (a common awk idiom, shown after this list, also covers this)
- uniq --ignore-last-fields. `rev | uniq -f | rev` was deemed sufficient
- uniq --accumulate. Adding values was thought too specific for coreutils and is available elsewhere
- uniq --check-fields=N to only check N fields. uniq --key would be a more general solution
- uniq -c --total. piping to awk '{t+=$1}END{print t,"total"}1' was deemed sufficient
- uniq --regex to use a regular expression to match lines. Existing tools used in a DSU (decorate-sort-undecorate) pattern are more general
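For unsorted input, a common awk idiom (not a coreutils feature) keeps the first occurrence of each line in input order:

```
# print each distinct line the first time it is seen
awk '!seen[$0]++' file.txt
```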
wc
- wc --tab-width. Preprocessing with expand is more functional
- wc -q to suppress the file name. Redirecting file to stdin is sufficient
- wc --max-chars=N to filter out long lines. Existing filters like awk 'length($0) <= 3' were deemed more appropriate
misc
- tr -0. Can do the same thing with marginally more tr syntax
- --at options for commands. This functionality was not deemed required for shell
- command --examples. This would need to be accepted into the GNU Coding Standards first
- sleep --random. The existing tools to achieve this were deemed sufficient
- remove chown,cpio user: shortcut. This was deemed useful and so retained
- mktemp -tp. It was thought better to create a fifo in a temp dir, rather than a temp fifo directly
- mktemp --fifo. This was not deemed warranted
- hostname -b. Setting a default hostname is too platform dependent
- truncate -s +50%. Percentage calculation was thought best handled outside of truncate (see the sketch at the end of this list)
- fold --indent. `fmt -t | sed 's/^ / /'` was deemed sufficient
- fold --prefix to add a prefix to each line. fmt and/or sed are deemed sufficient
- pr --fold to wrap lines. fold or fmt can do this before processing by pr
- users --all to show even non logged in users. The system interfaces aren't general enough to support this
- users -h to show help. --h{,elp} was deemed sufficient
- groups -0 to support group names with spaces etc. Instead the more standard id -Gnz is provided
- BLOCK_SIZE={binary,decimal} to add 'iB' and 'B' suffixes to various “human” numbers. numfmt can fill this role
- test -ed. Using stat in a shell function was deemed sufficient
- tac -z. tac -s $'\0' is equivalent
- mkdir --reference to copy permissions. Using umasks and ACLs was deemed sufficient
- mkdir -m ... --parents-mode to use the mode for all created dirs. This would not provide any functional benefit over using chmod
- chroot --before to determine UIDs outside the chroot. This is a bit specialized for inclusion
- uname --distro. lsb_release --id was deemed sufficient
- uname -i and -p should infer hardware info. It was thought better to just use the provided syscall info
- split --balanced to balance lines across the last two buckets. split -nl/$num supports this better
- rmdir -r. rm -r was deemed sufficient
- rmdir --one-file-system. `rm` or `find` are more suitable for such edge-cases
- realpath -t -b. Short options for existing long options were not seen as appropriate
- readlink -f output/trailing/slash/. It's easy to add the '/' in shell if needed
- expand --auto-tabs. It wasn't thought to give much benefit over just specifying --tabs
- seq --format support for general printf formats. Prefixing etc. is best done outside of seq
- head --read-all-input. It was thought adding more control to tee was a more general solution
- echo -- -e to terminate options in the common way. This would violate POSIX. Use printf instead
- expand --auto to auto determine tab stops. The operation would not be general enough
- csplit '@1' to split when field 1 changes. uniq --group | csplit --suppress-matched was thought to be better
- csplit --output=N to output only the Nth file. This was thought too specialized to support
- ln --absolute to force absolute symlinks. It's easy to get absolute paths with realpath or $PWD
- timeout setting a TIMEOUT env var. The use case is unusual and supported with explicitly setting vars with env etc.
- yes -n to not output a '\n'. yes whatever | tr -d '\n' was thought sufficient
- New environment variables are discouraged. Wrapper shell scripts and shell aliases are preferred. See $LS_ARGS, $SORT_BUFFER_SIZE, $HUMAN_B, rm; similarly, $GREP_OPTIONS and $GZIP
- Global configuration file /etc/gnu.conf. Aliases, shell functions and shell script wrappers are the recommended ways to modify default behaviour of coreutils programs
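For the truncate -s +50% request above, the percentage arithmetic is easy to do in the shell; a sketch assuming GNU stat, with a placeholder file name:

```
# grow a file by 50% of its current size
size=$(stat -c %s file.img)
truncate -s $(( size + size / 2 )) file.img
```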
New commands
- a path manipulation command. Existing tools deemed rich enough
- Add a sparse command. cp already supports creating sparse files
- Add a getlimits command. This would not be standard enough outside of a particular project
- quoted-printable. recode or perl can easily decode this format
- An errno utility. A full C wrapper around strerror() was deemed overkill. Maybe we'll add a script to contrib/
- where am i. `hostname; pwd` is fine
- tableize. This new command was thought better done as a --border option to column -t
- physmem. Programs to print mem info like `hwloc-info` and `free` are already available
- Provide '0' and '1' utils. These were not seen to benefit shell syntax
- cksum -a algo1 algo2 (like NetBSD) to do many checksums per read. It was thought more general to use separate processes
- rename command (from util-linux). There are existing commands to do this, and adjusting for inclusion in coreutils was thought to create too much flux
- testline command to expose bloom filter functionality. It was thought options to existing tools were more appropriate