Coreutils - rejected feature requests
Some of the hardest work on coreutils is knowing what to reject and providing appropriate justification to the contributors.
The contributions below, while all good ideas, were not included for various reasons detailed in the linked mailing list discussions.
cat
- cat --timestamp. awk or perl is good enough for this
- cat -n alternate formats. Manipulation with existing tools supports this better
- cat --show-ends to highlight trailing whitespace. grep --color was deemed better/sufficient
- cat --header to output filenames for each file. tail -n+1 does this already
- cat -S to squeeze lines just containing blank chars. Existing tools like `sed 's/^ *$//' | cat -s` were thought sufficient (see the example after this list)
- cat -d,--direct to use direct I/O. dd has the 'nocache' or 'direct' options and there are general nocache wrappers
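A rough sketch of the workarounds cited above; the file names are placeholders:

```
# per-file headers: tail prints "==> name <==" banners when given several files
tail -n +1 file1.txt file2.txt

# squeeze lines containing only spaces: blank them with sed, then let cat -s
# collapse the resulting runs of empty lines
sed 's/^ *$//' file1.txt | cat -s
```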
chmod
- chmod maintains ctime when permissions unchanged. The proposed patch was deemed inefficient
- chmod -d to set perms on just directories. The 'X' mode, or `find` with `chmod`, was deemed sufficient (see the sketch after this list)
- chmod +S to set setgid on just directories. `find` in combination with `chmod` was deemed sufficient
- chmod -D to set perms on just directories. The 'X' mode, or `find` with `chmod` was deemed sufficient
- chmod --parents. Doing this with a simple script or with find was deemed sufficient
- chmod --umask. The existing chmod options were deemed sufficient
- chmod b10111. Binary conversion can be done easily in bash or ksh
- disallow chmod to create world writable files. This couldn't be general, and even so could be easily bypassed
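A minimal sketch of the directory-only alternatives mentioned above ('X' mode, and find combined with chmod); the directory name is a placeholder:

```
# 'X' sets the execute bit only on directories and on files already executable
chmod -R u=rwX,go=rX project/

# or restrict the change to directories explicitly
find project/ -type d -exec chmod 755 {} +
```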
cp
- cp,mv --to. Though better than --target, it wasn't warranted to have two ways to do it
- unicode symbols for cp,mv --verbose. Rejected for same reasons as UTF-8 arrows in ls output
- distinguished symbols for cp,mv --verbose. It was deemed there was enough info from the context
- cp,mv --progress. Existing tools cater for this already (see the sketch after this list)
- cp --reflink-range=src_offset,src_length,dst_offset. This was not deemed warranted
- cp --quiet. Suppressing ENOENT errors can be done by filtering existing files first
- cp --resume. Use rsync
- cp --parallel. While this helps copy speed in some situations, the fix is probably best handled at a lower level
- cp --preserve=all should copy ext2 extended attributes. Would need file system agnostic interface (like copyfile())
- cp,mv --bwlimit to throttle data transfer rates. This is better suited to higher level tools, and is available in rsync
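For the --progress and --bwlimit requests, rsync (not a coreutils tool) covers both; a sketch, assuming a reasonably recent rsync:

```
# overall progress for the whole transfer (rsync >= 3.1)
rsync -a --info=progress2 src/ dst/

# throttle the transfer rate (size suffixes need a newer rsync)
rsync -a --bwlimit=1M src/ dst/
```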
cut
- cut -d '[:blank:]'. The existing tr -s '[:blank:]' ' ' | cut -d ' ' was deemed sufficient (see the example after this list)
- cut -d 'string'. sed 's/string/\x00/g' | cut -d '' was deemed sufficient
- cut --output-delimiter short option. One can already do cut --ou
- cut --csv. A separate util was deemed best for this complicated task
- cut -C. There was no need for this alias for --complement (--co)
- cut -f2,1 to reorder fields. Using awk or join is deemed sufficient
- cut --separator to specify the “line” delimiter. Pre/Post-processing with tr was deemed sufficient
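A worked example of the tr | cut pipeline referenced above, collapsing runs of blanks so cut's single-character delimiter suffices:

```
# squeeze tabs and spaces to one space, then cut on the space
printf 'alpha \t beta\tgamma\n' | tr -s '[:blank:]' ' ' | cut -d ' ' -f 2
# -> beta
```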
date
- date +%f to flush output. `stdbuf -oL date ...` was deemed sufficient
- date should parse 'DAY MONTH, YEAR' format. The format was deemed erroneous
- date +%J to support astronomical julian date. This was not thought common enough to support
- date -v to provide BSD syntax relative date adjustments. The existing GNU relative date syntax was thought sufficient (see the examples after this list)
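A couple of examples of the existing GNU relative date syntax:

```
date -d 'now + 3 days' '+%F'      # the date 3 days from now
date -d 'last friday' '+%A %F'    # the previous Friday
```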
dd
- I/O throughput limitation for dd. pv and rsync et al. were deemed better for this
- dd --limit-speed. It was thought best to leave this to tools like pv or trickle (see the sketch after this list)
- dd conv=noerror should apply to writes as well as reads. shred is best used for this use case
- dd iflag=seekable oflag=seekable to verify lseek(2) support. The feature was not deemed useful/complete enough
- dd conv=offload to offload copying to various backends. This was thought too specialized to support explicitly
- dd conv=truncpost. To support filtering files in place. Due to error handling this was not thought useful enough
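A sketch of rate-limiting a dd copy with pv, as suggested above; the device and file names are placeholders:

```
# cap throughput at roughly 1 MB/s by inserting pv in the pipeline
dd if=/dev/sdX bs=64K status=none | pv -L 1M > disk.img
```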
df
- df,du -g. Specifying “Gigabyte” output format is neither standard nor required
- df autoscale. df -h was thought good enough
- df -g. Separate options for various output units is best avoided
- df --without-header. --header options are only really useful for data consumers
- df --dereference to process symlink targets. df was changed to reference symlink targets unconditionally
du
- du --format to allow sorting. sort -h handles this
- accurate du results for OCFS2 reflinked files. It was thought too complicated and specific to add to du
- du --exclude-dirs to exclude directories themselves from the usage count. find .. | du was thought sufficient
- du --sort to sort by disk usage. This is already supported directly by du -h | sort -h
join
- Auto detect output format for join. We need to consider further whether this is useful
- join more than two files. It would add complexity while not being scalable (chaining join works, as shown after this list)
- join more than one field. There wasn't much interest in this
- comm,join --parallel to use multiple cores. It was thought best to split the data for multiple processes
- join -t '\t' to use a TAB delimiter. Using the shell to specify the TAB char like join -t $'\t', was thought ubiquitous enough
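Joining more than two files can be done by chaining invocations, assuming the files are sorted on the join field:

```
# join three sorted files on their first field
join file1 file2 | join - file3
```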
ls
- Change/revert the default quoting style - please see the ls quotes page for details.
- UTF-8 arrows in ls. See this l script as an alternative
- df/ls --blocksize={decimal,binary}. Though more correct, it was deemed overkill
- ls --sort=class. Sorting by type indicator was deemed of marginal benefit
- ls --octal to output octal permissions. Using stat or find is deemed sufficient (see the sketch after this list)
- ls --group-numbers=locale to output thousands separators. BLOCK_SIZE or numfmt were deemed sufficient
- Suppress trailing slash with ls -F /. The result was deemed too inconsistent
- ls --just=$filetype to limit file type listed. Filtering classify tags like `ls --color -lF | sed -n 's#/$##p'` was thought sufficient
- ls -z,--zero to NUL terminate entries. Existing tools like find(1) were thought sufficient
- ls --sort=inode. It was thought find ... | sort was more appropriate for this low level functionality
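For octal permissions, the stat and find alternatives mentioned above look roughly like this (GNU versions assumed):

```
# octal mode, owner:group and name for each entry in the current directory
stat -c '%a %U:%G %n' *

# find prints the same information and can recurse
find . -maxdepth 1 -printf '%m %u:%g %p\n'
```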
mv
- mv -p (create target dir). It was thought more functional to just `mkdir -p` first (see the sketch after this list)
- mv --symbolic-link. It was thought that mv and ln --relative separately give more control
- mv --swap. To swap two files, a shell script (perhaps provided by coreutils) would be best
- mv --safe to only remove source on completion. `cp ... && rm` was deemed sufficient
- mv --parents to recreate a hierarchy. Using cp -l --parents is deemed sufficient
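A sketch of the mkdir -p approach mentioned above; the path names are placeholders:

```
# create the destination directory first, then move into it
dest=backups/2024/logs
mkdir -p "$dest" && mv access.log "$dest/"
```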
rm
- rm --parents. Deleting the opposite way up the tree was deemed too dangerous
- rm -d. rmdir is equivalent and less confusing
- rm --no-preserve-root. Adding protective prompts would not significantly improve security
- rm -rf . to delete current directory. It was thought existing support for rm -rf "$PWD" suffices
- rm -rf . to delete all files inside current directory. `rm -rf * .[!.] .??*` and `find . -delete` were deemed sufficient (see the sketch after this list)
- rm -s to behave in a “smarter” fashion. `rm -I` or `find | xargs rm` were deemed sufficient
- rm --exclude to exclude file names. Existing tools like find(1) were thought sufficient
- rm should use remove(), leaving unlink() to the unlink command. We can't change such standardized functionality
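One spelling of the find-based alternative for emptying the current directory, assuming GNU find:

```
# delete everything below . (including dot files) but keep . itself
find . -mindepth 1 -delete
```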
shred
- shred --recursive. Deemed better to explicitly select (with find for example)
- shred -r. shred is of limited use with files anyway
sort
- sort --by-length. A “sort by line length” example was added to the info docs
- min,max commands. (the sort --range={} alternative seems useful though)
- sort -V auto ignores white-space. One can do that more generally with -b
- sort -I to sort IP addresses. It's debatable whether this is warranted
- SORT_BUFFER_SIZE=1234 sort. env vars can be useful when shared by many commands, but are best avoided
- sort to use /var/tmp by default. It was thought best to keep using /tmp as its tmp files are stateless
- sort fixed width fields. This is already supported with: sort -t$'\n' -k1.5,1.9 ...
- sort --header to exclude leading lines from the sort. sed, head, etc. were deemed sufficient (see the sketch after this list)
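A sketch of excluding a leading header line with head and tail; the file name is a placeholder:

```
# keep the first line in place and sort the rest
{ head -n 1 data.txt; tail -n +2 data.txt | sort; } > sorted.txt
```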
stat
- stat --list-fstypes. The internally supported file system IDs were thought best not exposed
- stat --files0-from=FILE. This is only needed for commands needing to process all arguments in a single invocation
- stat --digest-type=WORD. It was thought better to use the existing checksum utils and join the file names etc. separately
- stat --quoting-style=WORD. Adjustments to --format='%N' were thought more appropriate
*sum
- md5sum --threads. The UNIX toolkit already handles processing files in parallel (see the sketch after this list)
- md5sum --base32. There was little interest in this Internet Archive specific functionality
- configurable md5sum buffer size. It was thought better to use NFS parameters to minimize network latency, or the stdbuf utility to control the buffering more generally
- sha1sum --raw | base64. `openssl dgst -sha1 -binary $file | openssl enc -base64` was deemed available enough
- *sum --ignore-dirs. The use cases were seen as too limited
- md5sum --pipe to output checksum to file and data to stdout. tee suffices for this
- *sum --color to colorize the checksum to ease comparisons. This was thought more flexible to perform with separate tools
- *sum --no-filename to only output the checksum. Postprocessing the output was deemed sufficient
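For the --threads requests, a sketch of how the existing toolkit parallelises the work, assuming GNU xargs and nproc:

```
# hash files in parallel, a batch of 16 files per md5sum process
find . -type f -print0 | xargs -0 -n 16 -P "$(nproc)" md5sum
```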
touch
- touch -R. `find . -exec touch -am {} +` is more general
- touch --mode. Not deemed beneficial enough
- touch --verbose. This could not be implemented robustly. Also xargs --verbose or (set -x; touch *) are sufficient
- touch --create to only create files. `test -e file || touch file` was deemed sufficient
uniq
- uniq --unsorted. This would add a lot of complexity that's already contained within sort (a common awk idiom, shown after this list, also covers this)
- uniq --ignore-last-fields. `rev | uniq -f | rev` was deemed sufficient
- uniq --accumulate. Adding values was thought too specific for coreutils and is available elsewhere
- uniq --check-fields=N to only check N fields. uniq --key would be a more general solution
- uniq -c --total. piping to awk '{t+=$1}END{print t,"total"}1' was deemed sufficient
- uniq --regex to use a regular expression to match lines. Existing tools used in a DSU (decorate-sort-undecorate) pattern are more general
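For unsorted input, a common awk idiom (not a coreutils feature) keeps the first occurrence of each line in input order:

```
# print each distinct line the first time it is seen
awk '!seen[$0]++' file.txt
```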
wc
- wc --tab-width. Preprocessing with expand is more functional
- wc -q to suppress the file name. Redirecting file to stdin is sufficient
- wc --max-chars=N to filter out long lines. Existing filters like awk 'length($0) <= 3' were deemed more appropriate
misc
- tr -0. Can do the same thing with marginally more tr syntax
- --at options for commands. This functionality was not deemed required for shell
- command --examples. This would need to be accepted into the GNU Coding Standards first
- sleep --random. The existing tools to achieve this were deemed sufficient
- remove chown,cpio user: shortcut. This was deemed useful and so retained
- mktemp -tp. It was thought better to create a fifo in a temp dir, rather than a temp fifo directly
- mktemp --fifo. This was not deemed warranted
- hostname -b. Setting a default hostname is too platform dependent
- truncate -s +50%. Percentage calculation was thought best handled outside of truncate (see the sketch at the end of this list)
- fold --indent. `fmt -t | sed 's/^ / /'` was deemed sufficient
- fold --prefix to add a prefix to each line. fmt and/or sed are deemed sufficient
- pr --fold to wrap lines. fold or fmt can do this before processing by pr
- users --all to show even non logged in users. The system interfaces aren't general enough to support this
- users -h to show help. --h{,elp} was deemed sufficient
- groups -0 to support group names with spaces etc. Instead the more standard id -Gnz is provided
- BLOCK_SIZE={binary,decimal} to add 'iB' and 'B' suffixes to various “human” numbers. numfmt can fill this role
- test -ed. Using stat in a shell function was deemed sufficient
- tac -z. tac -s $'\0' is equivalent
- mkdir --reference to copy permissions. Using umasks and ACLs was deemed sufficient
- mkdir -m ... --parents-mode to use the mode for all created dirs. This would not provide any functional benefit over using chmod
- chroot --before to determine UIDs outside the chroot. This is a bit specialized for inclusion
- uname --distro. lsb_release --id was deemed sufficient
- uname -i and -p should infer hardware info. It was thought better to just use the provided syscall info
- split --balanced to balance lines across the last two buckets. split -nl/$num supports this better
- rmdir -r. rm -r was deemed sufficient
- rmdir --one-file-system. `rm` or `find` are more suitable for such edge-cases
- realpath -t -b. Short options for existing long options were not seen as appropriate
- readlink -f output/trailing/slash/. It's easy to add the '/' in shell if needed
- expand --auto-tabs. It wasn't thought to give much benefit over just specifying --tabs
- seq --format support for general printf formats. Prefixing etc. is best done outside of seq
- head --read-all-input. It was thought adding more control to tee was a more general solution
- echo -- -e to terminate options in the common way. This would violate POSIX. Use printf instead
- expand --auto to auto determine tab stops. The operation would not be general enough
- csplit '@1' to split when field 1 changes. uniq --group | csplit --suppress-matched was thought to be better
- csplit --output=N to output only the Nth file. This was thought too specialized to support
- ln --absolute to force absolute symlinks. It's easy to get absolute paths with realpath or $PWD
- timeout setting a TIMEOUT env var. The use case is unusual and supported with explicitly setting vars with env etc.
- yes -n to not output a '\n'. yes whatever | tr -d '\n' was thought sufficient
- New environment variables are discouraged. Wrapper shell scripts and shell aliases are preferred. See $LS_ARGS, $SORT_BUFFER_SIZE, $HUMAN_B, rm; similarly, $GREP_OPTIONS and $GZIP
- Global configuration file /etc/gnu.conf. Aliases, shell functions and shell script wrappers are the recommended ways to modify default behaviour of coreutils programs
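For the truncate -s +50% request above, the percentage arithmetic is easy to do in the shell; a sketch assuming GNU stat, with a placeholder file name:

```
# grow a file by 50% of its current size
size=$(stat -c %s file.img)
truncate -s $(( size + size / 2 )) file.img
```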
New commands
- a path manipulation command. Existing tools deemed rich enough
- Add a sparse command. cp already supports creating sparse files
- Add a getlimits command. This would not be standard enough outside of a particular project
- quoted-printable. recode or perl can easily decode this format
- An errno utility. A full C wrapper around strerror() was deemed overkill. Maybe we'll add a script to contrib/
- where am i. `hostname; pwd` is fine
- tableize. This new command was thought better done as a --border option to column -t
- physmem. Programs to print mem info like `hwloc-info` and `free` are already available
- Provide '0' and '1' utils. These were not seen to benefit shell syntax
- cksum -a algo1 algo2 (like NetBSD) to do many checksums per read. It was thought more general to use separate processes
- rename command (from util-linux). There are existing commands to do this, and adjusting for inclusion in coreutils was thought to create too much flux
- testline command to expose bloom filter functionality. It was thought options to existing tools were more appropriate