pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot #82805

Closed
inyeollee mannequin opened this issue Oct 28, 2019 · 7 comments
Labels
3.14 new features, bugs and security fixes stdlib Python modules in the Lib dir topic-pathlib type-bug An unexpected behavior, bug, or error

inyeollee mannequin commented Oct 28, 2019

BPO 38624
Nosy @pitrou

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-10-28.22:54:26.232>
labels = ['3.8', 'type-bug', 'library']
title = 'pathlib .suffix, .suffixes, .stem unexpected behavior for pathname with trailing dot'
updated_at = <Date 2019-10-29.01:59:29.319>
user = 'https://bugs.python.org/inyeollee'

bugs.python.org fields:

activity = <Date 2019-10-29.01:59:29.319>
actor = 'xtreak'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-10-28.22:54:26.232>
creator = 'inyeollee'
dependencies = []
files = []
hgrepos = []
issue_num = 38624
keywords = []
message_count = 1.0
messages = ['355600']
nosy_count = 2.0
nosy_names = ['pitrou', 'inyeollee']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue38624'
versions = ['Python 3.8']

inyeollee mannequin commented Oct 28, 2019

In Python 3.8, pathlib generally treats the dot between a path's stem and suffix as part of the suffix:

>>> a = pathlib.Path('foo.txt')
>>> a.stem, a.suffix
('foo', '.txt')
>>> a.with_suffix('')
PosixPath('foo')

However, if the pathname ends with a dot, pathlib treats the trailing dot as part of the stem, not part of the suffix:

>>> b = pathlib.Path('bar.')
>>> b.stem, b.suffix
('bar.', '')

This looks like a bug. It should return ('bar', '.').
There are a couple of unexpected behaviors related to this:

>>> pathlib.Path('foo.txt').with_suffix('.')
...
ValueError: Invalid suffix '.' <== Why not PosixPath('foo.') ?
>>> c = pathlib.Path('foo..')
>>> c.stem, c.suffix, c.suffixes
('foo..', '', [])

I think the above should return ('foo.', '.', ['.', '.']).

Tested with macOS 10.15 and Python 3.8. Python 3.7 behaves the same.
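
For reference, the results requested in this report can be written down as plain string handling that does not depend on the installed pathlib version. This is only a sketch: `expected_parts` is a hypothetical helper name, not a pathlib API, and it assumes that leading dots (as in `.bashrc`) never start a suffix.

```python
def expected_parts(name):
    """Return (stem, suffix, suffixes) under the semantics this report asks for.

    Hypothetical helper for illustration only; not part of pathlib.
    """
    # Assumption: leading dots belong to the stem (e.g. '.bashrc' has no suffix).
    trimmed = name.lstrip(".")
    i = trimmed.rfind(".")
    # Everything from the last dot onwards is the suffix, even a lone '.'.
    suffix = trimmed[i:] if i != -1 else ""
    stem = name[: len(name) - len(suffix)] if suffix else name
    # Each dot-separated piece after the first becomes one suffix.
    suffixes = ["." + ext for ext in trimmed.split(".")[1:]]
    return stem, suffix, suffixes


for example in ("foo.txt", "bar.", "foo..", ".bashrc"):
    print(example, expected_parts(example))
```

With this sketch, `'bar.'` splits into `('bar', '.')` and `'foo..'` into `('foo.', '.')` with suffixes `['.', '.']`, which are the values proposed above.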

@inyeollee inyeollee mannequin added 3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Oct 28, 2019
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
ketozhang commented Aug 14, 2022

While attempting to work on this bug, I found that a fix is not possible without first deciding whether a file name with a trailing dot meaningfully contains a suffix. Currently, Python implicitly treats a string with a trailing dot as an invalid suffix, but does not raise an error:

# Yes, obvious case
>>> Path("foo.tar").suffix
'.tar'

# Trailing dot is not a suffix
>>> Path("foo.tar.").suffix
''

# Consistent with before, but confusingly you can create one with `with_suffix()`
>>> Path("foo").with_suffix(".tar.")
PosixPath('foo.tar.')
>>> PosixPath('foo.tar.').with_suffix(".tar.").suffix
''

So, this isn't a bug if the rule is "a trailing dot is not a valid suffix". However, with_suffix() seemingly ignores this rule, since it accepts a string with a trailing dot. At the same time, p.with_suffix(s).suffix == s does not hold in general, since you can pass in compound suffixes.

My proposed solution (outside the scope of this issue) is to change with_suffix such that it raises an exception when passed a string with trailing dots.
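
The compound-suffix point can be illustrated with a short, hedged example. It uses `PurePosixPath` so that the result does not depend on the host platform; the variable names are only for illustration.

```python
from pathlib import PurePosixPath

p = PurePosixPath("foo")
s = ".tar.gz"  # a compound suffix

q = p.with_suffix(s)

# with_suffix() appends the whole string, but .suffix only reports the
# final component, so the round trip does not give back `s`.
print(q)           # foo.tar.gz
print(q.suffix)    # .gz
print(q.suffixes)  # ['.tar', '.gz']
```

So `with_suffix()` already accepts strings that `.suffix` will never return, independently of the trailing-dot question.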

@barneygale
Contributor

I agree with the original bug report - Path("foo.tar.").suffix should give you '.'. That would match os.path.splitext() behaviour.
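
As a quick check of the `os.path.splitext()` comparison, the following sketch prints how it splits a few names with trailing dots. The sample names are taken from this thread.

```python
import os.path

# splitext() treats a lone trailing dot as the extension.
for name in ("foo.tar", "foo.tar.", "bar.", "foo.."):
    root, ext = os.path.splitext(name)
    print(f"{name!r}: root={root!r}, ext={ext!r}")
```

For `'foo.tar.'` this yields the root `'foo.tar'` and the extension `'.'`, which is the result being proposed for `.stem` and `.suffix`.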

@ketozhang

@barneygale Good point. In that case, we should also have:

>>> Path("foo.tar.").suffixes
[".tar", "."]

We would also need to remove the restriction on with_suffix("."):

>>> Path("foo.tar").with_suffix(".")
PosixPath('foo.')

@barneygale
Contributor

More discussion in #100157, which I've closed as a duplicate.

@barneygale barneygale added 3.14 new features, bugs and security fixes and removed 3.8 (EOL) end of life labels May 8, 2024
barneygale added a commit to barneygale/cpython that referenced this issue May 11, 2024
pathlib now treats "`.`" as a valid file extension (suffix). This brings
it in line with `os.path.splitext()`.

In the (private) pathlib ABCs, we add a new `ParserBase.splitext()` method
that splits a path into a `(root, ext)` pair, like `os.path.splitext()`.
This method is called by `PurePathBase.stem`, `suffix`, etc. In a future
version of pathlib, we might make these base classes public, and so users
will be able to define their own `splitext()` method to control file
extension splitting.

In `pathlib.PurePath` we add optimised `stem`, `suffix` and `suffixes`
properties that don't use `splitext()`, which avoids computing the path
base name twice.
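
To make the role of a `splitext()`-style primitive more concrete, here is a minimal sketch of how `stem`, `suffix` and `suffixes` could be derived from it. It uses `posixpath.splitext()` as a stand-in for the private `ParserBase.splitext()`, and `name_parts` is a hypothetical helper name. This is not the CPython implementation, which the commit message says takes an optimised route in `PurePath`.

```python
import posixpath


def name_parts(path):
    """Derive (stem, suffix, suffixes) from a splitext()-style function.

    Illustrative only: posixpath.splitext() stands in for the private
    ParserBase.splitext() mentioned in the commit message.
    """
    name = posixpath.basename(path)
    stem, suffix = posixpath.splitext(name)

    # Peel extensions off one at a time to collect all suffixes.
    suffixes = []
    root, ext = stem, suffix
    while ext:
        suffixes.append(ext)
        root, ext = posixpath.splitext(root)
    suffixes.reverse()
    return stem, suffix, suffixes


print(name_parts("dir/foo.tar."))
print(name_parts("foo.."))
```

A user-defined `splitext()` with different rules would change all three results at once, which appears to be the motivation for routing them through a single method.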
@barneygale
Contributor

PR available: #118952

@barneygale barneygale self-assigned this May 11, 2024
barneygale added a commit to barneygale/cpython that referenced this issue May 19, 2024
barneygale added a commit that referenced this issue May 25, 2024
@barneygale
Contributor

Fixed in 3.14 / e418fc3 / #118952
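
Since the change landed in 3.14 only, code that must run on several Python versions may see either result. The sketch below shows one way to check which behaviour is in effect. The 3.14 cutoff comes from this thread; the helper name `trailing_dot_suffix` is hypothetical.

```python
import sys
from pathlib import PurePosixPath


def trailing_dot_suffix(name):
    """Return pathlib's .suffix for `name` on the running interpreter."""
    return PurePosixPath(name).suffix


# Assumption based on this issue: 3.14+ reports '.', earlier versions ''.
expected = "." if sys.version_info >= (3, 14) else ""

print(sys.version_info[:2], repr(trailing_dot_suffix("bar.")), repr(expected))
```

Code that needs the new result on older interpreters can fall back to `os.path.splitext()` on the file name, which already treats a trailing dot as the extension.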

estyxx pushed a commit to estyxx/cpython that referenced this issue Jul 17, 2024
…python#118952)