Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pathlib] issues with Windows device paths #78079

Closed
eryksun opened this issue Jun 18, 2018 · 10 comments
Closed

[pathlib] issues with Windows device paths #78079

eryksun opened this issue Jun 18, 2018 · 10 comments
Labels
3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes OS-windows stdlib Python modules in the Lib dir topic-pathlib type-bug An unexpected behavior, bug, or error

Comments

@eryksun
Copy link
Contributor

eryksun commented Jun 18, 2018

BPO 33898
Nosy @pfmoore, @pitrou, @tjguk, @zware, @eryksun, @zooba
PRs
  • bpo-33898: Fix pathlib issues with Windows device paths #8671
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2018-06-18.22:40:29.969>
    labels = ['type-bug', '3.8', '3.9', '3.7', 'library', 'OS-windows']
    title = 'pathlib issues with Windows device paths'
    updated_at = <Date 2020-04-17.17:59:50.770>
    user = 'https://github.com/eryksun'

    bugs.python.org fields:

    activity = <Date 2020-04-17.17:59:50.770>
    actor = 'pitrou'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)', 'Windows']
    creation = <Date 2018-06-18.22:40:29.969>
    creator = 'eryksun'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 33898
    keywords = ['patch']
    message_count = 2.0
    messages = ['319921', '337405']
    nosy_count = 6.0
    nosy_names = ['paul.moore', 'pitrou', 'tim.golden', 'zach.ware', 'eryksun', 'steve.dower']
    pr_nums = ['8671']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue33898'
    versions = ['Python 3.7', 'Python 3.8', 'Python 3.9']

    Linked PRs

    @eryksun
    Copy link
    Contributor Author

    eryksun commented Jun 18, 2018

    For \\?\ extended device paths, pathlib removes the trailing slash for the root directory if the device isn't UNC or a logical (A-Z) drive.

    Correct:

        >>> str(Path('//?/UNC/'))
        '\\\\?\\UNC\\'
    
        >>> str(Path('//?/Z:/'))
        '\\\\?\\Z:\\'

    Incorrect:

        >>> str(Path('//?/BootPartition/'))
        '\\\\?\\BootPartition'
    
        >>> str(Path('//?/Volume{}/'))
        '\\\\?\\Volume{}'
    
        >>> str(Path('//?/Global/Z:/'))
        '\\\\?\\Global\\Z:'

    It keeps the trailing slash for some \\.\ paths, but not all.

    Correct:

        >>> str(Path('//./BootPartition/'))
        '\\\\.\\BootPartition\\'

    Incorrect:

        >>> str(Path('//./Global/Z:/'))
        '\\\\.\\Global\\Z:'

    It adds a root directory to \\.\ device paths where none was specified.

    Incorrect:

        >>> str(Path('//./nul'))
        '\\\\.\\nul\\'
    
        >>> str(Path('//./PhysicalDrive0'))
        '\\\\.\\PhysicalDrive0\\'
    
        >>> str(Path('//./C:'))
        '\\\\.\\C:\\'

    "\\\\.\\C:" refers to the volume device, whereas "\\\\.\\C:\\" is the root directory in the file system.

    pathlib should parse \\?\ and \\.\ device paths the same way with respect to the drive and root. The difference in practice is only how Windows does (\\.\) or does not (\\?\) canonicalize the path.

    Additionally, pathlib fails to identify the drive correctly in these cases.

    Incorrect:

        >>> Path('//?/Global/Z:/').drive
        '\\\\?\\'
    
        >>> Path('//?/BootPartition/Temp').drive
        '\\\\?\\'

    Except for "UNC" and "Global" paths, the drive should be the first component after the local-device prefix. The "UNC" device also includes subsequent server and share components, if any. For the reserved "Global" symlink, it should look to the next component. For example, r'\\?\Global\UNC\server\share' is a drive.

    There's also the "GlobalRoot" symlink (or "Global\\GlobalRoot" to be pedantic), but that can't be handled generically.

    @eryksun eryksun added 3.7 (EOL) end of life 3.8 (EOL) end of life stdlib Python modules in the Lib dir OS-windows type-bug An unexpected behavior, bug, or error labels Jun 18, 2018
    @zooba
    Copy link
    Member

    zooba commented Mar 7, 2019

    Eryk - I've got the PR ready as far as I'm concerned. Have you had a look to see if you're happy with its logic for these cases?

    @pitrou pitrou added the 3.9 only security fixes label Apr 17, 2020
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @eryksun eryksun changed the title pathlib issues with Windows device paths [pathlib] issues with Windows device paths Aug 8, 2022
    @barneygale
    Copy link
    Contributor

    barneygale commented Jan 29, 2023

    @eryksun I suppose that any remaining pathlib device path normalization issues are down to the if condition here:

    cpython/Lib/pathlib.py

    Lines 284 to 286 in 666c084

    if drv.startswith(sep):
    # pathlib assumes that UNC paths always have a root.
    root = sep

    Can you recommend a better condition? :)

    @zooba
    Copy link
    Member

    zooba commented Feb 2, 2023

    I think this is the issue that requires splitroot to be correct (which it might be by now?) and then pathlib has to be very deliberate about the meaning of a trailing slash in the drive part vs a leading slash in the rest of the path.

    The current PR doesn't seem to be easily resolvable, but is probably well out of date anyway because of other work that's been done in pathlib and ntpath.

    The existing test suite and the examples Eryk posted above are probably the best way to figure out the right semantics.

    @barneygale
    Copy link
    Contributor

    Based on Eryk's test cases, ntpath.splitroot() gets //?/Global and //./Global paths wrong:

    >>> ntpath.splitroot('//./Global/Z:/')
    ('//./Global', '/', 'Z:/')
    >>> ntpath.splitroot('//?/Global/Z:/')
    ('//?/Global', '/', 'Z:/')

    I believe that all the other failing test cases can be put down to pathlib erroneously adding roots to paths like //./C:, //./nul, //./PhysicalDrive0 and //?/UNC/.

    @zooba
    Copy link
    Member

    zooba commented Feb 2, 2023

    I believe that all the other failing test cases can be put down to pathlib erroneously adding roots to paths

    Yeah, this is what I expected with "deliberate about the meaning of ... leading slash in the rest of the path". I know we've encountered these consistency issues before, but I don't know whether they were all resolved.

    barneygale added a commit to barneygale/cpython that referenced this issue Feb 17, 2023
    We no longer add a root to device paths such as `//./PhysicalDrive0`,
    `//?/BootPartition` and `//./c:` while normalizing. We also avoid adding a
    root to incomplete UNC share paths, like `//`, `//a` and `//a/`.
    barneygale added a commit to barneygale/cpython that referenced this issue Feb 17, 2023
    We no longer add a root to device paths such as `//./PhysicalDrive0`,
    `//?/BootPartition` and `//./c:` while normalizing. We also avoid adding a
    root to incomplete UNC share paths, like `//`, `//a` and `//a/`.
    barneygale added a commit to barneygale/cpython that referenced this issue Feb 17, 2023
    We no longer add a root to device paths such as `//./PhysicalDrive0`,
    `//?/BootPartition` and `//./c:` while normalizing. We also avoid adding a
    root to incomplete UNC share paths, like `//`, `//a` and `//a/`.
    @barneygale
    Copy link
    Contributor

    I've opened a PR that fixes all of Eryk's test cases except //./Global and //?/Global: #102003

    barneygale added a commit to barneygale/cpython that referenced this issue Feb 20, 2023
    barneygale added a commit to barneygale/cpython that referenced this issue Mar 10, 2023
    thunze added a commit to thunze/diskfs that referenced this issue Mar 21, 2023
    This is done because `pathlib.Path` automatically adds a trailing
    backslash to Windows paths of the shape `\\.\PhysicalDriveN`.
    Trying to open such a path including the trailing backslash raises
    `PermissionError`. This issue is tracked at python/cpython#78079.
    barneygale added a commit to barneygale/cpython that referenced this issue Apr 3, 2023
    barneygale added a commit to barneygale/cpython that referenced this issue Apr 9, 2023
    barneygale added a commit to barneygale/cpython that referenced this issue Apr 11, 2023
    barneygale added a commit that referenced this issue Apr 14, 2023
    We no longer add a root to device paths such as `//./PhysicalDrive0`,
    `//?/BootPartition` and `//./c:` while normalizing. We also avoid adding a
    root to incomplete UNC share paths, like `//`, `//a` and `//a/`.
    
    Co-authored-by: Eryk Sun <[email protected]>
    @barneygale
    Copy link
    Contributor

    barneygale commented Apr 14, 2023

    Hey @eryksun, do you think we should leave this issue open to cover the remaining "Global" and "GLOBALROOT" issues? The problem there lies mostly in ntpath.splitroot(), so I might remove the expert-pathlib label, or even create a new issue.

    carljm added a commit to carljm/cpython that referenced this issue Apr 17, 2023
    * main:
      Remove `expert-*` from `project-updater` GH workflow (python#103579)
      pythongh-103583: Add codecs and maps to _codecs_* module state (python#103540)
      pythongh-48330: address review comments to PR-12271 (python#103209)
      pythongh-103527: Add multibytecodec.h as make dep for _codecs_* (python#103567)
      pythongh-103553: Improve `test_inspect`: add more assertions, remove unused (python#103554)
      pythonGH-103517: Improve tests for `pathlib.Path.walk()` (pythonGH-103518)
      pythongh-102114: Make dis print more concise tracebacks for syntax errors in str inputs (python#102115)
      pythonGH-78079: Fix UNC device path root normalization in pathlib (pythonGH-102003)
      pythongh-101517: Add regression test for a lineno bug in try/except* impacting pdb (python#103547)
      pythongh-103527: Add make deps for _codecs_* and _multibytecodec (python#103528)
      pythongh-103532: Fix reST syntax in NEWS entry (pythonGH-103544)
      pythongh-103532: Add NEWS entry (python#103542)
    carljm added a commit to carljm/cpython that referenced this issue Apr 17, 2023
    * superopt:
      update generated cases with new comment
      review comments
      Remove `expert-*` from `project-updater` GH workflow (python#103579)
      pythongh-103583: Add codecs and maps to _codecs_* module state (python#103540)
      pythongh-48330: address review comments to PR-12271 (python#103209)
      pythongh-103527: Add multibytecodec.h as make dep for _codecs_* (python#103567)
      pythongh-103553: Improve `test_inspect`: add more assertions, remove unused (python#103554)
      pythonGH-103517: Improve tests for `pathlib.Path.walk()` (pythonGH-103518)
      pythongh-102114: Make dis print more concise tracebacks for syntax errors in str inputs (python#102115)
      pythonGH-78079: Fix UNC device path root normalization in pathlib (pythonGH-102003)
      pythongh-101517: Add regression test for a lineno bug in try/except* impacting pdb (python#103547)
      pythongh-103527: Add make deps for _codecs_* and _multibytecodec (python#103528)
      pythongh-103532: Fix reST syntax in NEWS entry (pythonGH-103544)
      pythongh-103532: Add NEWS entry (python#103542)
    @zooba
    Copy link
    Member

    zooba commented Apr 17, 2023

    I vote for a new issue (with a more descriptive title), and would also say it's not a high priority. If you can describe it well enough, might even be able to justify tagging it "easy" and offer it up for someone at sprints.

    @barneygale
    Copy link
    Contributor

    Logged here:

    I don't understand the rules for "Global" and "GLOBALROOT" so it's not very well described :)

    Resolving this issue! Thanks all.

    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes OS-windows stdlib Python modules in the Lib dir topic-pathlib type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants