
Pathlib relative_to remote share edge case?


It seems that pathlib does not compute a sensible relative path for a file in the root of a remote share (e.g. //server/file.ext).

https://github.com/python/cpython/blob/3.10/Lib/pathlib.py#L789

Is there a way to check whether one server path is relative to another that is robust to files in the root of a network share?

Below are 3 examples that show what I would expect the "correct" behaviour to be, followed by a 4th example that seems incorrect.

This is correct / expected:

>>> path_a = pathlib.Path('C:\\bar\\myfile.pdf')
>>> path_b = pathlib.Path('C:\\bar')
>>> path_a.relative_to(path_b)
WindowsPath('myfile.pdf')

This is correct / expected:

>>> path_a = pathlib.Path('C:\\myfile.pdf')
>>> path_b = pathlib.Path('C:\\')
>>> path_a.relative_to(path_b)
WindowsPath('myfile.pdf')

This is correct / expected:

>>> path_a = pathlib.Path('//server01/bar/myfile.pdf')
>>> path_b = pathlib.Path('//server01/bar')
>>> path_a.relative_to(path_b)
WindowsPath('myfile.pdf')

This is unexpected:

>>> path_a = pathlib.Path('//server01/myfile.pdf')
>>> path_b = pathlib.Path('//server01')
>>> path_a.relative_to(path_b)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\John\AppData\Local\Programs\Python\Python39\lib\pathlib.py", line 928, in relative_to
    raise ValueError("{!r} is not in the subpath of {!r}"
ValueError: '\\\\server01\\myfile.pdf\\' is not in the subpath of '\\server01' OR one path is relative and the other is absolute.

Both are absolute paths, and path_a is definitely relative to path_b.


Solution

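  • A valid UNC path MUST contain two or more path components: the server name and the share name. So \\server01 on its own is not a valid UNC path, and in \\server01\myfile.pdf the component myfile.pdf is read as the share name, not as a file in the root of the server.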

    From the spec: https://learn.microsoft.com/en-us/openspecs/windows_protocols/ms-dfsc/149a3039-98ce-491a-9268-2f5ddef08192

    A UNC path can be used to access network resources, and MUST be in the format specified by the Universal Naming Convention.
    
    <servername>, <share> and <filename> are referred to as "pathname components" or "path components". A valid UNC path MUST contain two or more path components. <servername> is referred to as the "first pathname component", <share> as the "second pathname component", and so on. The last component of the path is also referred to as the "leaf component".
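
To see how this plays out in pathlib, here is a minimal sketch that prints the drive and the remaining parts for the two UNC paths from the question. It uses PureWindowsPath so it can run on any platform; the helper name split_unc is hypothetical and only for illustration.

```python
from pathlib import PureWindowsPath


def split_unc(path_str):
    """Return (drive, parts after the anchor) as pathlib parses a Windows path.

    Hypothetical helper for illustration; not part of pathlib.
    """
    p = PureWindowsPath(path_str)
    # p.anchor is drive + root. When it is non-empty it is the first entry
    # of p.parts, so everything after it is the path inside the share.
    rest = p.parts[1:] if p.anchor else p.parts
    return p.drive, rest


in_share = split_unc('//server01/bar/myfile.pdf')
in_root = split_unc('//server01/myfile.pdf')

print(in_share)  # ('\\\\server01\\bar', ('myfile.pdf',))
print(in_root)   # ('\\\\server01\\myfile.pdf', ())
```

In the second case the whole string is consumed by the drive (server plus share), so there is no component left that could be relative to //server01. That matches the trailing backslash after myfile.pdf in the error message above.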