Tags: c++, winapi, c++17, boost-filesystem

std::filesystem "root_name" definition broken on windows


I have the feeling the C++ filesystem standard is broken on Windows. It is heavily based on Boost.Filesystem, and I just found a serious issue there which (likely) also exists in std::filesystem: https://github.com/boostorg/filesystem/issues/99

The essence is the definition of "root_name" and "root_directory":

root-name(optional): identifies the root on a filesystem with multiple roots (such as "C:" or "//myserver"). In case of ambiguity, the longest sequence of characters that forms a valid root-name is treated as the root-name. The standard library may define additional root-names besides the ones understood by the OS API.

root-directory(optional): a directory separator that, if present, marks this path as absolute. If it is missing (and the first element other than the root name is a file name), then the path is relative and requires another path as the starting location to resolve to a file name.

This requires e.g. "C:\foo\bar.txt" to be decomposed into:

  • root_name: "C:"
  • root_directory: "\" or "/" (does this even make sense?)
  • directory: "foo"
  • file_name: "bar.txt"
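For reference, here is a minimal sketch of how this decomposition can be inspected with std::filesystem itself. The `Parts` struct and the `decompose` and `check_decomposition` names are made up for illustration; only `root_name()`, `root_directory()` and `relative_path()` are standard members of `std::filesystem::path`. The results depend on the platform the code is compiled for, so the expectations are split with `_WIN32`.

```cpp
#include <cassert>
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper type: the pieces discussed above, as plain strings.
struct Parts {
    std::string root_name;
    std::string root_directory;
    std::string relative_path;
};

// Uses only standard accessors of std::filesystem::path.
Parts decompose(const fs::path& p) {
    return {p.root_name().string(),
            p.root_directory().string(),
            p.relative_path().string()};
}

// Platform-dependent expectations, written as assertions.
void check_decomposition() {
#ifdef _WIN32
    const Parts w = decompose("C:\\foo\\bar.txt");
    assert(w.root_name == "C:");
    assert(w.root_directory == "\\");
    assert(w.relative_path == "foo\\bar.txt");
    // Without the separator there is a root-name but no root-directory.
    assert(decompose("C:foo").root_directory.empty());
#else
    // POSIX has no drive letters: the whole string is a single file name.
    const Parts u = decompose("C:\\foo\\bar.txt");
    assert(u.root_name.empty());
    assert(u.root_directory.empty());
    assert(u.relative_path == "C:\\foo\\bar.txt");
#endif
}
```

Calling `check_decomposition()` from `main` runs the checks for whichever platform the program was built on.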

The problem now: the first part of this path ("C:") does not name the same location as the start of the original path ("C:\"). This comes from how Windows interprets the two forms:

  • "C:\" is the drive "C"
  • "C:" is the current working directory on the drive "C"

Minor: how should "\foo\bar.txt" be interpreted on Windows according to the above? You have a "root_directory" (which, strangely, is not a directory but a directory separator) but no "root_name", hence the path cannot be absolute, and so you don't have a "root_directory" either. Sigh.

So from this I feel that "root_name" and "root_directory" cannot be separated (on Windows). In "C:\foo" you'd have "C:\" and in "C:foo" you'd have "C:". Or, to keep the (strangely defined) "root_directory", you'd need to decompose "C:\foo" into "C:\", "\" and "foo", and then struggle with "C:foo": is that an absolute path? Actually it is: "the folder 'foo' in the current working directory on drive C", quite absolute, isn't it?

But well you could say "absolute==independent of current working dir", and then the "root_directory" makes sense: it would be "\" for "C:\foo" and empty for "C:foo".

So the question: is the standard wrong in defining "C:" as the "root_name" instead of "C:\" in paths like "C:\foo", or is it simply invalid usage to iterate over the components of a path and expect the prefix sums to be "valid"?


Solution

  • Your interpretation of the Windows filesystem is incorrect. The directory C:\ is the root directory of the "C" drive, not "the drive 'C'". This is distinct from C:, which is the current directory of the "C" drive. Just try using the Windows shell and see how C:<stuff> behaves compared to C:\<stuff>. Both access something on that drive, but each starts from a different directory.

    Think of it in these terms on Windows:

    • C: means "Go to the current directory of the C drive".
    • \ at the start of a path (after any root names) means "Go to the root directory of the current drive".
    • foo\ means "Go into the directory called 'foo' within whatever directory we are currently in".
    • bar.txt means "The file named 'bar.txt' in whatever directory we are currently in."

    Therefore, C:\foo\bar.txt means: Go to the current directory of the C drive, then go to the root directory of C, then go into the 'foo' directory of the root directory of C, then access the file 'bar.txt' in the 'foo' directory of the root directory of C.

    Similarly, C:foo\bar.txt means: Go to the current directory of the C drive, then go into the 'foo' directory of the current directory of C, then access the file 'bar.txt' in the 'foo' directory of the current directory of C.
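The step-by-step reading above can be sketched as a small resolver. This is only an illustrative model, not how Windows or any filesystem library is implemented: `DriveState`, `is_separator`, `resolve` and `check_resolution` are hypothetical names, only drive-letter root-names are handled (no UNC paths), and "." or ".." elements are not normalized.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <string>

// Hypothetical model of the state a shell keeps: one current drive and
// one current directory per drive (stored without the drive prefix).
struct DriveState {
    char current_drive;
    std::map<char, std::string> current_dirs;
};

bool is_separator(char c) { return c == '\\' || c == '/'; }

// Resolve a Windows-style path string to "<drive>:\<dirs>\<rest>".
std::string resolve(const std::string& path, const DriveState& state) {
    std::size_t pos = 0;
    char drive = state.current_drive;

    // Step 1: an optional root-name "X:" selects the drive.
    if (path.size() >= 2 &&
        std::isalpha(static_cast<unsigned char>(path[0])) && path[1] == ':') {
        drive = static_cast<char>(std::toupper(static_cast<unsigned char>(path[0])));
        pos = 2;
    }

    // Start in the current directory of that drive (root if unknown).
    const auto it = state.current_dirs.find(drive);
    std::string dir = (it != state.current_dirs.end()) ? it->second : std::string("\\");

    // Step 2: a separator right after the root-name jumps to the root directory.
    if (pos < path.size() && is_separator(path[pos])) {
        dir = "\\";
        ++pos;
    }

    // Step 3: the remaining elements are taken relative to that directory.
    std::string result = std::string(1, drive) + ":" + dir;
    const std::string rest = path.substr(pos);
    if (!rest.empty()) {
        if (result.back() != '\\') result += '\\';
        result += rest;
    }
    return result;
}

// The two examples from the text, plus a path without a root-name.
void check_resolution() {
    const DriveState st{'C', {{'C', "\\work"}, {'D', "\\data"}}};
    assert(resolve("C:\\foo\\bar.txt", st) == "C:\\foo\\bar.txt");
    assert(resolve("C:foo\\bar.txt", st) == "C:\\work\\foo\\bar.txt");
    assert(resolve("\\foo\\bar.txt", st) == "C:\\foo\\bar.txt");
}
```

In this model only the first of the three inputs gives the same result regardless of the contents of `DriveState`.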

    This is how Windows paths work. This is what it means to type those things in the Windows shell. And thus, this is how Boost/std filesystem paths were designed to work.

    > But well you could say "absolute==independent of current working dir"

    But that's not how std filesystem defines the concept of "absolute path":

    > Absolute path: A path that unambiguously identifies the location of a file without reference to an additional starting location. The elements of a path that determine if it is absolute are operating system dependent.

    So "relative" and "absolute" are implementation-dependent. On Windows, a path is not absolute unless it contains both a root-name and a root-directory. In the Windows filesystem implementation, path("\foo\bar.txt").is_absolute() will be false, because the path has a root-directory but no root-name.
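The Windows rule just stated can be written down as a rough classifier. This is a hand-written sketch for illustration, not the library's implementation: `PathKind`, `has_drive`, `classify`, `is_absolute_windows` and `check_classification` are hypothetical names, and only drive-letter root-names are recognized, so UNC paths such as "\\server\share" are outside its scope. On an actual Windows build you would simply call `std::filesystem::path::is_absolute()`.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical categories for a Windows-style path string.
enum class PathKind {
    Absolute,       // root-name and root-directory:  C:\foo
    DriveRelative,  // root-name only:                C:foo
    Rooted,         // root-directory only:           \foo
    Relative        // neither:                       foo
};

// True if the string starts with a drive-letter root-name such as "C:".
bool has_drive(const std::string& p) {
    return p.size() >= 2 &&
           std::isalpha(static_cast<unsigned char>(p[0])) && p[1] == ':';
}

PathKind classify(const std::string& p) {
    const bool drive = has_drive(p);
    const std::size_t i = drive ? 2 : 0;  // index just past the root-name
    const bool rooted = i < p.size() && (p[i] == '\\' || p[i] == '/');
    if (drive && rooted) return PathKind::Absolute;
    if (drive) return PathKind::DriveRelative;
    if (rooted) return PathKind::Rooted;
    return PathKind::Relative;
}

// Absolute only when both a root-name and a root-directory are present.
bool is_absolute_windows(const std::string& p) {
    return classify(p) == PathKind::Absolute;
}

void check_classification() {
    assert(is_absolute_windows("C:\\foo\\bar.txt"));
    assert(!is_absolute_windows("C:foo\\bar.txt"));
    assert(!is_absolute_windows("\\foo\\bar.txt"));
}
```

The three non-absolute categories each need a different piece of outside state to resolve: the current directory of the named drive, the current drive, or both.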