Behaviour of parse-namestring (at least on sbcl)


I know that the rename-file standard function has a somewhat vague specification:

rename-file modifies the file system in such a way that the file indicated by filespec is renamed to defaulted-new-name.

I also know that there is a long-standing issue in sbcl (it uses POSIX rename, so it has the same limitations, especially when crossing filesystem boundaries).

However, there is a behaviour I find quite problematic: say I have a file called testfile.tmp and I want to rename it to newfile (without extension, i.e. with an empty pathname-type). Here I use the sbcl read syntax (but the result is the same with parse-namestring):

(rename-file #p"testfile.tmp" #p"newfile")
=>
#P"/home/mrclnz/newfile.tmp"
#P"/home/mrclnz/testfile.tmp"
#P"/home/mrclnz/newfile.tmp"

…I've checked the rename-file implementation and it does a merge-pathnames. The issue is that parse-namestring gives a nil type, which then gets merged with the type of the source. This of course is not good.

Using uiop:ensure-pathname I get an :unspecific type so the merge and the renames work fine. On the other hand the printed representation for both is the same, as in:

(parse-namestring "newfile") => #P"newfile"
(uiop:ensure-pathname "newfile") => #P"newfile"

but, looking inside I have:

(pathname-type (parse-namestring "newfile")) => NIL
(pathname-type (uiop:ensure-pathname "newfile")) => :UNSPECIFIC

(good luck with print/read roundtrips… maybe they should have made that unreadable?)

So, in the end, what's happening? Is all of this legal (as in, unspecified for parse-namestring; rename-file seems innocent to me, since everything using merge-pathnames is affected)? Should I simply avoid parse-namestring as broken? Probably the sbcl people didn't think about pathnames without a type (I checked: even sb-ext:native-pathname uses nil as the type).

What's the idiomatic way to handle this?


Solution

  • What coredump wrote is correct, but to answer your specific questions:

    • yes, this is conforming behaviour: rename-file is documented to merge the second pathname using the first as the default, so it is doing the right thing here;
    • SBCL's parse-namestring is fine: parse-namestring's job is just to return some kind of pathname, and it's doing that;
    • (and #P is not SBCL-specific by the way, it's just a way of calling parse-namestring at read-time).

    As coredump says, nil and :unspecific are printed identically when printing pathnames: the difference between them is precisely that nil means 'this field is not filled but could be' and :unspecific means 'this is not present and should not be filled', see here.
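A minimal sketch of that difference, assuming an implementation (such as SBCL) that accepts :unspecific as a pathname type. The variable names are made up for illustration:

```lisp
;; Hypothetical illustration: the same name, once with a NIL type and once
;; with an :UNSPECIFIC type, merged against defaults that carry a type.
(defparameter *defaults* (make-pathname :name "testfile" :type "tmp"))

(defparameter *open-type*
  (merge-pathnames (make-pathname :name "newfile" :type nil) *defaults*))

(defparameter *closed-type*
  (merge-pathnames (make-pathname :name "newfile" :type :unspecific) *defaults*))

;; The NIL type should have been filled in from the defaults,
;; while the :UNSPECIFIC type should have been left alone.
(list (pathname-type *open-type*)
      (pathname-type *closed-type*))
```

This is the same merge that rename-file performs on its second argument, which is why the type of the source file leaks into the new name when that argument's type is nil.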

    And in fact the behaviour that SBCL (and, at least, LispWorks and Clozure) has, where (pathname-type (parse-namestring "x")) is nil, is generally what you want, because fairly often you want to merge pathnames like this to add extensions to them. Certainly I do that quite a lot! I can't see anything that forbids it from returning :unspecific, but I would rather it did not.
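As a sketch of that kind of merge (the function name add-default-type is hypothetical, and the example assumes parse-namestring leaves the type nil for a name with no dot, as described above):

```lisp
;; Hypothetical helper: supply a type only when the name has none.
;; Relies on parse-namestring returning a NIL type for "x".
(defun add-default-type (name type)
  "Return NAME as a pathname, using TYPE only if NAME has no type of its own."
  (merge-pathnames (parse-namestring name)
                   (make-pathname :type type)))

(add-default-type "x" "lisp")      ; the missing type is filled in
(add-default-type "x.txt" "lisp")  ; the existing type is kept
```

If parse-namestring returned :unspecific for "x", the first call would not pick up the "lisp" type, which is the reason for preferring nil here.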

    So the underlying problem here is that the CL pathname system is just somewhat deficient, and was designed with support in mind for filesystems which no longer really exist: in particular, filesystems where file types were a real thing rather than a naming convention, and where all files had a type.

    The problem then is that you either need to assume (slightly nonportably, but only very slightly so) that :unspecific is allowed as a pathname type (or as any other component where it may be allowed), or somehow persuade the system to hand you a name with a type of :unspecific, if it's willing to, which you can then use (if it won't hand you such a name then presumably you're working on a filesystem where types are mandatory anyway!).

    I can't think of a way of portably doing the second thing. Probably in practice the right approach is then to use some portability layer like UIOP to solve the problem, but some kind of minimalist answer would be a function like:

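    (defun ensure-unmergable-type (p)
      ;; Fill a NIL type with :UNSPECIFIC so that later merges leave it alone;
      ;; a type that is already set is kept.
      (merge-pathnames (pathname p)
                       (load-time-value
                        (make-pathname :type ':unspecific))))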
    

    Again, this is slightly non-portable and could in theory signal an error at load time, but in practice it will probably work on anything you're likely to use.
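For the rename in the question, a usage sketch might look like the following. The wrapper name rename-file-without-type-merge is hypothetical, and the same caveat applies: it assumes the implementation accepts :unspecific as a type.

```lisp
;; ENSURE-UNMERGABLE-TYPE is repeated from above so the snippet stands alone.
(defun ensure-unmergable-type (p)
  (merge-pathnames (pathname p)
                   (load-time-value
                    (make-pathname :type ':unspecific))))

;; Hypothetical wrapper: rename FILE to NEW-NAME without letting NEW-NAME
;; inherit FILE's type when it has none of its own. Other components
;; (directory and so on) are still merged by rename-file as usual.
(defun rename-file-without-type-merge (file new-name)
  (rename-file file (ensure-unmergable-type new-name)))

;; Intended use, with the file names from the question:
;; (rename-file-without-type-merge "testfile.tmp" "newfile")
```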