
zsh: Splitting a path that may not have an extension


In a zsh script, I want to split the filename component of a path into three pieces: the root, the . separator (which may be absent), and the extension. Another process is going to modify the pieces and join them back together.

Determining whether the input path has a . was a bit more complicated than expected. So far this is the best answer I've found:

split=( "${pth:r}" "${${pth#${pth:r}}:+.}" "${pth:e}" )

It uses zsh's r and e modifiers to get the root and extension; those parts work well. The more cryptic expansion for the middle component strips the root from the front of the original path and checks whether anything is left over. If nothing is left, the root and the path are identical, so there is no separator and the value is set to an empty string. Otherwise it is set to a period.
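
Spelled out one step at a time, the middle expansion does roughly the following. This is only an illustrative sketch: `explain_sep`, `root`, `rest`, and `sep` are names made up here, not part of the one-liner, and it needs zsh because of the `:r` modifier.

```zsh
# Illustrative helper (hypothetical name): shows the intermediate values
# behind "${${pth#${pth:r}}:+.}". Requires zsh for the :r modifier.
explain_sep() {
  local pth="$1"
  local root="${pth:r}"       # path with the final extension removed
  local rest="${pth#$root}"   # what remains after the root: ".ext", ".", or ""
  local sep="${rest:+.}"      # "." if rest is non-empty, otherwise ""
  printf 'root=[%s] rest=[%s] sep=[%s]\n' "$root" "$rest" "$sep"
}

explain_sep base.ext      # root=[base] rest=[.ext] sep=[.]
explain_sep endsindot.    # root=[endsindot] rest=[.] sep=[.]
explain_sep /a.b/c.d/e    # root=[/a.b/c.d/e] rest=[] sep=[]
```

With default options, zsh treats the expanded `$root` inside `${pth#...}` as a literal string rather than a pattern, which is why paths containing glob characters still split as expected.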

It seems there should be an easier option than a three-part nested substitution. Is there a flag or something simple that I'm missing, or an SO post that my searches aren't finding?


Code for testing:

#!/bin/zsh
testsplit() {
  local pth=$1
  local split=( "${pth:r}" "${${pth#${pth:r}}:+.}" "${pth:e}" )

  print "input:    [$pth]"
  print "split[${#split}]:" "[${(@)^split}]"

  # tests: does rejoined path match input; is second element '.' or empty
  [[ ${(j::)split} == $pth && ${split[2]:-.} == '.' ]]
}
typeset -a testpaths=(
  "base.ext"
  "endsindot."
  "no_ext"
  "/a.b/c.d/e"
  "/a.b/c.d/e."
  "/a.b/c.d/e.f"
  "/has/s p aces  /before . after"
  $"/*/'and?[sdf]\n>\t\tpat.tern[^r\`ty&]///*/notreal.d"
)
integer ecount=0
print '.'
for p in ${testpaths}; do
  testsplit "$p"
  (($?)) && ((++ecount)) && print "=== ERROR ==="
  print '.'
done
print "Error count: [$ecount]."
((ecount)) && print "=== ERRORS FOUND ==="

Output:

.
input:    [base.ext]
split[3]: [base] [.] [ext]
.
input:    [endsindot.]
split[3]: [endsindot] [.] []
.
input:    [no_ext]
split[3]: [no_ext] [] []
.
input:    [/a.b/c.d/e]
split[3]: [/a.b/c.d/e] [] []
.
input:    [/a.b/c.d/e.]
split[3]: [/a.b/c.d/e] [.] []
.
input:    [/a.b/c.d/e.f]
split[3]: [/a.b/c.d/e] [.] [f]
.
input:    [/has/s p aces  /before . after]
split[3]: [/has/s p aces  /before ] [.] [ after]
.
input:    [$/*/'and?[sdf]
>       pat.tern[^r`ty&]///*/notreal.d]
split[3]: [$/*/'and?[sdf]
>       pat.tern[^r`ty&]///*/notreal] [.] [d]
.
Error count: [0].

Solution

  • It seems there should be an easier option than a three-part nested substitution.

    Perhaps, but unfortunately, there really isn't. 🙂

    Here's how I would do it, but again, a nested substitution cannot be avoided:

    % split() { 
      local split=( "$1:r" "${${1#$1:r}[1]}" "$1:e" )
      print "${(q@)split}"
    }
    % split foo.orig.c 
    foo.orig . c
    % split dir.c/foo 
    dir.c/foo '' ''
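
    If a single expression is not a requirement, the nesting can be traded for a conditional. The sketch below is a different technique from the one above: it skips the `:r` and `:e` modifiers and instead tests whether the last path component contains a dot. `split3` is a hypothetical name, and it has only been reasoned through against paths like those in the question, so treat it as a starting point rather than a drop-in replacement.

```zsh
# Hypothetical helper (not from the original post): split a path into
# root, separator, and extension using a conditional instead of nesting.
split3() {
  local pth="$1" root sep ext
  local name="${pth##*/}"          # last path component; empty if pth ends in "/"
  if [[ "$name" == *.* ]]; then    # a dot in the last component means a separator exists
    sep=.
    root="${pth%.*}"               # everything before the last dot
    ext="${pth##*.}"               # everything after the last dot (may be empty)
  else
    sep=
    root="$pth"
    ext=
  fi
  printf '[%s] [%s] [%s]\n' "$root" "$sep" "$ext"
}

split3 foo.orig.c      # [foo.orig] [.] [c]
split3 dir.c/foo       # [dir.c/foo] [] []
split3 /a.b/c.d/e.     # [/a.b/c.d/e] [.] []
```

    The cost is several lines instead of one; the benefit is that each piece is computed by a plain prefix or suffix removal that is easy to read.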