Sanitizing PATH in case of duplicates


I have set

typeset -aU path

which helps avoid duplicates in my PATH. A duplicate is not added if I append a directory to my PATH through the array:

path+=~/bin
path+=~/foo
path+=~/bin

After this, my PATH contains only one copy of my bin directory.

If I instead put a directory at the start of PATH (rarely done, but sometimes necessary), e.g.

PATH=~/bin:$PATH

I will end up with an additional copy of my bin directory in my PATH. Is it possible to also automatically remove duplicate directories in this case?

I can force this manually with the help of an auxiliary array, i.e.

temppath=($path)
path=($temppath)

but I wonder if there is a simpler way to do this.


Solution

  • The -U (unique) attribute can be set separately for each variable in a pair of tied parameters such as path and PATH. The two names act as distinct interfaces to the same data, and duplicates are removed on assignment only if the name being assigned to carries -U.

    By setting -U for both path and PATH, the shell should remove duplicates no matter how entries are added:

    typeset -U PATH path
    

    This is all that's absolutely necessary, since by default the -T (tied) and -a (array) attributes for path/PATH are already set. For a new variable that behaves similarly, the declaration could look like this:

    typeset -aUT LD_LIBRARY_PATH ld_library_path
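
    As a hedged illustration of such a user-defined pair, the sketch below ties a scalar to an array and shows that entries stay unique when they are added through the array. MY_LIB_PATH, my_lib_path and the /opt directories are made-up names, and the attributes are applied in two steps here, which should be equivalent to the combined form above:

    ```shell
    # zsh; MY_LIB_PATH, my_lib_path and the directories are hypothetical.
    typeset -T MY_LIB_PATH my_lib_path      # tie the scalar to the array
    typeset -U MY_LIB_PATH my_lib_path      # keep entries unique on both names

    my_lib_path=(/opt/a/lib /opt/b/lib)
    my_lib_path+=(/opt/a/lib)               # already present, so not added again
    my_lib_path=(/opt/c/lib $my_lib_path)   # prepend through the array
    print -r -- $MY_LIB_PATH
    # prints /opt/c/lib:/opt/a/lib:/opt/b/lib
    ```

    The scalar always reflects the array joined with colons, so either name can be read afterwards.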
    

    Testing:

    => typeset -aUT PATH path
    => PATH=/usr/bin
    => path+=~/bin
    => typeset -p path
    typeset -aUT PATH path=( /usr/bin /Users/me/bin )
    => path+=~/foo
    => path+=~/bin
    => typeset -p path
    typeset -aUT PATH path=( /usr/bin /Users/me/bin /Users/me/foo )
    => PATH=~/bin:$PATH
    => typeset -p path
    typeset -aUT PATH path=( /Users/me/bin /usr/bin /Users/me/foo )
    => path=(~/foo $path)
    => typeset -p path
    typeset -aUT PATH path=( /Users/me/foo /Users/me/bin /usr/bin )
    

    With -U set differently for PATH and path:

    => path=()
    => typeset -U path; typeset +U PATH
    => typeset -p PATH path
    typeset -T PATH path=(  )
    typeset -aUT PATH path=(  )
    => PATH=/foo:/foo:/foo
    => typeset -p PATH path
    typeset -T PATH path=( /foo /foo /foo )
    typeset -aUT PATH path=( /foo /foo /foo )
    => path+=/bar; path+=/bar; path+=/bar
    => typeset -p PATH path
    typeset -T PATH path=( /foo /bar )
    typeset -aUT PATH path=( /foo /bar )
    => PATH=/foo:$PATH; PATH=/foo:$PATH;
    => typeset -p PATH path
    typeset -T PATH path=( /foo /foo /foo /bar )
    typeset -aUT PATH path=( /foo /foo /foo /bar )
    
    => path=()
    => typeset +U path; typeset -U PATH
    => typeset -p PATH path
    typeset -UT PATH path=(  )
    typeset -aT PATH path=(  )
    => path+=/alice; path+=/alice; path+=/alice
    => typeset -p PATH path
    typeset -UT PATH path=( /alice /alice /alice )
    typeset -aT PATH path=( /alice /alice /alice )
    => PATH=$PATH:/bob; PATH=$PATH:/bob; PATH=$PATH:/bob; 
    => typeset -p PATH path
    typeset -UT PATH path=( /alice /bob )
    typeset -aT PATH path=( /alice /bob )
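
    Since the attribute of the name being assigned to decides the outcome, one way to prepend without thinking about it is to always go through the array. A minimal sketch, assuming a helper called path_prepend (a hypothetical name, not something zsh provides):

    ```shell
    # zsh; path_prepend is a hypothetical helper, not part of zsh.
    typeset -U path PATH

    path_prepend() {
      # -U keeps the first occurrence of each entry, so the new directories
      # move to the front and any later copies are dropped.
      path=("$@" $path)
    }

    path_prepend ~/bin
    path_prepend ~/bin    # no second copy is added
    print -rl -- $path
    ```

    With -U set on both names, PATH=~/bin:$PATH would give the same result. The helper only saves remembering which interface is safe.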
    

    Bonus - path=($^~path(/N)) can be used to remove non-existent directories and invalid entries from path:

    => typeset -p path   
    typeset -aUT PATH path=( /Users/me/foo /Users/me/bin /etc/hosts /usr/bin )
    => path=($^~path(/N))
    => typeset -p path   
    typeset -aUT PATH path=( /Users/me/bin /usr/bin )
    
    • ${^...} - turns on RC_EXPAND_PARAM for this expansion, which works like brace expansion. Each element in the array is combined with the surrounding text, e.g.: /Users/me/foo(/N) /Users/me/bin(/N) /etc/hosts(/N) /usr/bin(/N).
    • ${~...} - turns on GLOB_SUBST, so the resulting strings are used as glob patterns.
    • (/) - only include directories in the glob expansions. A symbolic link to a directory does not match; (-/) also accepts those.
    • (N) - allow empty glob results. So:
      /Users/me/foo(/N) expands to nothing / null since it does not exist.
      /Users/me/bin(/N) becomes /Users/me/bin.
      /etc/hosts(/N) is null because it's a file.
      /usr/bin(/N) becomes /usr/bin.
    • path=(...) - resets path to whatever remains from the expansion. If -U is set for path, then only unique values are added.
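
    The expansion can be tried on a scratch array before it is applied to path. In this sketch, scratch and candidates are placeholder names, and the files are created only for the demonstration:

    ```shell
    # zsh; scratch and candidates are placeholder names.
    scratch=$(mktemp -d)
    mkdir $scratch/bin            # an existing directory
    touch $scratch/hosts          # an existing regular file
    candidates=($scratch/bin $scratch/missing $scratch/hosts)

    # Each element gets "(/N)" appended and is then expanded as a glob.
    candidates=($^~candidates(/N))
    print -rl -- $candidates      # only $scratch/bin is left
    rm -r $scratch
    ```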

    macOS sometimes adds pointless entries via /etc/paths.d; this is a way to clean them out.