
Basic and extended wildcard expansion combined


I'm trying to understand how extended wildcard expansion works, and I ran into the following problem.

In my current directory ~/t, I have two files called chair and table. In Bash, the command ls *(a-z)* lists both of them, but ls *(a-z) doesn't. Why? I have enabled extended globbing with shopt -s extglob.

I guess ls *(a-z) fails to list either file because in this case the * would expand to 0+ characters, as in basic wildcard expansion, and then the (a-z) wouldn't expand to anything. If (a-z) doesn't expand to anything, Bash will try to match it literally. Since I have no file in my current directory that ends in (a-z), the ls command fails in this case and lists nothing.

However, according to that reasoning, the expression *(a-z)* should match only a filename starting with any 0+ characters, followed by the literal "(a-z)", followed by 0+ characters. But that does not seem to be the case, because ls *(a-z)* does list my files chair and table.

~/t$ ls 
chair  table

~/t$ ls *(a-z)*
chair  table

~/t/$ ls *(a-z)
ls: cannot access '*(a-z)': No such file or directory

Solution

  • You have the extglob option enabled, which is why neither glob is a syntax error in the first place.

    *(a-z) matches zero or more occurrences of the literal string a-z, not zero or more characters in the range a-z. Since no file name consists only of repetitions of a-z, the pattern matches nothing and, with nullglob and failglob unset (the defaults), is passed literally to ls, which doesn't find a file by that name.

    *(a-z)* matches the same thing followed by an arbitrary string of characters, so it essentially behaves like * alone: *(a-z) matches zero occurrences and the trailing * matches the whole name. It expands to chair and table, which are passed as separate arguments to ls.
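
    To see both expansions side by side, here is a minimal sketch that recreates the directory from the question in a scratch location. The names demo_dir, no_match, all_files, and literal_match are made up for this illustration, and the extra file a-za-z is not part of the original question; it is added only to show what *(a-z) does match. The sketch assumes Bash defaults, with nullglob and failglob unset.

    ```shell
    #!/usr/bin/env bash
    # extglob must be enabled before bash parses a line that uses *( ... ).
    shopt -s extglob

    # Work in a throwaway directory so the real ~/t is never touched.
    demo_dir=$(mktemp -d)
    cd "$demo_dir" || exit 1
    touch chair table

    # No file name is a repetition of "a-z", so the pattern stays unexpanded.
    no_match=( *(a-z) )
    printf '%s\n' "${no_match[@]}"        # prints: *(a-z)

    # *(a-z) matches zero occurrences and the trailing * takes the whole name.
    all_files=( *(a-z)* )
    printf '%s\n' "${all_files[@]}"       # prints chair, then table

    # A file literally named "a-za-z" is the kind of name *(a-z) looks for.
    touch a-za-z
    literal_match=( *(a-z) )
    printf '%s\n' "${literal_match[@]}"   # prints: a-za-z

    # Clean up the scratch directory.
    cd / && rm -rf -- "$demo_dir"
    ```

    Capturing each expansion in an array makes it easy to inspect exactly which arguments ls would have received.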

    If you want to match files starting with 0 or more lowercase letters, use *([a-z])*. Ordinary globs have no way to repeat a particular set of characters; * repeats only arbitrary characters. Extended globs are equivalent in power to regular expressions, with *([a-z]) equivalent to the regular expression [a-z]* (matched against the whole name).
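
    The correspondence between *([a-z]) and [a-z]* can be checked with a small sketch like the one below. The helpers matches_glob and matches_regex and the sample strings are hypothetical, chosen only for this illustration. Forcing LC_ALL=C is an assumption made here so that [a-z] means exactly the ASCII lowercase letters in both the glob and the regex. A glob always has to match the whole string, so the regex is anchored with ^ and $ to compare like with like.

    ```shell
    #!/usr/bin/env bash
    shopt -s extglob
    # Assumption: C locale, so [a-z] is exactly the 26 ASCII lowercase letters.
    export LC_ALL=C

    # Hypothetical helper: succeeds if the whole string matches the extended glob.
    matches_glob() {
      case $1 in
        *([a-z])) return 0 ;;
        *)        return 1 ;;
      esac
    }

    # Hypothetical helper: the same test written as an anchored regular expression.
    matches_regex() {
      [[ $1 =~ ^[a-z]*$ ]]
    }

    results=()
    for name in chair table "" Chair chair2 a-z; do
      if matches_glob "$name"; then glob=yes; else glob=no; fi
      if matches_regex "$name"; then regex=yes; else regex=no; fi
      results+=( "$glob/$regex" )
      printf '%-8s glob=%-3s regex=%s\n' "'$name'" "$glob" "$regex"
    done
    ```

    For every sample string the two columns should agree, including the empty string, which both forms accept because zero repetitions are allowed.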

    Of course, *([a-z])* doesn't really make sense, because it will match anything by simply matching 0 lowercase characters followed by anything. Something more sensible would be to match files starting with one or more lowercase characters, namely +([a-z])*.
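
    The difference between the two patterns shows up as soon as the directory contains names that do not start with a lowercase letter. The following is a sketch under stated assumptions: the files 1chair and Desk are invented for the illustration and are not in the original ~/t, the array names are hypothetical, and LC_ALL=C is set so that [a-z] excludes uppercase letters.

    ```shell
    #!/usr/bin/env bash
    shopt -s extglob
    # Assumption: C locale, so [a-z] covers only ASCII lowercase letters.
    export LC_ALL=C

    # Scratch directory with two extra, made-up file names.
    demo_dir=$(mktemp -d)
    cd "$demo_dir" || exit 1
    touch chair table 1chair Desk

    # Zero leading lowercase letters is allowed, so every name matches.
    zero_or_more=( *([a-z])* )
    printf 'zero or more: %s\n' "${zero_or_more[*]}"

    # At least one leading lowercase letter is required here.
    one_or_more=( +([a-z])* )
    printf 'one or more:  %s\n' "${one_or_more[*]}"   # chair table

    # Clean up the scratch directory.
    cd / && rm -rf -- "$demo_dir"
    ```

    Only +([a-z])* filters out 1chair and Desk; *([a-z])* lists all four names, just as a bare * would.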