
Why does Perl's glob() function always return a file name when given a string with no globbing characters?


I passed a list of glob patterns and one plain string to Perl's glob function. The patterns were expanded as expected, but the plain string is always returned, whether or not a matching file exists. For example:

$ ls
foo
$ perl -le '@files=glob("*bar"); print @files' ## prints nothing, as expected
$ perl -le '@files=glob("bar"); print @files'
bar

As you can see above, the second example prints bar even though no such file exists.

My first thought was that it behaves like the shell: when no expansion is available, a glob (or something being treated as a glob) expands to itself. For example, in csh (awful as it is, this is what Perl's glob() function seems to follow; see the quote below):

% foreach n (*bar*)
foreach: No match.

% foreach n (bar)
foreach? echo $n
foreach? end
bar                     ## prints the string

However, according to the docs, glob should return filename expansions:

In list context, returns a (possibly empty) list of filename expansions on the value of EXPR such as the standard Unix shell /bin/csh would do.

So why is it returning itself when there are no globbing characters in the argument passed to glob? Is this a bug or am I doing something wrong?


Solution

  • I guess I expected Perl to be checking for file existence in the background.

    Perl is checking for file existence:

    $ strace perl -e'glob "foo"' 2>&1 | grep foo
    execve("/home/mcarey/perl5/perlbrew/perls/5.24.0-debug/bin/perl", ["perl", "-eglob \"foo\""], [/* 39 vars */]) = 0
    lstat("foo", {st_mode=S_IFREG|0664, st_size=0, ...}) = 0
    

    So why is it returning itself when there are no globbing characters in the argument passed to glob?

    Because that's what csh does. Perl's implementation of glob is based on glob(3) with the GLOB_NOMAGIC flag enabled:

    GLOB_NOMAGIC

    Is the same as GLOB_NOCHECK but it only appends the pattern if it does not contain any of the special characters *, ? or [. GLOB_NOMAGIC is provided to simplify implementing the historic csh(1) globbing behavior and should probably not be used anywhere else.

    GLOB_NOCHECK

    If pattern does not match any pathname, then glob() returns a list consisting of only pattern...
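The effect of these two flags can be observed from Perl itself. The following is a minimal sketch, assuming the core File::Glob module and its bsd_glob function, which takes the flags as an optional second argument; the scratch directory and variable names are only illustrative. It uses an empty temporary directory so that no pattern can match a real file.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob GLOB_NOCHECK GLOB_NOMAGIC);
use File::Temp qw(tempdir);

# Empty scratch directory: nothing inside it can match any pattern.
my $dir = tempdir(CLEANUP => 1);

# Flags 0: plain matching, so an unmatched pattern yields an empty list.
my @plain = bsd_glob("$dir/bar", 0);

# GLOB_NOMAGIC: an unmatched pattern without *, ? or [ is returned as-is...
my @nomagic_literal = bsd_glob("$dir/bar", GLOB_NOMAGIC);

# ...but an unmatched pattern that contains a wildcard still yields nothing.
my @nomagic_wild = bsd_glob("$dir/*bar", GLOB_NOMAGIC);

# GLOB_NOCHECK: any unmatched pattern is returned, wildcard or not.
my @nocheck_wild = bsd_glob("$dir/*bar", GLOB_NOCHECK);

printf "flags 0, 'bar':       %d result(s)\n", scalar @plain;
printf "NOMAGIC, 'bar':       %d result(s)\n", scalar @nomagic_literal;
printf "NOMAGIC, '*bar':      %d result(s)\n", scalar @nomagic_wild;
printf "NOCHECK, '*bar':      %d result(s)\n", scalar @nocheck_wild;
```

Only the GLOB_NOMAGIC call with the wildcard-free pattern and the GLOB_NOCHECK call return something here, and in both cases what comes back is the pattern itself rather than the name of an existing file.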

    So, for a pattern like foo with no wildcards:

    • if a matching file exists, the filename expansion (foo) is returned
    • if no matching file exists, the pattern (foo) is returned

    Since the filename expansion is the same as the pattern,

    glob 'foo'
    

    in list context will always return a list with the single element foo, whether the file foo exists or not.
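If the goal is a list of files that actually exist, one possible workaround is to filter glob's results with a file test. This is only a sketch under that assumption: existing_files is a hypothetical helper name, and the temporary directory stands in for the directory from the question, which contains foo but not bar.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory containing a single empty file named foo.
my $dir = tempdir(CLEANUP => 1);
open my $fh, '>', "$dir/foo" or die "cannot create $dir/foo: $!";
close $fh;

# Hypothetical helper: expand each pattern, then keep only the names
# that exist on disk. Note that glob() splits its argument on
# whitespace, so this sketch assumes paths without spaces.
sub existing_files {
    my @patterns = @_;
    return grep { -e $_ } map { glob($_) } @patterns;
}

# "bar" has no wildcards, so glob() hands it back unchanged even though
# the file is missing; the -e test then discards it.
my @found = existing_files("$dir/foo", "$dir/bar", "$dir/*bar");

print "$_\n" for @found;
```

With this filter in place, the wildcard-free name bar is dropped in the same way as the unmatched pattern *bar, and only foo remains.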