How does Perl parse unquoted bare words? (barewords, identifiers)


Unquoted words seem to have a great many meanings in Perl.

print STDERR $msg;

$hash{key}

func( param => $arg )

my $x = str;

How does one determine the meaning of these?


Solution

  • The following chart shows how Perl resolves identifiers in order of descending priority.

    It also applies to identifiers chained by :: (or by the legacy ' separator, which the use v5.42 feature bundle disables) unless otherwise stated. I'll call these "qualified identifiers".

    1. Syntactically-defined meaning, when syntactically expected.

       sub foo { }          # «foo» («sub» is covered later)
       sub main::foo { }    # «main::foo» («sub» is covered later)
       method Class         # «Class» («method» is covered later)
       method Some::Class   # «Some::Class» («method» is covered later)
       $foo
       $main::foo
       //i
       =head
       <<FOO
       Class::
       Some::Class::
       LABEL:
      
    2. String literal, when followed by => or when it is the entirety of a hash index expression.

      This doesn't apply to qualified identifiers.

       my %h = ( a => 1 );
       $h{a}
      
    3. Variable name, when the entirety of the dereference expression.

       ${foo}
       ${main::foo}
      

      Note that using the name of a keyword, named operator or declared sub will result in an ambiguous use warning.

    4. Keyword.

       while (1) { }
       sub { }
       use
       __END__
      
    5. Sub call, when the name of a previously imported sub.

       use Time::HiRes qw( time );
       time
       main::time
      
    6. Invocation of a named list operator, named unary operator or named nullary operator.

       print $x, $y, $z;
       $c = chr $i;
       $t = time;
       $t = CORE::time;
      
    7. Label, when used as the operand for next, last, redo or goto.

      A qualified identifier treated as a label results in a compilation error since labels can't be qualified identifiers.

       next LABEL;
      
    8. Sub call or inlined constant, when the name of a previously declared sub or constant.

       sub foo { }
       foo                          # Calls sub «foo»
       main::foo                    # Calls sub «foo»
      
       sub bar;
       bar                          # Calls sub «bar»
      
       use constant FOO => 123;
       FOO                          # Replaced with the value of the constant.
      
    9. Indirect method call, when followed by one of: a possibly-qualified identifier, a possibly-qualified identifier suffixed with ::, a scalar (including an array element or hash element), or a block.

       method Class           # Calls method «method» («Class» is covered earlier)
       method Some::Class     # Calls method «method» («Some::Class» is covered earlier)
       method Class::         # Calls method «method» («Class» is covered earlier)
       method Some::Class::   # Calls method «method» («Some::Class» is covered earlier)
       method $o              # Calls method «method»
       method { $o }          # Calls method «method»
      
       Base::method Class     # Calls method «Base::method» («Class» is covered earlier)
      

      You can use the no indirect pragma to warn when code is parsed this way.

    10. Glob, when used as the operand for an operator expecting a file handle.

       open(FH, '>', $qfn) or die $!;      # Equivalent to open(*FH, ...) or ...;
       print FH "Hello, World!\n";         # Equivalent to print *FH ...;
       print main::FH "Hello, World!\n";   # Equivalent to print *main::FH ...;
      
    11. String literal, in the following situations:

      • When used as the invocant of a direct method call.

          Class->method(@args)         # Uses the string «Class» as the invocant.
          Some::Class->method(@args)   # Uses the string «Some::Class» as the invocant.
        
      • When used as the operand for unary minus.

          -foo
          -foo::bar
        
      • When used as an argument for a sub parameter with a prototype of *.

          sub myprint(*@);
          myprint(FH, "Hello, World\n");
          myprint(main::FH, "Hello, World\n");
        
    12. String literal. This is disallowed by use strict qw( subs );.
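
A few of these priorities can be checked directly. The following is a minimal sketch, not an exhaustive test: the sub name color and the bareword undeclared_word are hypothetical names chosen for illustration. It shows that the quoting in step 2 wins over a declared sub (step 8), and that the fallback in step 12 is rejected under strict subs.

```perl
use strict;
use warnings;

# Hypothetical sub whose name collides with the hash key used below.
sub color { return 'from_sub' }

my %h = ( color => 1 );          # Step 2: «color» is the string 'color', not a call.

my @keys      = keys %h;         # ('color')
my $by_string = $h{color};       # Step 2 again: looks up the key 'color'.
my $by_call   = $h{ color() };   # Explicit call: looks up the key 'from_sub' (absent).
my $bare_call = color;           # Step 8: calls the declared sub.

# Step 12: an unknown bareword is a compile error under strict subs.
# A string eval inherits the enclosing «use strict», so the error lands in $@.
eval q{ my $x = undeclared_word; 1 };
my $strict_error = $@;

# Without strict subs (and with warnings silenced), the same bareword
# is simply a string literal.
my $as_string = eval q{ no strict 'subs'; no warnings; undeclared_word };

print "keys:      @keys\n";                                  # keys:      color
print "by string: $by_string\n";                             # by string: 1
print "by call:   ", defined $by_call ? $by_call : 'undef', "\n";   # by call:   undef
print "bare call: $bare_call\n";                             # bare call: from_sub
print "no strict: $as_string\n";                             # no strict: undeclared_word
```

Writing color() or &color inside the braces is the usual way to force a sub call where the bareword would otherwise be quoted.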

    Hopefully, I didn't miss any.

    Thanks to @mosvy, @Grinnz and @stevesliva! Each has uncovered a few cases I had missed.


    CURRENTLY MISSING:

    • SUBNAME in sort SUBNAME.

    • BEGIN and similar. They sometimes act as a keyword, and sometimes as a declared sub.

    • Importing a sub named print doesn't follow the above steps.

      $ perl -M5.010 -e'
         use subs qw( time );
         eval { time; };
         say $@ =~ /Undefined sub/ ? "ok" : "bad";
      '
      ok
      
      $ perl -M5.010 -e'
         use subs qw( system );
         eval { system; };
         say $@ =~ /Undefined sub/ ? "ok" : "bad";
      '
      ok
      
      $ perl -M5.010 -e'
         use subs qw( print );
         eval { print; };
         say $@ =~ /Undefined sub/ ? "ok" : "bad";
      '
      bad
      

      I don't know what makes that one special, or whether there are others. I had guessed it's because print doesn't have a prototype, but system has no prototype either.
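
The prototype claim can be checked with the builtin prototype function, which accepts names of the form CORE::name. This is only a sketch that confirms the observation above; it does not explain why print behaves differently. One possibly relevant detail: perlfunc documents that prototype returns undef both for builtins that are not overridable and for builtins whose arguments can't be expressed by a prototype, so undef alone doesn't distinguish the two cases.

```perl
use strict;
use warnings;

# Look up the prototype of each builtin used in the three tests above.
# prototype() returns undef when the builtin has no usable prototype.
my %proto;
$proto{$_} = prototype("CORE::$_") for qw( time system print );

for my $name (qw( time system print )) {
    my $p = $proto{$name};
    printf "%-6s %s\n", $name,
        defined $p ? "prototype '$p'" : 'no prototype (undef)';
}
# time   prototype ''
# system no prototype (undef)
# print  no prototype (undef)
```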