
Perl operator precedence for a combination of list and unary operators


I came across an odd case that I guess is related to operator precedence. Consider this test program:

use strict;
use warnings;
use Test::More;

my $fn = 'dummy';
ok( ! -e $fn, 'file does not exists' );
ok( not -e $fn, 'file does not exists' );
done_testing();

The output is:

ok 1 - file does not exists
not ok 2
#   Failed test at ./p.pl line 10.
1..2
# Looks like you failed 1 test of 2.

The question is: why does the second test fail? (Assume the file named by $fn does not exist.)

See also: List Operator Precedence in Perl.


After reading perlop, my guess is that at least five operators could be involved here:


Solution

  • Why does the second test fail?

    Because Perl's parser handles ! and not differently. You can see this in Perl's grammar, which is defined in perly.y in the Perl source.

    The rule for ! kicks in as soon as the parser encounters a ! followed by a term:

        |       '!' term                               /* !$x */
                        { $$ = newUNOP(OP_NOT, 0, scalar($2)); }
    

    The rule for not, on the other hand, kicks in only when the parser encounters not followed by a list expression (a list of terms joined by commas*):

        |       NOTOP listexpr                       /* not $foo */
                        { $$ = newUNOP(OP_NOT, 0, scalar($2)); }
    

    Note that the action for both rules is the same: add a new unary opcode of type OP_NOT to the parse tree. The operand is the second argument (term or listexpr) in scalar context.


    * Or a single term, but this has very low precedence.
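
    To see the two rules side by side without a test harness, here is a minimal sketch. The helper file_test() is hypothetical and only stands in for a file test that returns false; the array names are likewise made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: stands in for a file test such as -e $fn that returns false.
sub file_test { return 0 }

# '!' binds only to the next term, so the comma still separates two list elements.
my @with_bang = ( ! file_test(), 'bar' );

# 'not' takes the whole comma-separated list as its operand, so one element remains.
my @with_not = ( not file_test(), 'bar' );

printf "with !:   %d element(s)\n", scalar @with_bang;   # with !:   2 element(s)
printf "with not: %d element(s)\n", scalar @with_not;    # with not: 1 element(s)
```

    The single element left in @with_not is the negation of 'bar', which is a false value.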

    Tracing the parse

    You can see the above rules in action by compiling perl with -DDEBUGGING and running with -Dpv, which turns on debug flags for tokenizing and parsing.
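
    Many packaged perl binaries are not built with debugging support. As a rough heuristic, assuming the build recorded the define in its compiler flags, you can look for it in %Config before trying the -D switches. The variable name $has_debugging is made up for this sketch.

```perl
use strict;
use warnings;
use Config;   # core module exposing the build-time configuration

# Heuristic only: a perl configured with -DDEBUGGING normally lists
# that define in $Config{ccflags}.
my $has_debugging = ( $Config{ccflags} // '' ) =~ /-DDEBUGGING\b/ ? 1 : 0;

print $has_debugging
    ? "this perl appears to support -D switches\n"
    : "this perl does not appear to be a DEBUGGING build\n";
```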

    Here's what the parser does with !:

    $ perl -Dpv -e'ok(! -e "foo", "bar")'
    ...
    
    Next token is token '(' (0x1966e98)
    Shifting token '(', Entering state 185
    Reading a token:
    Next token is token '!' (0x1966e98)
    Shifting token '!', Entering state 49
    Reading a token:
    Next token is token UNIOP (0x110)
    Shifting token UNIOP, Entering state 39
    Reading a token:
    Next token is token THING (0x1966e58)
    Shifting token THING, Entering state 25
    
    index:        2        3        4        5        6        7        8        9
    state:        8       15      103       68      185       49       39       25
    token:       @1 remember  stmtseq    amper      '('      '!'    UNIOP    THING
    value:        0       22 (Nullop)    rv2cv 26635928 26635928      272    const
    
    Reducing stack by rule 184 (line 961), THING -> term
    Entering state 128
    Reading a token:
    Next token is token ',' (0x1966e58)
    
    index:        2        3        4        5        6        7        8        9
    state:        8       15      103       68      185       49       39      128
    token:       @1 remember  stmtseq    amper      '('      '!'    UNIOP     term
    value:        0       22 (Nullop)    rv2cv 26635928 26635928      272    const
    
    Reducing stack by rule 199 (line 999), UNIOP term -> term
    Entering state 150
    Next token is token ',' (0x1966e58)
    
    index:        1        2        3        4        5        6        7        8
    state:        1        8       15      103       68      185       49      150
    token: GRAMPROG       @1 remember  stmtseq    amper      '('      '!'     term
    value:        0        0       22 (Nullop)    rv2cv 26635928 26635928     ftis
    
    Reducing stack by rule 148 (line 829), '!' term -> termunop
    Entering state 62
    
    index:        1        2        3        4        5        6        7
    state:        1        8       15      103       68      185       62
    token: GRAMPROG       @1 remember  stmtseq    amper      '(' termunop
    value:        0        0       22 (Nullop)    rv2cv 26635928      not
    
    ...
    

    In other words, the parser reads in

    ( ! -e "foo"
    

    reduces -e "foo" to a term, and then adds a logical negation opcode to the parse tree. The operand is -e "foo" in scalar context.


    Here's what the parser does with not:

    $ perl -Dpv -e'ok(not -e "foo", "bar")'
    ...
    
    Reading a token:
    Next token is token '(' (0x26afed8)
    Shifting token '(', Entering state 185
    Reading a token:
    Next token is token NOTOP (0x26afed8)
    Shifting token NOTOP, Entering state 48
    Reading a token:
    Next token is token UNIOP (0x110)
    Shifting token UNIOP, Entering state 39
    Reading a token:
    Next token is token THING (0x26afe98)
    Shifting token THING, Entering state 25
    
    index:        2        3        4        5        6        7        8        9
    state:        8       15      103       68      185       48       39       25
    token:       @1 remember  stmtseq    amper      '('    NOTOP    UNIOP    THING
    value:        0       22 (Nullop)    rv2cv 40566488 40566488      272    const
    
    Reducing stack by rule 184 (line 961), THING -> term
    Entering state 128
    Reading a token:
    Next token is token ',' (0x26afe98)
    
    index:        2        3        4        5        6        7        8        9
    state:        8       15      103       68      185       48       39      128
    token:       @1 remember  stmtseq    amper      '('    NOTOP    UNIOP     term
    value:        0       22 (Nullop)    rv2cv 40566488 40566488      272    const
    
    Reducing stack by rule 199 (line 999), UNIOP term -> term
    Entering state 65
    Next token is token ',' (0x26afe98)
    
    index:        1        2        3        4        5        6        7        8
    state:        1        8       15      103       68      185       48       65
    token: GRAMPROG       @1 remember  stmtseq    amper      '('    NOTOP     term
    value:        0        0       22 (Nullop)    rv2cv 40566488 40566488     ftis
    
    Reducing stack by rule 105 (line 683), term -> listexpr
    Entering state 149
    Next token is token ',' (0x26afe98)
    Shifting token ',', Entering state 162
    Reading a token:
    Next token is token THING (0x26afdd8)
    Shifting token THING, Entering state 25
    
    index:        3        4        5        6        7        8        9       10
    state:       15      103       68      185       48      149      162       25
    token: remember  stmtseq    amper      '('    NOTOP listexpr      ','    THING
    value:       22 (Nullop)    rv2cv 40566488 40566488     ftis 40566424    const
    
    Reducing stack by rule 184 (line 961), THING -> term
    Entering state 249
    Reading a token:
    Next token is token ')' (0x26afdd8)
    
    index:        3        4        5        6        7        8        9       10
    state:       15      103       68      185       48      149      162      249
    token: remember  stmtseq    amper      '('    NOTOP listexpr      ','     term
    value:       22 (Nullop)    rv2cv 40566488 40566488     ftis 40566424    const
    
    Reducing stack by rule 104 (line 678), listexpr ',' term -> listexpr
    Entering state 149
    Next token is token ')' (0x26afdd8)
    
    index:        1        2        3        4        5        6        7        8
    state:        1        8       15      103       68      185       48      149
    token: GRAMPROG       @1 remember  stmtseq    amper      '('    NOTOP listexpr
    value:        0        0       22 (Nullop)    rv2cv 40566488 40566488     list
    
    Reducing stack by rule 196 (line 993), NOTOP listexpr -> term
    Entering state 65
    Next token is token ')' (0x26afdd8)
    
    index:        1        2        3        4        5        6        7
    state:        1        8       15      103       68      185       65
    token: GRAMPROG       @1 remember  stmtseq    amper      '('     term
    value:        0        0       22 (Nullop)    rv2cv 40566488      not
    
    ...
    

    In other words, the parser reads in

    ( not -e "foo"
    

    reduces -e "foo" to a term, reads in

    , "bar"
    

    reduces term, "bar" to a listexpr, and then adds a logical negation opcode to the parse tree. The operand is -e "foo", "bar" in scalar context.


    So, even though the opcodes for the two logical negations are the same, their operands are different. You can see this by inspecting the generated parse trees:

    $ perl -MO=Concise,-tree -e'ok(! -e "foo", "bar")'
    <a>leave[1 ref]-+-<1>enter
                    |-<2>nextstate(main 1 -e:1)
                    `-<9>entersub[t1]---ex-list-+-<3>pushmark
                                                |-<6>not---<5>ftis---<4>const(PV "foo")
                                                |-<7>const(PV "bar")
                                                `-ex-rv2cv---<8>gv(*ok)
    -e syntax OK
    $ perl -MO=Concise,-tree -e'ok(not -e "foo", "bar")'
    <c>leave[1 ref]-+-<1>enter
                    |-<2>nextstate(main 1 -e:1)
                    `-<b>entersub[t1]---ex-list-+-<3>pushmark
                                                |-<9>not---<8>list-+-<4>pushmark
                                                |                  |-<6>ftis---<5>const(PV "foo")
                                                |                  `-<7>const(PV "bar")
                                                `-ex-rv2cv---<a>gv(*ok)
    -e syntax OK
    

    With !, the negation acts on the file test:

    |-<6>not---<5>ftis
    

    While with not, the negation acts on a list:

    |-<9>not---<8>list
    

    You can also dump the parse tree as Perl code using B::Deparse, which shows the same thing in a different format:

    $ perl -MO=Deparse,-p -e'ok(! -e "foo", "bar")'
    ok((!(-e 'foo')), 'bar');
    -e syntax OK
    $ perl -MO=Deparse,-p -e'ok(not -e "foo", "bar")'
    ok((!((-e 'foo'), 'bar')));
    -e syntax OK
    

    With !, the negation acts on the file test:

    !(-e 'foo')
    

    While with not, the negation acts on a list:

    !((-e 'foo'), 'bar')
    

    And as toolic explained, a list in scalar context evaluates to the last item in the list, giving

    ok( ! 'bar' );
    

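    where ! 'bar' is false, because the non-empty string 'bar' is true. ok() therefore receives a single false value and no test name, which matches the bare "not ok 2" in the output.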
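
    A minimal sketch of the failure and two ways around it follows. check() is a hypothetical stand-in for Test::More's ok() that simply reports what it received, and the file name is assumed not to exist.

```perl
use strict;
use warnings;

# Hypothetical stand-in for ok(): reports the verdict and the name it was given.
sub check {
    my ( $passed, $name ) = @_;
    return ( $passed ? 'ok' : 'not ok' ) . ( defined $name ? " - $name" : '' );
}

my $fn = 'no-such-file-for-this-sketch';   # assumed not to exist

# '!' applies to the file test only; the name arrives as a second argument.
my $bang = check( ! -e $fn, 'file is missing' );

# 'not' swallows the comma list: check() gets one false argument and no name.
my $bare_not = check( not -e $fn, 'file is missing' );

# Parenthesizing the negation limits 'not' to the file test.
my $paren_not = check( ( not -e $fn ), 'file is missing' );

print "$_\n" for $bang, $bare_not, $paren_not;
# ok - file is missing
# not ok
# ok - file is missing
```

    Either using ! or wrapping the not expression in its own parentheses keeps the test name out of the negation's operand.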