PHP: Why is "${obj->prop}" a parsing error, while "$obj->prop" is legal?


PHP 7.3.2

When PHP parses variables inside double-quoted strings, there is one parse error that I find peculiar:

echo "${obj->prop}"; // Parse error: syntax error, unexpected '->' (T_OBJECT_OPERATOR)

// But this is legal:
echo "$obj->prop";

// And, for instance, all these are legal as well:
echo "${arr['key']}";
echo "${arr[0]}";
echo "${arr['0']}";

Why is the interpreter prejudiced specifically against -> within ${…}?


Solution

  • Curly braces written as "${…}" are not an alternate form of the complex variable parsing syntax.

    "$var" and "${var}" are simple syntax, and "{$var}" is complex syntax.

    In simple syntax, the interpreter is strictly looking for a variable name, not an expression, and the curly braces are used only to mark the end of the name, in case you have something like "${var}othertext". The manual states:

    If a dollar sign ($) is encountered, the parser will greedily take as many tokens as possible to form a valid variable name. Enclose the variable name in curly braces to explicitly specify the end of the name.

    So the interpreter is not prejudiced specifically against -> within ${…}. Rather, it is very strict about what it considers a valid variable name within ${…}, and it makes a single exception: access to one array key.

    A nested lookup such as "${var['a']['b']}" is a parse error too, for example.
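
    When a property or a nested key is needed inside a string, the complex syntax (brace first, then the dollar sign) accepts it. A small sketch with made-up sample data; $obj, $var and their values are assumptions for illustration only:

    ```php
    <?php
    // Hypothetical sample data, for illustration only.
    $obj = new stdClass();
    $obj->prop = 'value';
    $var = ['a' => ['b' => 'nested']];

    // Complex syntax: "{" comes first, followed by a full variable expression.
    $fromObject = "{$obj->prop}";
    $fromNested = "{$var['a']['b']}";

    // Simple syntax without braces handles one level of property access.
    $simple = "$obj->prop";

    echo $fromObject, "\n", $fromNested, "\n", $simple, "\n";
    ```

    All three strings parse, whereas moving the dollar sign outside the brace in the first two would not.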

    During the first step in interpreting the code, when the string is being tokenized, ${ is lexed as the token T_DOLLAR_OPEN_CURLY_BRACES, which puts the scanner into a "looking for variable name" state. In that state, a valid label is recognized as a variable name only if it is immediately followed by an opening square bracket or the closing curly brace. Anything else is tokenized as ordinary PHP code.

    Tokenizing the array key example looks like this:

    source: "           ${                      arr         [             'key'             ]  }  "
    tokens: " T_DOLLAR_OPEN_CURLY_BRACES  T_STRING_VARNAME  [  T_CONSTANT_ENCAPSED_STRING   ]  }  "
    

    And the object property example looks like this:

    source: "           ${                     obj            ->             prop    }  "
    tokens: " T_DOLLAR_OPEN_CURLY_BRACES    T_STRING   T_OBJECT_OPERATOR   T_STRING  }  "
    

    The parse error happens in the next step. Here obj is a T_STRING, which is a bare identifier rather than a variable, so the parser does not expect the object operator that follows it.

    Using simple syntax without the curly braces, you get these tokens instead, which work just fine as you know:

    source: "     $obj            ->            prop   "
    tokens: "  T_VARIABLE  T_OBJECT_OPERATOR  T_STRING "
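
    The token streams above can be reproduced with PHP's built-in tokenizer. This is a minimal sketch, assuming the tokenizer extension is available; the helper name tokenNames is made up for this illustration. Called without the TOKEN_PARSE flag, token_get_all() only runs the lexer, so the invalid string can be inspected without hitting the parse error:

    ```php
    <?php
    // Hypothetical helper: list the token names the lexer produces
    // for a single PHP expression given as source text.
    function tokenNames(string $expr): array
    {
        $names = [];
        foreach (token_get_all('<?php ' . $expr . ';') as $token) {
            // Named tokens arrive as [id, text, line] arrays;
            // single characters such as '"', '[' or '}' arrive as plain strings.
            $names[] = is_array($token) ? token_name($token[0]) : $token;
        }
        // Drop the leading T_OPEN_TAG and the trailing ';'.
        return array_slice($names, 1, -1);
    }

    // Single-quoted so that PHP does not interpolate the samples themselves.
    $samples = [
        '"${arr[\'key\']}"',
        '"${obj->prop}"',
        '"$obj->prop"',
    ];

    foreach ($samples as $sample) {
        echo $sample, "\n  ", implode(' ', tokenNames($sample)), "\n";
    }
    ```

    Comparing the first two lines of output shows the difference: arr comes back as T_STRING_VARNAME, while obj comes back as a plain T_STRING.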