c++ language-lawyer compatibility terminology punctuator

Which elements of operator-or-punctuator are punctuators?


I study C++ terminology. I have trouble understanding the term "punctuator".

Consider, for example, https://eel.is/c++draft/lex.pptoken (emphasis added):

Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a literal, or an operator or punctuator.

This suggests that operators and punctuators are distinct entities. Is that correct?

Note: in C, punctuators are the primary category, and some of those punctuators are also operators.

What is the list of all punctuators? In other words: which elements of operator-or-punctuator are punctuators?


Solution

  • An exhaustive list isn't particularly useful, so I'm not going to try to go there.

    But I'll give an example of the basic idea.

    A comma (,) can be either an operator or a punctuator. Here's an example of a comma as an operator:

    for (int i=0; i<10; i++, j--)
    

    i++, j-- is a single expression, and in this case, the comma is an instance of the comma operator, which evaluates its left operand, then its right operand, and (although the result is ignored in this case) the result is the value yielded by the right operand.
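
    To make that concrete, here's a minimal sketch that wraps the same i++, j-- expression in a function so you can look at the value it yields. The name step_both is made up for illustration; it isn't anything from the standard.

    ```cpp
    #include <cassert>

    // Hypothetical helper: evaluates `i++, j--` and returns the result
    // of the whole comma expression.
    int step_both(int& i, int& j) {
        // Comma operator: i++ is evaluated first and its value is discarded,
        // then j-- is evaluated and its value becomes the result.
        return (i++, j--);
    }
    ```

    Called with i = 0 and j = 10, this returns 10 (j-- is a post-decrement, so it yields the old value of j) and leaves i == 1 and j == 9.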

    You can also use a comma in a function declaration, definition, or call:

    int f(int, double); // declaration
    int g(long a, float b) { // definition
        return a + b;
    }
    
    x = f(1, 2); // call
    

    In these cases, the comma is simply acting as punctuation--we have a list of items with commas to separate the items. Unlike a comma operator, in this case the comma doesn't imply anything about order of evaluation, and doesn't yield a value. It's just a separator between items in a list.
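
    One way to see the difference for yourself is a variadic template that reports how many arguments a call actually passed. This is only a sketch: count_args, separator_commas and operator_comma are hypothetical names I'm introducing for the example.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Hypothetical helper: returns the number of arguments it was called with.
    template <typename... Args>
    constexpr std::size_t count_args(Args...) {
        return sizeof...(Args);
    }

    // The comma here is a punctuator separating two arguments.
    std::size_t separator_commas(int a, int b) {
        return count_args(a, b);
    }

    // The extra parentheses turn `++a, b` into one comma expression,
    // so only a single argument (the value of b) is passed.
    // ++a is used only so the left operand has a side effect.
    std::size_t operator_comma(int a, int b) {
        return count_args((++a, b));
    }
    ```

    separator_commas(1, 2) reports 2 arguments, while operator_comma(1, 2) reports 1, even though both calls are written with a single comma.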

    There are a number of other places a comma can be used as a punctuator, such as in the initializer list of an array:

    int i[] = { 1, 2, 3, 4, 5 };
    

    Again, this isn't a comma operator--it's just a list of items with commas to separate them.

    Additionally, I'd note the punctuation of the sentence you were looking at:

    Each preprocessing token that is converted to a token shall have the lexical form of a keyword, an identifier, a literal, or an operator or punctuator.

    Note that there are commas (!) between the other entries (e.g., "a keyword, an identifier"), but not between "operator or punctuator". This is because lexically, C and C++ treat "operator or punctuator" as a single class of things. So when the lexer sees a comma, it doesn't try to sort out whether this particular comma is an operator or a punctuator. It's not until you get to the parser that you try to decide between those.
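
    As a rough illustration of that split, here's a toy lexer. It's a sketch under heavy assumptions, not how any real compiler is written: TokenKind, Token and toy_lex are hypothetical names, and it only knows single-character punctuators, so ++ comes out as two tokens where a real lexer would produce one. The point is just that a comma gets the same token kind whether it will later be parsed as an operator or as a separator.

    ```cpp
    #include <cassert>
    #include <cctype>
    #include <cstddef>
    #include <string>
    #include <vector>

    // Hypothetical token categories for the toy lexer.
    enum class TokenKind { Identifier, Number, OperatorOrPunctuator };

    struct Token {
        TokenKind kind;
        std::string text;
    };

    // Toy lexer: splits the input into identifiers, decimal numbers, and
    // single-character operator-or-punctuator tokens. It never looks at
    // context, so it cannot tell a comma operator from a separator comma.
    std::vector<Token> toy_lex(const std::string& src) {
        std::vector<Token> out;
        std::size_t i = 0;
        while (i < src.size()) {
            unsigned char c = static_cast<unsigned char>(src[i]);
            if (std::isspace(c)) { ++i; continue; }
            std::size_t start = i;
            if (std::isalpha(c) || c == '_') {
                while (i < src.size() &&
                       (std::isalnum(static_cast<unsigned char>(src[i])) || src[i] == '_')) {
                    ++i;
                }
                out.push_back({TokenKind::Identifier, src.substr(start, i - start)});
            } else if (std::isdigit(c)) {
                while (i < src.size() && std::isdigit(static_cast<unsigned char>(src[i]))) {
                    ++i;
                }
                out.push_back({TokenKind::Number, src.substr(start, i - start)});
            } else {
                ++i;  // anything else is one operator-or-punctuator character
                out.push_back({TokenKind::OperatorOrPunctuator, src.substr(start, 1)});
            }
        }
        return out;
    }
    ```

    Feeding it "f(1, 2)" and "i++, j--" gives a comma token of kind OperatorOrPunctuator in both cases; deciding which role each comma plays would be left to a parser that looks at the surrounding tokens.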