What do S/L/l stand for in YYSTYPE/YYLTYPE/yylval/yylloc?


In flex/bison, there are two data types and their corresponding global variables:

  • YYSTYPE/yylval
  • YYLTYPE/yylloc

I am wondering what S/L/l stand for.

My guess is:

  • S stands for symbol (i.e., symbol's semantic data types)
  • L stands for location, and
  • l stands for 'lexer' (meaning a variable to be shared with the lexer).

Solution

  • Questions of the form "why is this historical name spelled X?" are almost always unanswerable since it is hard to go back in time 30 or more years to find whoever first thought up the variable name and ask them what they were thinking about. Even if they are still alive, they might not now remember their original chain of thought.

    It might be reasonable to ask a related question, "What mnemonic device can I use to keep these strange names straight in my head?" Of course, such a question would necessarily be culture-specific since a good mnemonic device for a first-language English speaker is not necessarily good for someone whose first language is Greek, for example. However, leaving that aside, here are my thoughts (with some small historic notes):

    • yylval has been in Yacc since the beginning, as far as I know. Originally, it was paired with another externally-visible variable, yyval: yylval was the semantic value "returned" by the lexical scanner, and yyval was the semantic value generated by the production rule's semantic action (that is, what $$ translated to). Thus, yylval is the semantic value of the lookahead token, and I'm pretty sure that's where the first l comes from. Even if it isn't the historical meaning, it's a reasonable mnemonic. (Unfortunately, the lexical type of the lookahead symbol is yychar rather than yyltype, so the mnemonic is not perfect.)

    • I've always recommended thinking of YYSTYPE as meaning "Semantic TYPE", since the bison manual refers to "semantic values" produced by "semantic actions". I think the use of the adjective "semantic" here is also common in other literature. It's possible that the S originally came from "stack" (as in "the type of the value stack") but since the parser has several stacks, that's not a very useful mnemonic.

    • Bison added location information to the parsing model, which meant that there needed to be another global variable with another datatype used to pass location information from the lexical scanner to the parser. It seems pretty clear that YYLTYPE yylloc; was produced by analogy with yylval, and indeed inside the bison-generated parser there is a local variable called yyloc which plays a role analogous to yyval. So the L in YYLTYPE can definitely be thought of as meaning "Location TYPE", while the first l in yylloc is similar to the first l in yylval, indicating the location of the lookahead token.
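
To make the mnemonics concrete, here is a minimal hand-written sketch of how the four names fit together. It is not what bison or flex generate: the union members, the token codes, the sample input and the toy yylex are all assumptions made up for illustration. Only the four members of YYLTYPE follow bison's default location type.

```c
/* "Semantic TYPE": the type of every semantic value, including yylval.
   Real grammars normally let bison derive this from %union; this union
   is a hypothetical stand-in. */
typedef union YYSTYPE {
    int   ival;
    char *sval;
} YYSTYPE;

/* "Location TYPE": these four members mirror bison's default YYLTYPE. */
typedef struct YYLTYPE {
    int first_line;
    int first_column;
    int last_line;
    int last_column;
} YYLTYPE;

/* The scanner fills these in for the Lookahead token:
   yylval = Lookahead VALue, yylloc = Lookahead LOCation. */
YYSTYPE yylval;
YYLTYPE yylloc;

/* Hypothetical token codes; bison would assign these itself. */
enum { END = 0, NUMBER = 258 };

/* Hypothetical single-line input, scanned in place. */
static const char *input = "12 345";
static int pos = 0;

/* Toy scanner: returns the token code (what the parser keeps in yychar)
   and passes the value and location through the two globals. */
int yylex(void)
{
    while (input[pos] == ' ')
        pos++;
    if (input[pos] == '\0')
        return END;

    yylloc.first_line = yylloc.last_line = 1;
    yylloc.first_column = pos + 1;          /* columns are 1-based here */

    if (input[pos] < '0' || input[pos] > '9') {
        /* Any other character is returned as its own token code. */
        yylloc.last_column = pos + 1;
        return (unsigned char)input[pos++];
    }

    int value = 0;
    while (input[pos] >= '0' && input[pos] <= '9')
        value = value * 10 + (input[pos++] - '0');

    yylloc.last_column = pos;               /* column of the last digit */
    yylval.ival = value;
    return NUMBER;
}
```

Each call to yylex() overwrites yylval and yylloc, which is why a parser has to copy them onto its own stacks before asking for the next token. The values produced by semantic actions ($$ and @$) live in the parser's separate yyval and yyloc.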