Tags: parsing, compiler-construction, bison, yacc

How can I make a yacc rule that recognizes a comment in C?


I'm trying to write a yacc rule that recognizes comments. For this, I have defined one token for single-line comments and another for multi-line comments in the lex file:

comentario              [//]([^"])* [\n]
comentarioMulti         [/*]([^"])*[*/]

And also a rule for it in the yacc file :

comentario:                     COMENTARIO
                                COMENTARIMULTILINEA
                                ;

But it gives me this error:

 syntax error en la línea 26
 COMENTARIO MacBook-Air-de-administrador:Sintactico administrador$ 

I have also tried writing the \n without the [] and a few other variations, but I get the same error every time.


Solution

  • When a regex doesn't work as expected, I recommend testing it piece by piece to see whether each piece does what you expect. For example, if you had checked what [//] does, you would have found that it matches / rather than //. (There are also online regex visualizers that can help.)

    Let's go over the problems with your code:

    • [ and ] aren't meant for escaping special characters. They do that as a side effect, but their main purpose is to create a character class. [/*] doesn't match /*; it matches either / or *. To escape characters, use either quotation marks ("/*") or backslashes (\/\*). If you really want to use square brackets, you need to put each character in its own pair ([/][*]).
    • Your first regex contains an unescaped space (just before [\n]). Unescaped spaces are not allowed in a pattern, unless you wrap the pattern in the special group (?x: ... ), which permits whitespace and ignores it.
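    • I don't understand why you disallow quotation marks in your comments. ([^"] means "any character except a double quote".) If you have a reason, that's fine, but it looks suspicious to me. Note also that [^"] matches newlines, so it does not stop the single-line pattern at the end of the line.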
    • The regex for multi-line comments needs to be considerably more complicated if you don't want it to treat input like /* this is a comment */ some code /* another comment */ as a single comment running from the first /* to the last */. See the links I posted in the comments under the question for how to write it correctly.
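
Putting the points above together, here is a minimal sketch of a standalone flex file for trying the two patterns in isolation. It is an illustration under assumptions, not the asker's actual scanner: the file name comments.l, the printf actions, and the catch-all rule are hypothetical and exist only to make the test self-contained. The block-comment pattern is the commonly used one that stops at the first */.

```lex
%{
/* comments.l: hypothetical test scanner, not the original project file */
#include <stdio.h>
%}

%option noyywrap

%%
"//"[^\n]*                      { printf("line comment: %s\n", yytext); }
"/*"([^*]|\*+[^*/])*\*+"/"      { printf("block comment: %s\n", yytext); }
.|\n                            { /* ignore everything else in this test */ }
%%

int main(void)
{
    /* Reads from standard input until end of file. */
    return yylex();
}
```

Assuming flex and a C compiler are installed, this can be built with `flex comments.l` followed by `cc lex.yy.c -o comments`. Feeding it the line `/* a */ x = 1; /* b */ // tail` should report the two block comments separately and then the line comment. In the real scanner the actions would return the COMENTARIO and COMENTARIMULTILINEA tokens instead of printing. Separately, the grammar rule in the question lists those two tokens one after the other with no | between them, so as written it only accepts a single-line comment immediately followed by a multi-line one.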