c, compiler-construction

How does a compiler (C/C++) identify a comment?


Suppose my program has the strings s1 = "like/*this", s2 = "like /*this is a comment */this" and s3 = "like //this is not a comment". In s1 and s3, the "/*" and "//" are part of the string. In s2, the text between /* and */ looks like a comment, but it is meant to be displayed to the user on the output screen. What algorithm does a C/C++ compiler use to tell these apart? (My guess is that the compiler just ignores all text inside "".)


Solution

  • No, inside strings there are no comments; all the characters are part of the string. From the C standard, section 6.4.9 (Comments):

    Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it.

    A similar rule then follows for // comments.

    There is also a nice footnote clarifying that, since /* is not recognized inside a comment, comments do not nest.

    About the algorithm used by compilers: when tokenizing the input file, the compiler knows whether it is inside a string or not (it must track its own state), so it is easy to decide whether to switch to comment mode.
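To make the "it must know its own state" point concrete, here is a minimal sketch of such a scanner in C. It is not how any particular compiler is implemented; the function name strip_comments and the state names are made up for illustration, and it deliberately ignores trigraphs, backslash-newline splicing and C++ raw string literals. It copies the input and replaces each comment with a single space, which mirrors what the standard's translation phases do with comments.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scanner states; the names are illustrative only. */
enum scan_state { CODE, STRING, CHAR_CONST, LINE_COMMENT, BLOCK_COMMENT };

/*
 * Copy src into out, replacing every comment with one space.
 * out must have room for strlen(src) + 1 bytes.
 * Returns the number of characters written (excluding the terminator).
 */
size_t strip_comments(const char *src, char *out)
{
    enum scan_state state = CODE;
    size_t i = 0, o = 0;

    assert(src != NULL && out != NULL);

    while (src[i] != '\0') {
        char c = src[i];
        char next = src[i + 1];   /* at worst the terminating '\0' */

        switch (state) {
        case CODE:
            /* Comment openers are only recognized in this state. */
            if (c == '/' && next == '*') {
                state = BLOCK_COMMENT; out[o++] = ' '; i += 2; continue;
            }
            if (c == '/' && next == '/') {
                state = LINE_COMMENT; out[o++] = ' '; i += 2; continue;
            }
            if (c == '"')
                state = STRING;
            else if (c == '\'')
                state = CHAR_CONST;
            out[o++] = c;
            break;

        case STRING:
        case CHAR_CONST:
            out[o++] = c;
            /* Copy an escaped character so that \" or \' does not end the literal. */
            if (c == '\\' && next != '\0') {
                out[o++] = next; i += 2; continue;
            }
            if ((state == STRING && c == '"') ||
                (state == CHAR_CONST && c == '\''))
                state = CODE;
            break;

        case LINE_COMMENT:
            /* Skip everything up to the newline, which is kept. */
            if (c == '\n') { state = CODE; out[o++] = c; }
            break;

        case BLOCK_COMMENT:
            /* Only the closing marker matters here, so a nested opener is ignored. */
            if (c == '*' && next == '/') { state = CODE; i += 2; continue; }
            break;
        }
        i++;
    }
    out[o] = '\0';
    return o;
}
```

Given a buffer buf that is large enough, strip_comments("s = \"like/*this\";", buf) copies the text unchanged because the /* is seen in the STRING state, while strip_comments("a /* x */ b", buf) drops the comment. The input "/* a /* b */ c */" ends the comment at the first */ and leaves " c */" behind as ordinary text, which is the "comments do not nest" rule from the footnote.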