If my program has the strings s1 = "like/*this", s2 = "like /*this is a comment */this" and s3 = "like //this is not a comment": in s1 and s3, "/*" and "//" are part of the string. In s2, the text that looks like a comment is meant to be displayed to the user on the output screen. What algorithm does the C/C++ compiler use for this? (My guess is that the compiler just ignores all text inside "".)
No, inside strings there are no comments; all the characters are part of the string. From the C standard, section 6.4.9 (Comments):
Except within a character constant, a string literal, or a comment, the characters /* introduce a comment. The contents of such a comment are examined only to identify multibyte characters and to find the characters */ that terminate it.
Then there is a similar rule for // comments.
Also, there is a nice footnote clarifying that since /* is not recognized inside a comment, comments do not nest.
About the algorithm used by compilers... well, when tokenizing the input file, the compiler knows whether it is inside a string or not (it must keep track of its own state), so it is easy to decide whether to switch to comment mode.
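To make the "it knows its own state" idea concrete, here is a minimal sketch of such a state machine in C. It is not how any particular compiler is implemented; the function name strip_comments and the state names are made up for illustration. It also assumes plain C source and ignores line splicing with a backslash-newline, trigraphs, and C++ raw string literals. Each comment is replaced by a single space, which is what the standard's translation phases prescribe.

```c
#include <stddef.h>

/* Hypothetical lexer states, named for this sketch only. */
enum state { CODE, STRING, CHAR_LIT, LINE_COMMENT, BLOCK_COMMENT };

/* Hypothetical helper: copies src to dst, replacing each comment with one
   space. dst must have room for at least strlen(src) + 1 bytes. */
void strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    size_t i = 0, o = 0;

    while (src[i] != '\0') {
        char c = src[i];
        char next = src[i + 1];

        switch (st) {
        case CODE:
            /* Comment openers are only recognized in this state. */
            if (c == '/' && next == '*') {
                st = BLOCK_COMMENT; dst[o++] = ' '; i += 2; continue;
            }
            if (c == '/' && next == '/') {
                st = LINE_COMMENT; dst[o++] = ' '; i += 2; continue;
            }
            if (c == '"') st = STRING;
            else if (c == '\'') st = CHAR_LIT;
            dst[o++] = c;
            break;
        case STRING:
        case CHAR_LIT:
            /* Copy escape sequences as a pair so \" does not end the string. */
            if (c == '\\' && next != '\0') {
                dst[o++] = c; dst[o++] = next; i += 2; continue;
            }
            if ((st == STRING && c == '"') || (st == CHAR_LIT && c == '\''))
                st = CODE;
            dst[o++] = c;   /* "/" and "*" here are ordinary characters */
            break;
        case LINE_COMMENT:
            /* Everything up to the newline is dropped; the newline is kept. */
            if (c == '\n') { st = CODE; dst[o++] = c; }
            break;
        case BLOCK_COMMENT:
            /* Only the closing marker is looked for, so comments do not nest. */
            if (c == '*' && next == '/') { st = CODE; i += 2; continue; }
            break;
        }
        i++;
    }
    dst[o] = '\0';
}
```

Because "/*" and "//" are only checked in the CODE state, the strings s1, s2 and s3 from the question pass through unchanged, while a real comment after a string is removed. The BLOCK_COMMENT state never looks for another "/*", which is why the first "*/" always ends the comment.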