parsing comments context-free-grammar ebnf

Is it possible to describe block comments using EBNF?

Say, I have the following EBNF:

document    = content , { content } ;
content     = hello world | answer | space ;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;

This lets me parse something like:

hello world 42

Now I want to extend this grammar with a block comment. How can I do this properly?

If I start simple:

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , space , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

I cannot parse:

hello /* I'm the taxman! */ world 42

If I extend the grammar further with the special case from above, it gets ugly, but parses.

document    = content , { content } ;
content     = hello world | answer | space | comment;
hello world = "hello" , { comment } , space , { comment } , "world" ;
answer      = "42" ;
space       = " " ;
comment     = "/*" , ?any character? , "*/" ;

But I still cannot parse something like:

hel/*p! I need somebody. Help! Not just anybody... */lo world 42

How would I do this with an EBNF grammar? Or is it not even possible at all?

Solution

If you consider "hello" a token, you would not want anything to break it up. If comments really must be allowed inside it, the rule has to be exploded into single characters:

hello world = "h", {comment}, "e", {comment}, "l", {comment}, "l", {comment}, "o" ,
              { comment }, space, { comment },
              "w", {comment}, "o", {comment}, "r", {comment}, "l", {comment}, "d" ;

As for the broader question: language specifications commonly leave comments out of the formal grammar and describe them in a side note instead. It can still be done within the grammar, usually by treating a comment as equivalent to whitespace:

space = " " | comment ;
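In a hand-written lexer the same idea usually appears as a skip rule. Here is a minimal Python sketch of it, assuming comments do not nest and end at the first "*/"; the names SPACE, WORD and tokens are illustrative only and not part of the grammar above:

```python
import re

# space = " " | comment ;
# Assumption: a comment is closed by the first "*/" and cannot nest.
SPACE = re.compile(r" |/\*.*?\*/", re.DOTALL)
# Assumption: a "word" is any run of letters or digits.
WORD = re.compile(r"[A-Za-z0-9]+")


def tokens(text):
    """Return the words of `text`, skipping spaces and block comments."""
    out = []
    pos = 0
    while pos < len(text):
        m = SPACE.match(text, pos)
        if m:
            # Whitespace and comments are dropped in the same way.
            pos = m.end()
            continue
        m = WORD.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character at offset {pos}")
        out.append(m.group())
        pos = m.end()
    return out
```

Under these assumptions, tokens("hello /* I'm the taxman! */ world 42") yields the three words hello, world and 42. A comment inside a word, as in hel/*p!*/lo, splits it into the two tokens hel and lo, which is why such input needs the exploded rule shown earlier.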

You may also want to add a rule for consecutive whitespace (in ISO EBNF, a trailing "-" after a repetition excludes the empty case, so this means one or more):

spaces = { space }- ;

Cleaning up your final grammar, but treating "hello" and "world" as tokens (i.e. not allowing them to be broken apart), could result in something like this:

document    = { content }- ;
content     = hello world | answer | space ;
hello world = "hello" , spaces , "world" ;
answer      = "42" ;
spaces      = { space }- ;
space       = " " | comment ;
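comment     = "/*" , { ?any character? } , "*/" ;

Note the repetition around ?any character? in the comment rule: without it, the rule matches exactly one character between the delimiters. The special sequence is also assumed to exclude "*/", so that a comment ends at the first closing delimiter.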
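Since this grammar has no recursion, it can be checked with a plain regular expression. The following Python sketch transcribes the rules one by one, assuming the comment body is any run of characters that does not contain "*/"; is_document and the upper-case names are hypothetical helpers, not part of the grammar:

```python
import re

# comment: "/*", then any characters that do not start "*/", then "*/".
COMMENT = r"/\*(?:(?!\*/).)*\*/"
# space = " " | comment ;
SPACE = rf"(?: |{COMMENT})"
# spaces = { space }- ;   (one or more)
SPACES = rf"{SPACE}+"
# hello world = "hello" , spaces , "world" ;
HELLO_WORLD = rf"hello{SPACES}world"
# answer = "42" ;
ANSWER = r"42"
# content = hello world | answer | space ;
CONTENT = rf"(?:{HELLO_WORLD}|{ANSWER}|{SPACE})"
# document = { content }- ;   (one or more)
DOCUMENT = re.compile(rf"{CONTENT}+", re.DOTALL)


def is_document(text):
    """Return True if `text` as a whole matches the document rule."""
    return DOCUMENT.fullmatch(text) is not None
```

With this transcription, is_document accepts "hello world 42" and "hello /* I'm the taxman! */ world 42", and rejects "hel/*p!*/lo world 42" because "hello" is kept as an unbreakable token.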