Search code examples
algorithmparsinggrammarcontext-free-grammarebnf

How can I transform an Extended Backus Naur Grammar to its normal representation?


I have a grammar that contains expressions between brackets '{}' which represents 0 or more times that expression, and expressions between square brackets '[]', which represent 1 or none times that expression, I think this kind of grammars are called Extended Backus-Naur Form Grammars. I would like to transform the grammar to its normal form (where there are not brackets nor square brackets).

Is there an existing algorithm to do that?

I know that I can substitute something like A--> B[CD]E to A-->BE, A--> BCDE but I would like to know if there are existing algorithms that I could implement in order to transform those expressions.


Solution

  • The most straightforward way to do that is to replace every EBNF construct with a new rule. Here are the equivalences you can use:

    Option

    A ::= B [C D] E ;
    
    A ::= B X E ;
    X ::= C D | ɛ ;
    

    Where ɛ represents the empty string.

    Repetition

    A ::= B {C D} E ;
    

    Zero or more times:

    A ::= B X E ;
    X ::= C D X | ɛ ;
    

    One or more times:

    A ::= B X E ;
    X ::= C D | C D X ;
    

    Grouping

    A ::= B (C D) E ;
    
    A ::= B X E ;
    X ::= C D ;
    

    Apply these transformations recursively and you'll end up with vanilla BNF.