Search code examples
pythongrammarebnf

EBNF Nested Optional/Grouping


I'm observing the python grammar listed in the manual and considering the outputs of their form of EBNF, specifically with varargslist:

varargslist: (vfpdef ['=' test] (',' vfpdef ['=' test])* [',' [
'*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']]]
| '*' [vfpdef] (',' vfpdef ['=' test])* [',' ['**' vfpdef [',']]]
| '**' vfpdef [',']

Though I'm specifically interested in this section:

['*' [vfpdef] (',' vfpdef ['=' test])* ]

Which I interpret as:

[ [ non-terminal1 ] ( non-terminal2) ]

I realize that both

non-terminal1 (non-terminal2)
(non-terminal2)

Are valid options in this form, but does this include:

non-terminal1

as well? The wiki page for EBNF states

That is, everything that is set within the square brackets may be 
present just once, or not at all

but does this group everything within the square brackets as one entity that may appear only once, or is the option selective, for example:

[ [non-terminal1] [(non-terminal2)] ]

Solution

  • If

    ['*' [vfpdef] (',' vfpdef ['=' test])* ]
    

    is

    [ [ non-terminal1 ] non-terminal2 ]    -- parentheses deleted as redundant
    

    then non-terminal2 represents

    non-terminal3 *
    

    which is nullable by definition. (That is, it might be empty.)

    So, strictly speaking, once you've done the transform

    non-terminal1
    

    is not a valid outcome. The parse must be

    non-terminal1 non-terminal2
    

    where non-terminal2 has matched an empty string.

    But the actual parse logic is more likely to want to use the formulation

    [ [ non-terminal1 ] non-terminal3... ]   -- Not EBNF syntax, but I hope you get the idea
    

    in which non-terminal2 has been eliminated as a distraction from the resulting parse. In this case, since the 0-or-more repetition can be 0 repetitions, correct outcomes would include

                                              -- nothing :-)
    non-terminal1
                  non-terminal3
    non-terminal1 non-terminal3
                  non-terminal3 non-terminal3
    

    and so on.