
Question about EBNF notation and JSON


Recently I've been studying parsers and grammars and how they work. I was reading over the formal grammar for JSON at http://www.ietf.org/rfc/rfc4627.txt, which uses EBNF. I was pretty confident in my understanding of BNF and EBNF, but apparently I still don't fully understand it. The RFC defines a JSON object like this:

  object = begin-object [ member *( value-separator member ) ]
  end-object

I understand that the intent here is to express that any JSON object can (optionally) have a member, and then be followed by 0 or more (value-separator, member) pairs. What I don't understand is why the asterisk appears before the (value-separator member). Isn't the asterisk supposed to mimic regex, so that it appears after the item to be repeated 0 or more times? Shouldn't the JSON object grammar be written like this:

  object = begin-object [ member ( value-separator member )* ]
  end-object

Solution

  • Syntax is about the way somebody chooses to write down concrete entities to represent something.

    I'll agree that putting the Kleene star before the entity to be repeated is non-standard, and the authors' choice to do that simply confuses people who are used to the convention. But it is perfectly valid; the authors get to define what the syntax means, and you, the user of the standard, just get to accept it.

    There's some argument for putting the Kleene star where they did; it signals that a list follows at a point where you might expect a list. The suffix-style Kleene star indicates the same thing, but it comes as something of a surprise: first you read the list element (from left to right), then you discover the star.

    As a practical matter, the surprise factor of the suffix-style Kleene star isn't, in general, enough to outweigh the surprise factor of violating convention. But the authors of that standard made their choice.

    Welcome to syntax.
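
To make the "same meaning, different spelling" point concrete, here is a minimal Python sketch, not anything taken from the RFC. It assumes hypothetical one-character stand-ins for the tokens (`{` for begin-object, `}` for end-object, `,` for value-separator, and the letter `m` for a whole member), then compares a hand-written recognizer that follows the RFC's prefix-star rule with a regex that uses the postfix star from the question. For what it's worth, RFC 4627 says its rules follow the ABNF of RFC 4234, where a leading `*` is the defined repetition operator.

```python
import re

# Hypothetical stand-ins for the RFC's tokens. A real member is
# `string name-separator value`; here it is collapsed to the letter "m".
BEGIN_OBJECT, END_OBJECT, VALUE_SEPARATOR, MEMBER = "{", "}", ",", "m"

# Postfix (regex-style) spelling from the question:
#   object = begin-object [ member ( value-separator member )* ] end-object
POSTFIX_STYLE = re.compile(r"\{(m(,m)*)?\}")


def matches_prefix_style(text: str) -> bool:
    """Recognize the RFC's spelling of the same rule:
    object = begin-object [ member *( value-separator member ) ] end-object
    """
    pos = 0
    if not text.startswith(BEGIN_OBJECT, pos):
        return False
    pos += 1
    # [ ... ] : the whole member list is optional.
    if text.startswith(MEMBER, pos):
        pos += 1
        # *( value-separator member ) : zero or more repetitions of the group.
        while text.startswith(VALUE_SEPARATOR + MEMBER, pos):
            pos += 2
    # Whatever is left must be exactly end-object.
    return text[pos:] == END_OBJECT


def matches_postfix_style(text: str) -> bool:
    """Same language, written with the star after the group."""
    return POSTFIX_STYLE.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["{}", "{m}", "{m,m}", "{m,m,m}", "{m,}", "{,m}", "{"]:
        print(sample, matches_prefix_style(sample), matches_postfix_style(sample))
```

Both recognizers accept and reject the same sample strings, which is the whole point: where the star is written is a convention of the notation, not a difference in the language being described.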