regex syntax bnf ebnf urn

Explain BNF syntax for NID in RFC 2141


I am having trouble understanding some BNF syntax from RFC2141.

The line is <NID> ::= <let-num> [ 1,31<let-num-hyp> ]. I think it means that <NID> is a symbol for a string constrained by two rules:

  1. The string must begin with a single occurrence of any of the <let-num> characters.
  2. This character may be followed by 0-31 occurrences* of any of the <let-num-hyp> characters.

Am I reading this correctly? Because, if I am, some of the implications are a bit confusing.

*equivalent to "optionally, 1-31 occurrences"

The complete BNF syntax for a <NID> (Namespace Identifier) in RFC2141 is:

<NID>         ::= <let-num> [ 1,31<let-num-hyp> ]
<let-num-hyp> ::= <upper> | <lower> | <number> | "-"
<let-num>     ::= <upper> | <lower> | <number>
<upper>       ::= "A" | "B" | "C" | "D" | "E" | "F" | "G" | "H" |
                  "I" | "J" | "K" | "L" | "M" | "N" | "O" | "P" |
                  "Q" | "R" | "S" | "T" | "U" | "V" | "W" | "X" |
                  "Y" | "Z"
<lower>       ::= "a" | "b" | "c" | "d" | "e" | "f" | "g" | "h" |
                  "i" | "j" | "k" | "l" | "m" | "n" | "o" | "p" |
                  "q" | "r" | "s" | "t" | "u" | "v" | "w" | "x" |
                  "y" | "z"
<number>      ::= "0" | "1" | "2" | "3" | "4" | "5" | "6" | "7" |
                  "8" | "9"

Solution

  • You've interpreted it correctly. What are the confusing implications?

    <NID> ::= <let-num> [ 1,31<let-num-hyp> ]
    

    means one occurrence of <let-num>, optionally followed by between 1 and 31 occurrences of <let-num-hyp>.

    Taking into account the other definitions, this means a string of at least one character and at most 32 characters, consisting of letters of either case, numerals, and hyphens, with the first character not allowed to be a hyphen.
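
Since the question is tagged regex, here is a minimal sketch of how that reading could be turned into a validator. It is one possible translation of the quoted grammar, not something RFC 2141 itself provides; the names NID_PATTERN and is_valid_nid are made up for illustration.

```python
import re

# Hypothetical translation of: <NID> ::= <let-num> [ 1,31<let-num-hyp> ]
#   [A-Za-z0-9]        one <let-num>
#   [A-Za-z0-9-]{0,31} optionally 1-31 <let-num-hyp>, i.e. 0-31 occurrences
NID_PATTERN = re.compile(r"[A-Za-z0-9][A-Za-z0-9-]{0,31}")


def is_valid_nid(candidate: str) -> bool:
    """Return True if the whole string matches the <NID> production as read above."""
    # fullmatch anchors at both ends, so no explicit ^ or $ is needed.
    return NID_PATTERN.fullmatch(candidate) is not None


if __name__ == "__main__":
    for sample in ["isbn", "a", "x-", "-isbn", "a" * 32, "a" * 33, ""]:
        print(repr(sample), is_valid_nid(sample))
```

Under this reading, a trailing hyphen (as in "x-") is accepted, because the grammar only restricts the first character; a leading hyphen, the empty string, and anything longer than 32 characters are rejected.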