Search code examples
grammarabnf

Preventing duplicate characters in ABNF


I want to create an ABNF rule that contains the characters "imsxeADSUXju". Each character is optional. The order does not matter, but a character may not appear more than once.

E.g.: it must match

"i" "im" "mi" "" "uUsejXx" "imsxeADSUXju"

But not match

"iim" "UmUu" "imsss"

I created the following rule, but it does not prevent a character appearing more than once:

options = 0*12( "i" / "m" / "s" / "x" / "e" / "A" / "D" / "S" / "U" / "X" / "j" / "u" )

In this rule the order matters:

options = [ "i" ] [ "m" ] [ "s" ] [ "x" ] [ "e" ] [ "A" ] [ "D" ] [ "S" ] [ "U" ] [ "X" ] [ "j" ] [ "u" ]

How can I write a rule that ignores order but also prevents doubles?


Solution

  • You have to write out all the alternatives unfortunately. Ignoring that ABNF is case-insensitive, it would have to be

      S        = "i" after-i /
                 "m" after-m /
                 ...
      after-i  = "m" after-im /
                 "s" after-is /
                 ...
      after-im = "s" after-ims /
                 ...
    

    The language here is regular, and if you consider how the minimal DFA for the language would look like, it would have to have 2^12 states and each state would correspond to an element of the powerset of the alphabet, encoding which of the characters has been seen already.