I wonder if there is any way to know if a character is an other character or space character beforehand in EBNF? Right now I lex every possible variant at each position in the source string, but having to try all possible interpretations is a headache, especially if I also have to try all possible production rules before knowing whether a character is an other character or a space character.
To clarify: the space, ' ', is both a space character and an other character according to ISO/IEC 14977. I wanted to know whether there is an easier way to tell which one it is than brute-forcing every possible interpretation of the source string.
2018-01-06: Perhaps the ambiguity can be resolved by 6.1? The text implicitly says that gap-separators have higher priority than other-characters outside terminal strings, because otherwise they would be part of the syntax? Or perhaps it defines an equivalence class of syntaxes, modulo space characters, or something like that...
I wonder if there is any way to know if a character is an other character or space character beforehand in EBNF?
Yes. An other-character (including space) may appear in a terminal-string (4.17, 4.18), a special-sequence (4.20), or a bracketed-textual-comment (6.6). Anywhere else, a space is a gap-separator (6.4, 7.6).
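As a rough illustration of that rule, here is a minimal Python sketch that decides the role of each space in a single left-to-right pass, with no backtracking. The function name `classify_spaces` and the role labels are my own, not terms from the standard. It makes simplifying assumptions: it looks only at the space character (not tabs or newlines), it does not validate the grammar, and it treats quote characters inside a comment as plain comment text.

```python
def classify_spaces(source: str) -> list[tuple[int, str]]:
    """Return (index, role) for every space in an EBNF source string.

    Hypothetical helper: a space inside a terminal-string, special-sequence,
    or comment is an other-character; any other space is a gap-separator.
    """
    roles = []
    state = "outside"   # one of: outside, string, special, comment
    quote = ""          # quote character that opened the current string
    depth = 0           # nesting depth of (* ... *) comments
    i = 0
    while i < len(source):
        ch = source[i]
        pair = source[i:i + 2]
        if state == "outside":
            if pair == "(*":
                state, depth = "comment", 1
                i += 2
                continue
            if ch in "'\"":
                state, quote = "string", ch
            elif ch == "?":
                state = "special"
            elif ch == " ":
                roles.append((i, "gap-separator"))
        elif state == "string":
            if ch == quote:
                state = "outside"
            elif ch == " ":
                roles.append((i, "terminal-string"))
        elif state == "special":
            if ch == "?":
                state = "outside"
            elif ch == " ":
                roles.append((i, "special-sequence"))
        else:  # inside a comment; comments may nest
            if pair == "(*":
                depth += 1
                i += 2
                continue
            if pair == "*)":
                depth -= 1
                if depth == 0:
                    state = "outside"
                i += 2
                continue
            if ch == " ":
                roles.append((i, "comment"))
        i += 1
    return roles


if __name__ == "__main__":
    for index, role in classify_spaces("a = 'x y', ? p q ? ; (* c d *)"):
        print(index, role)
```

The point of the sketch is that the context (inside quotes, inside `? ... ?`, inside `(* ... *)`, or none of these) is already known when the space is reached, so there is no need to try both interpretations.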
This can be seen by substituting a different other-character, such as #, for space. In the cases mentioned (terminal-string, special-sequence, and bracketed-textual-comment), the substitution makes no effective change to the automated processing of the EBNF, though the results are undesirable. Substituting # for space in a gap-separator, however, will show up as errors in the automated processing of the EBNF.
Perhaps the ambiguity can be resolved by 6.1?
No, 6.1 expresses an intent but has no definitions or rules.
Consider that 6.2 defines terminal-character to include other-character. This means that each of # and space is a terminal-character. In 6.3, terminal-character is a gap-free-symbol, but #, unlike the other symbols in 6.2, has no meaning in the standard. Furthermore, in 6.3 and 6.4, space is both a gap-free-symbol and a gap-separator. The inclusion of terminal-character in 6.3 appears to be a defect in the standard, but not the only one.
In 8.1, "The syntax of Extended BNF", there are some defects.
The following is not defined in 6.5:
(* see 6.5 *) syntax
= {gap separator},
gap free symbol, {gap separator},
{gap free symbol, {gap separator}};
There is no 6.9 for the following:
(* see 6.9 *) syntax
= {bracketed textual comment},
commentless symbol,
{bracketed textual comment},
{commentless symbol,
{bracketed textual comment}};
References to 6.6 through 6.8 are incorrectly numbered and should be 6.5 through 6.7, respectively.