Take a look at the definition below. What exactly is this supposed to define? According to the EBNF specification, brackets [] define an optional item, so why is the * required? Isn't that superfluous (since it means a repetition of zero or more times)?
The second thing is, how do you interpret the part within parentheses? The - is the exclusion indicator, so does it mean excluding any of the items within parentheses, or the sequence of all three (zero or more from ^<&, followed by ]]>, followed by zero or more from ^<&)?
CharData ::= [^<&]* - ([^<&]* ']]>' [^<&]*)
Or am I completely mistaken, and this is something other than EBNF?
Thanks in advance
The XML specification does not strictly use EBNF as specified by ISO. If you look at Section 6 of the XML specification, it defines the notation used. Square brackets are used in a regex-like manner to denote a character class, not an optional element of the grammar. In that notation, A - B matches any string that matches A but does not match B, so the - used for exclusion excludes the group within the parentheses as a whole. Thus, the line you quoted builds up as follows:
[^<&]
- any character that is not a left angle bracket (<) or an ampersand (&)
[^<&]*
- zero or more characters that are not left angle brackets or ampersands
[^<&]* - ([^<&]* ']]>' [^<&]*)
- zero or more characters that are not left angle brackets or ampersands and which do not contain the particular sequence of characters ]]> anywhere within the overall sequence
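
To make that reading concrete, here is a minimal sketch in Python that treats the production as "matches the left side, but does not match the parenthesized group". The names BASE, EXCLUDED, SINGLE_PATTERN and is_char_data are made up for this illustration and do not come from the XML specification; the regexes are my own translation of the grammar, not something the spec provides.

```python
import re

# Left-hand side of the exclusion: [^<&]*
BASE = re.compile(r"[^<&]*")

# The parenthesized group, taken as a whole: [^<&]* ']]>' [^<&]*
# (whitespace between grammar symbols is only a separator, so it is dropped here)
EXCLUDED = re.compile(r"[^<&]*\]\]>[^<&]*")

# Assumed single-regex equivalent: each character must not be '<' or '&'
# and must not be the start of the sequence ']]>'.
SINGLE_PATTERN = re.compile(r"(?:(?!\]\]>)[^<&])*")


def is_char_data(text: str) -> bool:
    """Return True if the whole string matches BASE but does not match EXCLUDED."""
    matches_base = BASE.fullmatch(text) is not None
    matches_excluded = EXCLUDED.fullmatch(text) is not None
    return matches_base and not matches_excluded


if __name__ == "__main__":
    for sample in ["", "plain text", "a ]] b", "a > b", "a ]]> b", "a < b", "a & b"]:
        print(repr(sample), is_char_data(sample))
```

Strings such as "a ]] b" or "a > b" are accepted because only the complete sequence ]]> is excluded, while "a ]]> b" is rejected even though every individual character in it is allowed by [^<&].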