I need to iterate over the symbols of production rules of the following form.
Input:
<relational operator> ::= = | <> | < | <= | >= | > | in
<next constant definition> ::= <empty> | <next constant definition> ; <constant definition>
So I need a regular expression to split the text into symbols. Here's what I have so far:
(?:\s|^|\s<|^<)(?:.*?)(?:\s|$|\s>|>$)
The problem is that re.findall() doesn't produce my desired output.
Expected output is:
[<relational operator>, ::=, =, |, <>, |, <, |, <=, |, >=, |, >, |, in]
[<next constant definition>, ::=, <empty>, |, <next constant definition>, ;, <constant definition>]
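How about using something simple like <\w+(?:\s+\w+)*>|\S+ ? It first tries to match a bracketed name (one or more words inside < >) and otherwise falls back to any run of non-whitespace characters. Expanded: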
< \w+
(?: \s+ \w+ )*
>
|
\S+
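
As a minimal sketch of how this could be used with re.findall(), assuming the rules arrive one per string: the names TOKEN and tokenize are only illustrative, and the pattern is the same one as above, written with re.VERBOSE so the two alternatives can be commented.

```python
import re

# Same pattern as <\w+(?:\s+\w+)*>|\S+, spread out with re.VERBOSE.
# TOKEN and tokenize are hypothetical names chosen for this example.
TOKEN = re.compile(r"""
    < \w+ (?: \s+ \w+ )* >   # a bracketed name such as <constant definition>
  | \S+                      # otherwise, any run of non-whitespace characters
""", re.VERBOSE)

def tokenize(rule):
    """Return the symbols of one production rule as a list of strings."""
    # The pattern has no capturing groups, so findall returns whole matches.
    return TOKEN.findall(rule)

rules = [
    "<relational operator> ::= = | <> | < | <= | >= | > | in",
    "<next constant definition> ::= <empty> | <next constant definition> ; <constant definition>",
]

for rule in rules:
    print(tokenize(rule))
```

The bracketed alternative is tried first, so operators such as <>, < and <= (which have no word character right after the <) fall through to \S+ and stay intact. One caveat: a bracketed name containing anything other than word characters and whitespace, for example a hyphen, would not match the first alternative and would be split at the spaces instead.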