I am currently trying to translate Python's formal grammar (https://docs.python.org/3/reference/grammar.html) into Rail Diagrams. The website we are using http://www.bottlecaps.de/rr/ui is very helpful for most of it and we have changed many things by hand to fir the proper notation for it to create the rail diagram but there is still 50+ lines that are incorrect and very hard for us to fix as we are brand new to this. Is there an easier way to do this than changing it all by hand?
Note the website uses EBNF
Thanks for your time,
Write a parser that parses the grammar, then transform from parse-tree to required notation.
The transformation itself is fairly simple:
A suitable meta-grammar, in W3C notation, is
Grammar ::= Rule+ EOF
Rule ::= Nonterminal ':' Alternatives
Alternatives
::= Alternative ( '|' Alternative )*
Alternative
::= ( Symbol ( '*' | '+' )? )*
Symbol ::= Nonterminal
| Terminal
| '(' Alternatives ')'
| '[' Alternatives ']'
<?TOKENS?>
Nonterminal
::= [a-z] [a-z_]*
Terminal ::= [A-Z] [A-Z_]*
| "'" [^']+ "'"
EOF ::= $
IgnorableWhitespace
::= [ #x9#xA#xD]+
| '#' [^#xA]* [#xA]
/* ws: definition */
Put it in grammar.ebnf
, then use REx to create a parser for it, coded e.g. in XQuery, using this command line:
-xquery -tree
This gives you XQuery module grammar.xquery
. Next, put the python grammar in python.grammar
, and this XQuery program in transform.xquery
:
import module namespace p="grammar" at "grammar.xquery";
declare option saxon:output "method=text";
declare variable $input as xs:string external;
for $token in p:parse-Grammar(unparsed-text($input))//text()
return
if (starts-with(normalize-space($token), "#")) then
replace($token, "((^|
)[\s])*#", "$1//")
else
switch ($token)
case ":" return "::="
case "[" return "("
case "]" return ")?"
default return $token
Then use Saxon to run it:
java net.sf.saxon.Query transform.xquery input=python.grammar > python.ebnf
The result is what you were looking for.
Of course you can also use your favorite text editor to carefully do global replaces achieving the same. It's just so much more fun to do it right.