I recently started looking into Clojure, so admittedly I do not have much experience nor have I read every book about it. Still, I am having a hard time figuring out how to expand the behavior of systems coded in Clojure.
To be more specific, for educational purposes, some time ago I implemented parsers in Scala for a family of small languages -- the NAND-CIRC language family, similar to how it is defined in “Introduction to Theoretical Computer Science”, by Boaz Barak. There is a pure version of the language, and then there are successive syntax sugars that can be added (inline functions, user-defined functions, if/else, and for).
In Scala -- using the parser combinators library -- I could define the basic grammar as a class, as shown below (the semantic actions are omitted for brevity).
import scala.util.parsing.combinator._
class NandCircParsers extends RegexParsers {
def program: Parser[Any] = block
def block: Parser[Any] = rep(command)
def command: Parser[Any] = opt(statement) ~ rep(eol)
def statement: Parser[Any] = assign
def assign: Parser[Any] = reference ~ "=" ~ funcall
def reference: Parser[Any] = outputPort | variable
def funcall: Parser[Any] = identifier ~ "(" ~ actualArgs ~ ")"
def actualArgs: Parser[Any] = repsep(expression, ",")
def expression: Parser[Any] = inputPort | reference
def variable: Parser[Any] = identifier
def inputPort: Parser[Any] = "X" ~ "[" ~ index ~ "]"
def outputPort: Parser[Any] = "Y" ~ "[" ~ index ~ "]"
def identifier: Parser[Any] = """[_a-zA-Z$][_a-zA-Z0-9'$]*""".r
def index: Parser[Any] = number
def number: Parser[Any] = """[-+]?[0-9]+""".r
}
Then I could create an instance of the class NandCircParsers
and use it to
process the vanilla version of the language.
But one other dialect of the language NAND-CIRD allows the use of inline
functions. For that, the production rule for expression
must change. In Scala
that is a matter of creating a subclass and overriding the method in question.
class NandCircInlineParsers extends NandCircParsers {
override def expression: Parser[Any] = inputPort | reference | funcall
}
And then I can instantiate the class NandCircInlineParsers
and use the new
dialect without having to rewrite all the grammar, i.e., the parser hierarchy.
For the dialect with if/else syntax sugar (only) the situation is similar.
class NandCircIfParsers extends NandCircParsers {
override def statement: Parser[Any] = assign | ifSttmt
def ifSttmt: Parser[Any] =
"if" ~ expression ~ ":" ~ block ~ opt("else" ~ ":" ~ block) ~ "end"
}
I just have to override the production rule of the grammar (method) that changed and add the new ones and I have a parser for the new dialect.
But this is in Scala. Now I am trying to achieve an equivalent result with Clojure.
I have implemented a parser combinators library that goes in the line of what
was done for the parsec.el
Emacs Lisp parser combinator
library, and for my needs, it is
working.
The parser for the vanilla version of the NAND-CIRC language becomes something like this.
(def EQU (token (literal "=")))
(def COMMA (token (literal ",")))
(def LPAR (token (literal "(")))
(def RPAR (token (literal ")")))
(def LBKT (token (literal "[")))
(def RBKT (token (literal "]")))
(def IN (token (literal "X")))
(def OUT (token (literal "Y")))
(def ident (token (regex #"[_a-zA-Z$][_a-zA-Z0-9'$]*")))
(def number (token (regex #"[-+]?[0-9]+")))
(def variable ident)
(def index number)
(def input (do* IN LBKT index RBKT))
(def output (do* OUT LBKT index RBKT))
(def expr (choice input output variable))
(def actual-args (sep-by expr COMMA))
(def funcall (do* ident LPAR actual-args RPAR))
(def formal-args (sep-by var COMMA))
(def reference (either output variable))
(def assign (do* reference EQU funcall))
(def command (do* (optional assign) (many eol)))
(def statement command)
(def program (many command))
Readability issues aside, my question is: how can I achieve the same level of code reuse that I have in Scala with Clojure? How can I turn this design modular enough that I could only change or add the necessary rules to get a new dialect of the language? Right now, all I can think about -- that does not involve implementing an object system with inheritance -- is to duplicate the entire grammar.
Does Clojure have any resources that can facilitate this?
Thanks.
This sounds very much like something that could be implemented using maps. The basic grammar NandCircParsers
could be
(def NandCircParsers
{:program program
:block block
:command command
:statement statement
... ...})
From this grammar, we can create an extended grammar NandCircIfParsers
that uses merge
to inherit from NandCircParsers
. Something like this, maybe:
(def NandCircIfParsers
(let [ifSttmt ...]
(merge NandCircParsers
{:statement (choice (:assign NandCircParsers) ifSttmt)
:ifSttmt ifSttmt})))