
Unable to write a parser whose AST can be turned into Clojure code


Given the example input "~a{b=1}&(a{b=1}|a{b=1})|a{b=1}|a{b=1}", I have written the following parser using Instaparse:

((insta/parser
  "
S = (group | exp)+
group = '~'? <'('> exp+ <')'> op?
exp = '~'? path <'='> (v | r) <'}'> op?
path = (p <'{'>)* p
op = '|' | '&'
<p> = #'[a-z]'
v = #'[a-z0-9]'
r = <'\\''> #'[^\\']*' <'\\''>
")
 "~a{b=1}&(a{b=1}|a{b=1})|a{b=1}|a{b=1}")

Running the above produces the following output:

[:S
 [:exp "~" [:path "a" "b"] [:v "1"] [:op "&"]]
 [:group
  [:exp [:path "a" "b"] [:v "1"] [:op "|"]]
  [:exp [:path "a" "b"] [:v "1"] [:op "|"]]
  [:exp [:path "a" "b"] [:v "1"]]
  [:op "|"]]
 [:exp [:path "a" "b"] [:v "1"]]]

However, I have a very hard time writing a transformation from this output into Clojure expressions. For a more straightforward transformation, I would need something more like:

[:S
 [:op "|"
  [:op "&"
   [:exp "~" [:path "a" "b"] [:v "1"]]
   [:group
    [:op "|"
     [:exp [:path "a" "b"] [:v "1"]]
     [:op "|"
      [:exp [:path "a" "b"] [:v "1"]]
      [:exp [:path "a" "b"] [:v "1"]]]]]]
  [:exp [:path "a" "b"] [:v "1"]]]]

Given this structure, it would be much easier to transform the tree into Clojure.

How would you write a generic parser that can parse structures like the above, and similar ones, into an AST that can then be turned into Clojure code with a simple insta/transform?


Solution

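  • I'd follow the example from Operator-precedence parser: write one rule per precedence level, each built from the next-tighter level. This will give you :and and :or nodes that wrap only a single child, but those should be easy to trim or optimize away in your following steps. Note that, as written, '|' binds tighter than '&'; swap the and and or rules if you want the opposite precedence.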

    S = expr
    <expr> = and
    and = or ( <'&'> or ) *
    or = primary ( <'|'> primary ) *
    <primary> = ( group | not | term )
    <group> = <'('> expr <')'>
    not = <'~'> term
    term = #'a\\{b=[0-9]\\}'
    

    E.g.

    ((insta/parser
       "
       S = expr
       <expr> = and
       and = or ( <'&'> or ) *
       or = primary ( <'|'> primary ) *
       <primary> = ( group | not | term )
       <group> = <'('> expr <')'>
       not = <'~'> term
       term = #'a\\{b=[0-9]\\}'
       ")
        "~a{b=1}&(a{b=2}|a{b=3})|a{b=4}|a{b=5}")
    ; →
    ; [:S
    ;  [:and
    ;    [:or [:not [:term "a{b=1}"]]]
    ;    [:or
    ;       [:and [:or [:term "a{b=2}"] [:term "a{b=3}"]]]
    ;       [:term "a{b=4}"]
    ;       [:term "a{b=5}"]]]]
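
The follow-up step mentioned above (trimming single-child nodes and turning the tree into Clojure code) could be sketched with insta/transform roughly as follows. This is only a sketch under some assumptions: the grammar is the one from this answer, nary and to-clojure are names made up for the example, and matches? is a hypothetical predicate standing in for whatever a term like a{b=1} should mean in your program.

```clojure
(require '[instaparse.core :as insta])

;; Grammar copied from the answer above.
(def parser
  (insta/parser
   "
   S = expr
   <expr> = and
   and = or ( <'&'> or ) *
   or = primary ( <'|'> primary ) *
   <primary> = ( group | not | term )
   <group> = <'('> expr <')'>
   not = <'~'> term
   term = #'a\\{b=[0-9]\\}'
   "))

(defn nary
  "Returns a transform fn that emits (sym child1 child2 ...), but
  collapses a node with a single child to that child."
  [sym]
  (fn [& args]
    (if (next args)
      (cons sym args)
      (first args))))

;; One entry per tag. insta/transform works bottom-up, so each fn
;; receives its children already transformed.
(def to-clojure
  {:S    identity
   :and  (nary 'and)
   :or   (nary 'or)
   :not  (fn [x] (list 'not x))
   ;; matches? is hypothetical; replace it with your own term handling.
   :term (fn [s] (list 'matches? s))})

(def form
  (insta/transform to-clojure
                   (parser "~a{b=1}&(a{b=2}|a{b=3})|a{b=4}|a{b=5}")))
```

For the example input, form should come out as (and (not (matches? "a{b=1}")) (or (or (matches? "a{b=2}") (matches? "a{b=3}")) (matches? "a{b=4}") (matches? "a{b=5}"))). The nested or comes from the parenthesized group and is harmless; flattening it would be a further optional clean-up.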