Search code examples
scalaparsinggrammarmatchingparser-combinators

Scala: custom grammar/parser combinators


I write a Scala program which reads a string from the user, and uses either a recursive descent parser or parser combinator to determine if the input string matches the below grammar (i.e., is made up of a’s and b’s), while building a parse tree along the way. And then output the generated tree if the match is successful.

Grammar:

S -> E$
E -> C E2
E2 -> E
E2 -> NIL
C -> 'a' | 'b'

I am fairly new to Scala so any reading will be much appreciated, If you have any ideas please let me know how I can implement this, Thank you.

This is what I have currently

Code I already have:

class MPParser extends JavaTokenParsers{
def c[C] = (“a” | “b”) ^^ {case ch => C(ch)}
}

abstract class MatchTree
case class C(s:String) extends MatchTree

The output should look something like this:

scala> Microproject.main(Array("ababa"))
input : ababa
[1.6] parsed: S(E(C(a),E(C(b),E(C(a),E(C(b),E(C(a),NIL()))))))

 scala> Microproject.main(Array("ababac"))
 input : ababac
[1.6] failure: `b' expected but `c' found

ababac
^

Solution

  • Production rule def c[C] = ... looks strange. You probably meant def c: Parser[C] = ... instead.

    The following code will show you how to define production rules and build a custom parse tree using scala-parser-combinators:

    import scala.util.parsing.combinator.RegexParsers
    
    case class S(e: E)
    
    // E2 -> E | NIL
    sealed abstract class E2
    case class E(c: C, e2: E2) extends E2
    case object NIL extends E2 { override def toString = "NIL()" }
    
    case class C(aOrB: String)
    
    class MPParser extends RegexParsers {
      // S -> E
      def s: Parser[S] = e ^^ { S(_) }
      // E -> C E2
      def e: Parser[E] = c ~ e2 ^^ { case c ~ e2 => E(c, e2) }
      // E2 -> E | NIL
      def e2: Parser[E2] = opt(e) ^^ { _.getOrElse(NIL) }
      // C -> 'a' | 'b'
      def c: Parser[C] = ("a" | "b") ^^ { C(_) }
    }
    
    object Microproject extends App {
      val parser = new MPParser
      for (arg <- args) {
        println("input : " + arg)
        println(parser.parseAll(parser.s, arg))
      }
    }
    

    Live on Scastie