Search code examples
scalascalacheck

ScalaCheck generates StackOverflowError


I want to create generators (for ScalaCheck) to generate propositional formula. If I create a generator to generate formulas with the variable and and-logic operator (example: A and B) all is right. But if i add or, implies and not, ScalaCheck generates Exception: java.lang.StackOverflowError.

import org.scalacheck.Prop
import org.scalacheck.Properties
import org.scalacheck.Gen

object FormulaWffSpecification extends Properties("FormulaWff") {

  abstract class FormulaWff {
    def size: Int
    def depths: Int
  }
  case class And(left: FormulaWff, right: FormulaWff)     extends FormulaWff
  case class Or(left: FormulaWff, right: FormulaWff)      extends FormulaWff
  case class Implies(left: FormulaWff, right: FormulaWff) extends FormulaWff
  case class Not(son: FormulaWff)                         extends FormulaWff
  case class Var(string: String)                          extends FormulaWff

  val genAnd = for {
    left  <- myGenFormula
    right <- myGenFormula
  } yield And(left, right)

  val genOr = for {
    left  <- myGenFormula
    right <- myGenFormula
  } yield Or(left, right)

  val genImplies = for {
    left  <- myGenFormula
    right <- myGenFormula
  } yield Implies(left, right)

  val genNot = for {
    son <- myGenFormula
  } yield Not(son)

  val genVar = Gen.oneOf(Var("A"),Var("B"))

  def myGenFormula: Gen[FormulaWff] =
    Gen.lzy(Gen.oneOf(genVar, genAnd, genImplies, genOr, genNot))

  property("size(t) <= 2^(depths(t) + 1) - 1") = {
    Prop.forAll(myGenFormula) { f: FormulaWff =>
      f.size <= Math.pow(2, f.depths + 1) - 1
    }
  }
}

Solution

  • The obvious intuition is that the definition for myGenFormula is recursive. That would be the explanation for the stack overflow.

    One part of solving it is making sure to add Gen.lzy in the correct places for myGenFormula. This makes sure to execute a single path of generators and avoids needlessly executing all of the recursive generators.

    There is a secondary issue with the definition for myGenFormula that will cause a stack overflow. As written, the definition for myGenFormula is statistically unlikely to terminate. The genVar generator is the terminating generator, but it is equally weighted with the other non-terminating generators. There is nothing to keep ScalaCheck from generating an infinite depth of data structures before reaching a stack overflow.

    There are two ways to help recursive generators terminate in ScalaCheck. You can pass a numeric depth argument with Gen.sized to your generators, or you could use Gen.frequency.

    There is also an initialization issue with the mutually recursive generators. You need to use the lazy keyword on the generator vals that reference myGenFormula to avoid that.

    Here's a solution that sprinkles lazy, Gen.lzy and Gen.frequency in your code so that it runs and terminates. You will probably need to tune to your testing needs.

    lazy val genAnd = for {
      left <- myGenFormula
      right <- myGenFormula
    } yield And(left, right)
    
    lazy val genOr = for {
      left <- myGenFormula
      right <- myGenFormula
    } yield Or(left, right)
    
    lazy val genImplies = for {
      left <- myGenFormula
      right <- myGenFormula
    } yield Implies(left, right)
    
    lazy val genNot = for {
      son <- myGenFormula
    } yield Not(son)
    
    val genVar =
      Gen.oneOf(Var("A"), Var("B"))
    
    val myGenFormula: Gen[FormulaWff] =
      Gen.frequency(
        4 -> genVar,
        1 -> Gen.lzy(genAnd),
        1 -> Gen.lzy(genImplies),
        1 -> Gen.lzy(genOr),
        1 -> Gen.lzy(genNot))
    
    property("myGenFormula") = {
      Prop.forAll(myGenFormula) { f: FormulaWff =>
        true
      }
    }