Apache Zeppelin Not Showing Full Stack Trace


I have the following paragraph that performs outlier detection using the interquartile range (IQR) method. Strangely, it fails with an error, but Apache Zeppelin truncates the stack trace so heavily that it is not useful.

Here is the code:

def interQuartileRangeFiltering(df: DataFrame): DataFrame = {
    @tailrec
    def inner(cols: Seq[String], acc: DataFrame): DataFrame = cols match {
      case Nil          => acc
      case column :: xs =>
        val quantiles = acc.stat.approxQuantile(column, Array(0.25, 0.75), 0.0) // TODO: values should come from config
        val q1 = quantiles(0)
        val q3 = quantiles(1)
        val iqr = q1 - q3
        val lowerRange = q1 - 1.5 * iqr
        val upperRange = q3 + 1.5 * iqr
        inner(xs, acc.filter(s"$column < $lowerRange or value > $upperRange"))
    }
    inner(df.columns.toSeq, df)
  }

Here is the error when run in Apache Zeppelin:

scala.MatchError: WrappedArray(NEAR BAY, ISLAND, NEAR OCEAN, housing_median_age, population, total_bedrooms, <1H OCEAN, median_house_value, longitude, INLAND, latitude, total_rooms, households, median_income) (of class scala.collection.mutable.WrappedArray$ofRef)
  at inner$1(<console>:74)
  at interQuartileRangeFiltering(<console>:85)
  ... 56 elided

I have verified that the corresponding setting in the Spark interpreter is set to true:

zeppelin.spark.printREPLOutput

Any ideas as to what is wrong here with my approach and how to get Apache Zeppelin to print the whole stacktrace so that I can find out what the actual problem is?


Solution

  • As a workaround, you can print the full stack trace of the most recent exception with the following snippet:

    lastException.printStackTrace(System.out)
    

    You can also wrap your code in a try/catch block to do the same:

    try {
        // code
    } catch {
        case e: Throwable => e.printStackTrace(System.out)
    }
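
As for the error itself, the message suggests a likely cause: df.columns.toSeq produces a WrappedArray, and the patterns `Nil` and `::` only match a List, so a non-empty WrappedArray falls through every case and raises scala.MatchError. The sketch below reproduces this without Spark; the object and function names are hypothetical and only illustrate the matching behaviour, not the original filtering logic.

```scala
import scala.annotation.tailrec
import scala.util.Try

// Hypothetical demo object; it only counts elements to show which patterns match.
object SeqMatchDemo {
  // Same pattern shape as in the question: `::` matches a List and nothing else.
  @tailrec
  def countWithListPatterns(cols: Seq[String], acc: Int): Int = cols match {
    case Nil     => acc
    case _ :: xs => countWithListPatterns(xs, acc + 1)
  }

  // `Seq()` and `+:` match any Seq implementation, including an array-backed one.
  @tailrec
  def countWithSeqPatterns(cols: Seq[String], acc: Int): Int = cols match {
    case Seq()   => acc
    case _ +: xs => countWithSeqPatterns(xs, acc + 1)
  }

  def main(args: Array[String]): Unit = {
    val columns = Array("longitude", "latitude", "population")

    // Array-backed Seq with List patterns: fails with scala.MatchError.
    println(Try(countWithListPatterns(columns.toSeq, 0)).isFailure) // true

    // Either convert to a List first ...
    println(countWithListPatterns(columns.toList, 0)) // 3

    // ... or keep the Seq and use patterns that work for every Seq.
    println(countWithSeqPatterns(columns.toSeq, 0)) // 3
  }
}
```

Applied to the question's code, that would mean either calling inner(df.columns.toList, df) or switching the cases to `Seq()` and `column +: xs`. Two other details in the question's code also look unintended: iqr is computed as q1 - q3, whereas the interquartile range is normally q3 - q1, and the filter string refers to a column named value rather than $column.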