Tags: dataframe, scala, function, pyspark

Illegal start of simple expression when calling scala function


I have a function, declared outside the main method, to melt a wide DataFrame. I got it from this post: How to unpivot Spark DataFrame without hardcoding column names in Scala?

import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions._

def melt(preserves: Seq[String], toMelt: Seq[String], column: String = "variable", row: String = "value", df: DataFrame): DataFrame = {
    // pack each melted column into a (variable, value) struct and collect the structs into an array
    val _vars_and_vals = array((for (c <- toMelt) yield struct(lit(c).alias(column), col(c).alias(row))): _*)
    // explode the array so every (variable, value) pair becomes its own row
    val _tmp = df.withColumn("_vars_and_vals", explode(_vars_and_vals))
    // keep the preserved columns and pull the variable/value fields out of the struct
    val cols = preserves.map(col _) ++ (for (x <- List(column, row)) yield col("_vars_and_vals")(x).alias(x))
    _tmp.select(cols: _*)
}

I run into an "illegal start of simple expression" error when I try to call the function. I have too many columns to declare manually, so I call the function this way:

val selectColumns = questionDF.columns.toSeq
var df = questionDF
df.melt(preserves=["ID"], toMelt= selectColumns, questionDF)

I've also tried

df = melt(preserves=["ID"], toMelt= selectColumns, questionDF)

But either way I get "illegal start of simple expression" on the line where I call the function. I'm not sure whether the problem is with how I'm calling the function or with the function itself.


Solution

  • In Scala, you cannot declare a Seq with [...]; square brackets are reserved for type parameters, not collection literals.

    Change to:

    melt(preserves = Seq("ID"), ...)

    Note that melt is declared as a standalone function rather than a method on DataFrame, so it is called as melt(...) instead of df.melt(...).
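
    For illustration, here is a minimal, self-contained sketch of the corrected call. The toy questionDF with columns ID, q1 and q2 is invented for this example, df is passed by name because it comes after the parameters that have default values, and the preserved ID column is filtered out of toMelt so it is not melted into itself:

    import org.apache.spark.sql.SparkSession

    val spark = SparkSession.builder().appName("melt-example").master("local[*]").getOrCreate()
    import spark.implicits._

    // hypothetical wide DataFrame: one row per ID, one column per question
    val questionDF = Seq(("a", 1, 2), ("b", 3, 4)).toDF("ID", "q1", "q2")

    // melt every column except the preserved ID column
    val selectColumns = questionDF.columns.toSeq.filterNot(_ == "ID")
    val melted = melt(preserves = Seq("ID"), toMelt = selectColumns, df = questionDF)

    // expected rows: (a, q1, 1), (a, q2, 2), (b, q1, 3), (b, q2, 4)
    // under the default column names "variable" and "value"
    melted.show()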