Tags: dataframe · scala · apache-spark

How to dynamically modify columns in a Scala Spark DataFrame?


I have a dataframe

val dat = Seq((1, 2, 3), (3, 4, 5)).toDF("a", "b", "c")
scala> dat.show()
+---+---+---+
|  a|  b|  c|
+---+---+---+
|  1|  2|  3|
|  3|  4|  5|
+---+---+---+

and a sequence of column names

val colsToModfiy = Seq("a", "b")

I want to multiply the values by ten in every column whose name appears in my sequence, leaving the other columns unchanged.

I expect the following result:

+---+---+---+
|  a|  b|  c|
+---+---+---+
| 10| 20|  3|
| 30| 40|  5|
+---+---+---+

My gut feeling says it should be possible using foldLeft(), but I can't seem to figure it out.


Solution

  • Your gut is right: foldLeft threads the DataFrame through one withColumn call per column name, with each step returning a new DataFrame that becomes the accumulator for the next.

    import org.apache.spark.sql.functions.col

    colsToModfiy.foldLeft(dat) { case (df, c) => df.withColumn(c, col(c) * 10) }.show()
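    The foldLeft pattern itself is plain Scala, so it can be sketched without a Spark session. Below is a minimal, hypothetical stand-in where a `Map` from column name to values plays the role of the DataFrame: the accumulator is the whole table, and each step returns an updated copy, just as `withColumn` returns a new DataFrame.

    ```scala
    // Hypothetical stand-in for the DataFrame: column name -> column values.
    val table: Map[String, Seq[Int]] = Map(
      "a" -> Seq(1, 3),
      "b" -> Seq(2, 4),
      "c" -> Seq(3, 5)
    )
    val colsToModfiy = Seq("a", "b")

    // Same shape as the Spark solution: fold over the column names,
    // replacing each listed column with its values multiplied by ten.
    val result = colsToModfiy.foldLeft(table) { case (t, c) =>
      t.updated(c, t(c).map(_ * 10))
    }
    // result("a") == Seq(10, 30); result("c") is untouched.
    ```

    The key design point carries over directly: because each step produces a new immutable value, no column name in the sequence can interfere with another, and the fold works for any number of columns.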