Tags: scala, apache-spark, dataframe, apache-spark-sql

Scala Spark DataFrame: dataFrame.select multiple columns given a Sequence of column names


val columnName = Seq("col1","col2",....."coln")

Is there a way to call dataframe.select so that the resulting DataFrame contains only the columns named in this sequence? I know I can write dataframe.select("col1","col2"...), but columnName is generated at runtime. I could call dataframe.select() repeatedly for each column name in a loop. Would that have any performance overhead? Is there a simpler way to accomplish this?


Solution

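  • import org.apache.spark.sql.functions.col
    
    val columnNames = Seq("col1","col2",....."coln")
    
    // using the string column names (select takes a first name plus varargs):
    val result = dataframe.select(columnNames.head, columnNames.tail: _*)
    
    // or, equivalently, using Column objects (named differently so both lines compile together):
    val resultFromColumns = dataframe.select(columnNames.map(c => col(c)): _*)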
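For anyone who wants to try this end to end, here is a minimal sketch of the Column-based variant, assuming a local SparkSession and a small made-up DataFrame. The object name SelectBySeq, the helper selectColumns, and the sample data are hypothetical and only there for illustration. One reason to prefer this variant is that columnNames.head throws on an empty Seq, whereas mapping to Column objects does not call head at all. Note also that calling select once per column in a loop would not accumulate columns, since each select projects from the result of the previous one, so a single select with the whole sequence is the way to go.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object SelectBySeq {
  // Hypothetical helper: one select call over the whole sequence of names.
  // It maps each name to a Column and expands the Seq into varargs with `: _*`.
  def selectColumns(df: DataFrame, names: Seq[String]): DataFrame =
    df.select(names.map(c => col(c)): _*)

  def main(args: Array[String]): Unit = {
    // Assumption: a local session is enough for a quick experiment.
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("select-by-seq")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data with three columns.
    val dataframe = Seq((1, "a", true), (2, "b", false)).toDF("col1", "col2", "col3")

    // Column names decided at runtime.
    val columnNames = Seq("col1", "col3")

    val result = selectColumns(dataframe, columnNames)
    result.printSchema()

    // The result keeps only the requested columns, in the requested order.
    assert(result.columns.toSeq == columnNames)

    spark.stop()
  }
}
```

If the names come from user input, it may be worth checking them against dataframe.columns first, because select fails at analysis time when a name does not resolve.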