I have a DataFrame with columns of various types, and I need to select specific columns from it. A hard-coded select statement looks like this:
val logRegrDF = myDF.select(myDF("LEBEL_COLUMN").as("label"),
col("FEATURE_COL1"), col("FEATURE_COL2"), col("FEATURE_COL3"), col("FEATURE_COL4"))
Here LEBEL_COLUMN and the FEATURE_COLs will be dynamic. I have the feature column names in an Array or Seq, like this:
val FEATURE_COL_ARR = Array("FEATURE_COL1","FEATURE_COL2","FEATURE_COL3","FEATURE_COL4")
I need to use this array of column names as the second part of that select statement. The first selected column is always the single label column (LEBEL_COLUMN); the rest come from the dynamic list.
Can you please help me make this select statement work in Scala?
Note: The sample code below works, but I need the column array to be the second part of the select, after the label column:
val colNames = FEATURE_COL_ARR.map(name => col(name))
val logRegrDF = myDF.select(colNames:_*) // works, but this is not what I need
I think the code for the second part would look like this, but it does not work:
val logRegrDF = myDF.select(myDF("LEBEL_COLUMN").as("label"), colNames:_*)
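If I understand your question correctly, I think this is what you are looking for:
// Put the label column name in front of the feature column names
val allColumnsArr = "LEBEL_COLUMN" +: FEATURE_COL_ARR
// select(col: String, cols: String*) takes the first name separately, then the rest as varargs
myDF.select(allColumnsArr.head, allColumnsArr.tail: _*)
.withColumnRenamed("LEBEL_COLUMN", "label")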
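If you would rather keep working with Column objects, as in your own attempt, another option is to prepend the label column to the sequence and expand the whole sequence as varargs. Your attempt does not compile because select has the overloads select(cols: Column*) and select(col: String, cols: String*), but none that takes a single Column followed by Column varargs. Below is a minimal sketch, assuming a local SparkSession and made-up sample data; the object name, the helper selectColumns, and the extra column OTHER_COL are hypothetical and only there for illustration:

```scala
import org.apache.spark.sql.{Column, SparkSession}
import org.apache.spark.sql.functions.col

object SelectLabelAndFeatures {
  // Hypothetical helper: the label column (renamed to "label") first,
  // followed by one Column per feature name.
  def selectColumns(labelCol: String, featureCols: Seq[String]): Seq[Column] =
    col(labelCol).as("label") +: featureCols.map(name => col(name))

  def main(args: Array[String]): Unit = {
    // Assumption: a local session, just for trying this out
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("select-label-and-features")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data; OTHER_COL stands in for columns that should be dropped
    val myDF = Seq((1.0, 0.1, 0.2, 0.3, 0.4, "x"))
      .toDF("LEBEL_COLUMN", "FEATURE_COL1", "FEATURE_COL2",
            "FEATURE_COL3", "FEATURE_COL4", "OTHER_COL")

    val FEATURE_COL_ARR =
      Array("FEATURE_COL1", "FEATURE_COL2", "FEATURE_COL3", "FEATURE_COL4")

    // A single Seq[Column] expanded with : _* matches select(cols: Column*)
    val logRegrDF = myDF.select(selectColumns("LEBEL_COLUMN", FEATURE_COL_ARR.toSeq): _*)
    logRegrDF.printSchema()

    spark.stop()
  }
}
```

This does the rename inside the select itself, so no separate withColumnRenamed call is needed.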
Hope this helps!