I have a Spark DataFrame with whitespace in some of its column names, and the whitespace has to be replaced with underscores.
I know a single column can be renamed using withColumnRenamed()
in Spark SQL, but to rename n columns this function has to be chained n times (to my knowledge).
To automate this, I have tried:
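val old_names = df.columns // array of the old column names (columns takes no parentheses in Scala)
// replaceAll returns the name unchanged when it contains no whitespace, so no contains() check is needed
val new_names = old_names.map(_.replaceAll("\\s", "_")) // array of new column names with whitespace replaced by underscores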
Now, how do I replace df's header with new_names?
var newDf = df
for (col <- df.columns) {
  newDf = newDf.withColumnRenamed(col, col.replaceAll("\\s", "_"))
}
You can wrap this in a helper method so the var and the loop don't clutter your calling code.
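A minimal sketch of such a helper, assuming Spark SQL is on the classpath. The object and method names (ColumnNames, sanitize, withSanitizedColumns, withSanitizedColumnsFold) are made up for illustration and are not part of Spark. The first variant answers the question directly: toDF takes the new names as varargs and assigns them by position, so new_names can be applied in a single call. The second is the same loop as above, written as a fold.

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical helper object; the names here are illustrative only.
object ColumnNames {
  // Replaces every whitespace character in a column name with an underscore.
  def sanitize(name: String): String = name.replaceAll("\\s", "_")

  // Renames all columns at once: toDF assigns the new names positionally.
  def withSanitizedColumns(df: DataFrame): DataFrame =
    df.toDF(df.columns.map(sanitize): _*)

  // Same result as the var/for loop, expressed as a fold over the column names.
  def withSanitizedColumnsFold(df: DataFrame): DataFrame =
    df.columns.foldLeft(df)((acc, c) => acc.withColumnRenamed(c, sanitize(c)))
}
```

Usage would then be a single line, e.g. val newDf = ColumnNames.withSanitizedColumns(df). Note that two old names differing only in their whitespace characters (say "a b" and "a\tb") would collide after renaming, so it may be worth checking the new names for duplicates first.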