I have a PySpark DataFrame in which every column name is prefixed with the table name, i.e. Table.col1, Table.col2, ...
I would like to replace 'Table.' with '' (nothing) in every column name of my DataFrame.
How do I do this? Everything I have found deals with doing this to the values in the columns and not the column names themselves.
One option is to use toDF with replace:
DataFrame.toDF(*cols)
Returns a new DataFrame with the new specified column names.
out = df.toDF(*[c.replace("Table.", "") for c in df.columns])
Output :
out.show()
+----+----+
|col1|col2|
+----+----+
| foo|   1|
| bar|   2|
+----+----+
Input used :
+----------+----------+
|Table.col1|Table.col2|
+----------+----------+
|       foo|         1|
|       bar|         2|
+----------+----------+
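One caveat with the replace approach: str.replace removes every occurrence of 'Table.', not only a leading one. If some column names could contain that text elsewhere, a prefix-only variant may be safer. Below is a minimal sketch in plain Python; the helper name strip_prefix and the sample column names are hypothetical and only stand in for df.columns.

```python
def strip_prefix(names, prefix):
    """Return names with a leading `prefix` removed; other names are unchanged."""
    return [n[len(prefix):] if n.startswith(prefix) else n for n in names]

# Hypothetical column names, standing in for df.columns
cols = ["Table.col1", "Table.col2", "My.Table.col3"]

new_cols = strip_prefix(cols, "Table.")
print(new_cols)  # ['col1', 'col2', 'My.Table.col3']

# For comparison, str.replace also rewrites the third name
print([c.replace("Table.", "") for c in cols])  # ['col1', 'col2', 'My.col3']
```

With the DataFrame from the answer, this would presumably be used as out = df.toDF(*strip_prefix(df.columns, "Table.")), which gives the same result as the replace version for the input shown.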