I have an output Spark DataFrame that needs to be written to CSV. One of its columns is of struct type, which the CSV writer does not support. I have tried converting the column to a string and converting the DataFrame to pandas, but neither works.
from pyspark.sql.functions import explode
userRecs1=userRecs.withColumn("recommendations", explode(userRecs.recommendations))
#userRecs1.write.csv('/user-home/libraries/Sampled_data/datasets/rec_per_user.csv')
Expected result: the recommendations column as a string type, so that it can be split into two separate columns and written to CSV.
Actual result: the recommendations column is a struct type and cannot be written to CSV.
+-------+-----------------+
| ID_CTE|  recommendations|
+-------+-----------------+
|3974081| [2229,0.8915096]|
|3974081| [2224,0.8593609]|
|3974081| [2295,0.8577902]|
|3974081|[2248,0.29922757]|
|3974081|[2299,0.28952467]|
The following command flattens the StructType column into separate named columns, one per struct field:
userRecs1 \
.select('ID_CTE', 'recommendations.*') \
.write.csv('/user-home/libraries/Sampled_data/datasets/rec_per_user.csv')