I have a PySpark DataFrame that looks like this:
"col1" "col2" "col3"
"value1" "value2" value3
"value4" "value5" value6
I want to save it as a CSV file, so I tried the following:
df.write.format('csv').option('delimitor',',').option("quote",'').save(path)
It works fine for the data rows, but not for the header.
The output looks like this:
"""col1""","""col2""","""col3"""
"value1","value2",value3
"value4","value5",value6
The output should look like this:
"col1","col2","col3"
"value1","value2",value3
"value4","value5",value6
Extra double quotes are added in the header row, while the data rows look fine.
Any suggestions on what I am missing here? I tried quoteAll, but it didn't work.
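You have a typo in your code: the option name should be 'delimiter', not 'delimitor'. Spark does not complain about option names it does not recognize, and the default delimiter is already a comma, which is probably why the typo went unnoticed. You can also make it easier on yourself by using the header option: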
df.write.format('csv').option('delimiter', ',').option('quote', '').option('header', 'true').save(path)
When header is set to 'true', the first row of the output file contains the column names. When it is set to 'false', the column names are not included in the output file.
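As for where the tripled quotes come from, a likely explanation is that the header is written with standard CSV quoting: a field that itself contains a double quote gets wrapped in quotes, and each embedded quote is doubled. The sketch below reproduces that convention with Python's built-in csv module rather than Spark, so it only illustrates the escaping rule and is not a statement about Spark's internals. The helper name to_csv_text is made up for this example.

```python
import csv
import io


def to_csv_text(rows, quoting_enabled):
    """Serialize rows to CSV text, with or without standard quoting."""
    buf = io.StringIO()
    if quoting_enabled:
        # Default dialect: a field containing the quote character is wrapped
        # in quotes, and each embedded quote is doubled.
        writer = csv.writer(buf, lineterminator="\n")
    else:
        # No quote character at all: fields are written verbatim.
        writer = csv.writer(buf, quoting=csv.QUOTE_NONE, quotechar=None,
                            lineterminator="\n")
    writer.writerows(rows)
    return buf.getvalue()


# Column names and values that literally contain double quotes,
# as in the question.
header = ['"col1"', '"col2"', '"col3"']
row = ['"value1"', '"value2"', 'value3']

print(to_csv_text([header], quoting_enabled=True), end="")
# """col1""","""col2""","""col3"""

print(to_csv_text([header, row], quoting_enabled=False), end="")
# "col1","col2","col3"
# "value1","value2",value3
```

The first call matches the header in the question's actual output, and the second matches the desired output. If this is what is happening, the column names in the DataFrame contain literal quote characters, and the header is being quoted even though the data rows are not.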