
How can I save a spark DF as a CSV file?


I have some Python code that loops through files and creates a dataframe (DF). I am also converting the Python DF to a Spark DF. This works fine.

# convert python df to spark df and export the spark df
spark_df = spark.createDataFrame(DF)

Now, I am trying to save the Spark DF as a CSV file.

## Write Frame out as Table
spark_df.write.mode("overwrite").save("dbfs:/rawdata/AAA.csv")

The code directly above runs, but it doesn't create the CSV, or at least I can't find it where I would expect it to be. Is there a way to do this?


Solution

  • When writing a dataframe, Spark takes the path of an output *directory*, not an output file, so the path you provided, "dbfs:/rawdata/AAA.csv", creates a directory named AAA.csv rather than a single file. Check for a directory instead of a file; inside it you will find one or more part files, one per partition written by your executors. Also note that because your code does not specify a format, Spark writes in its default format (Parquet); add `.format("csv")` to get CSV output.