I want to save my Spark DataFrame into a directory using a spark_write_*
function, like this:
spark_write_csv(df, "file:///home/me/dir/")
but if the directory already exists, I get an error:
ERROR: org.apache.spark.sql.AnalysisException: path file:/home/me/dir/ already exists.;
When I'm working on the same data, I want to overwrite this directory. How can I achieve this? The documentation lists one relevant parameter:
mode Specifies the behavior when data or table already exists.
but it doesn't say which values it accepts.
The mode parameter should simply be set to "overwrite":
spark_write_csv(df, "file:///home/me/dir/", mode = "overwrite")
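
For a self-contained check, here is a minimal sketch that writes the same data twice to the same directory. It assumes a local Spark installation that sparklyr can connect to; the mtcars data, the table name mtcars_tbl, and the temporary output directory are placeholders chosen for illustration, not part of the question. The mode string is passed on to Spark's writer, whose standard save modes are "overwrite", "append", "ignore", and "error" (the default behavior seen above).

```r
library(sparklyr)

# Assumption: Spark is installed locally (e.g. via spark_install()).
sc <- spark_connect(master = "local")

# Placeholder data: any Spark DataFrame would do here.
df <- copy_to(sc, mtcars, "mtcars_tbl", overwrite = TRUE)

# Placeholder output directory inside the R session's temp dir.
out_dir <- file.path(tempdir(), "spark_csv_out")

# The first call creates the directory.
spark_write_csv(df, out_dir, mode = "overwrite")

# The second call replaces the existing contents instead of
# failing with "path ... already exists".
spark_write_csv(df, out_dir, mode = "overwrite")

spark_disconnect(sc)
```

Note that "overwrite" deletes whatever is already in the target directory, so point it only at a directory that is safe to replace.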