I would like to overwrite my output path in parquet format, but overwrite is not among the available output modes (append, complete, update). Is there another solution here?
val streamDF = sparkSession.readStream
  .schema(schema)
  .option("header", "true")
  .parquet(rawData)

val query = streamDF.writeStream
  .outputMode("overwrite")
  .format("parquet")
  .option("checkpointLocation", checkpoint)
  .start(target)

query.awaitTermination()
Apache Spark only supports the Append output mode for the file sink; see the Structured Streaming documentation on output sinks.
You need to write code that deletes the path/folder/files from the file system before writing the data; a sketch of that approach follows below.
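For example, here is a minimal sketch (reusing sparkSession, streamDF, target, and checkpoint from your question) that deletes the existing output directory with the Hadoop FileSystem API and then starts the query in the supported append mode:

import org.apache.hadoop.fs.{FileSystem, Path}

// Remove the existing output directory so the append-only file sink starts fresh
val targetPath = new Path(target)
val fs = FileSystem.get(targetPath.toUri, sparkSession.sparkContext.hadoopConfiguration)
if (fs.exists(targetPath)) {
  fs.delete(targetPath, true) // recursive delete
}

val query = streamDF.writeStream
  .outputMode("append") // the file sink only supports append
  .format("parquet")
  .option("checkpointLocation", checkpoint)
  .start(target)

query.awaitTermination()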
Alternatively, check out the Stack Overflow discussions on ForeachWriter; implementing a custom writer will also help you achieve your case, as outlined after this.
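A rough outline of the ForeachWriter approach (the class name and the println body are placeholders; you would plug in your own write logic):

import org.apache.spark.sql.{ForeachWriter, Row}

// A custom sink: you control exactly how each row is persisted
class CustomSinkWriter extends ForeachWriter[Row] {
  // Called once per partition/epoch; open connections or files here
  override def open(partitionId: Long, epochId: Long): Boolean = true

  // Called for every row in the micro-batch
  override def process(row: Row): Unit = {
    println(row.mkString(",")) // replace with your own write logic
  }

  // Called when the partition finishes (or on error); release resources here
  override def close(errorOrNull: Throwable): Unit = ()
}

val query = streamDF.writeStream
  .foreach(new CustomSinkWriter)
  .option("checkpointLocation", checkpoint)
  .start()

query.awaitTermination()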