
Difference between PySpark functions write.parquet vs write.format('parquet')


In PySpark, a DataFrame can be saved as Parquet in two ways, regardless of the data it contains:

df.write.format('parquet').save(<path>)

and

df.write.parquet(<path>)

What is the difference between these two functions?


Solution

  • There is no functional difference: parquet("path") is a convenience shortcut. Open the implementation of parquet("path") and you will see that it just calls format("parquet").save("path").
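
To make the equivalence concrete, here is a minimal sketch that writes the same DataFrame with both spellings. The helper name `write_both_ways` and the output subdirectory names are hypothetical, chosen only for illustration; `df` is assumed to be a PySpark DataFrame, and the only writer calls used are `format`, `mode`, `save`, and `parquet`.

```python
def write_both_ways(df, base_dir):
    """Write `df` as Parquet using both spellings; return the two output paths.

    `df` is assumed to be a PySpark DataFrame. The function name and the
    subdirectory names below are illustrative, not part of any Spark API.
    """
    base = base_dir.rstrip("/")
    generic_path = base + "/via_format"
    shortcut_path = base + "/via_parquet"

    # Generic form: pick the data source by name, then save to the path.
    df.write.format("parquet").mode("overwrite").save(generic_path)

    # Shortcut form: the method name itself selects the Parquet data source.
    df.write.mode("overwrite").parquet(shortcut_path)

    return generic_path, shortcut_path
```

Assuming an active SparkSession named `spark`, calling `write_both_ways(spark.range(5), "/tmp/demo")` and then reading each directory back with `spark.read.parquet(...)` should return the same rows and schema from both. The generic `format(...)` form is mainly useful when the format name comes from configuration or a variable.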