I'm using Azure Databricks and I want to write a DataFrame to an Azure Blob Storage container as a CSV file. This is my current code.
spark.conf.set("fs.azure.account.key.sarcscdataplatform.dfs.core.windows.net", "<storage-account-key>")
source_table = "dbfs:/user/hive/warehouse/fan_enhanced"
destination_path = "abfss://gold-container@sarcscdataplatform.dfs.core.windows.net/output.csv"
dbutils.fs.cp(source_table, destination_path, recurse=True)
It creates the file, but it is always empty even though the DataFrame contains data. I look forward to everyone's answers, and thanks in advance!
The problem is that dbutils.fs.cp only copies the managed table's underlying storage files (typically Parquet/Delta data files, not CSV), so copying them to a path ending in .csv doesn't export the DataFrame. Instead, write the DataFrame directly with the DataFrameWriter. You can try something like this:
spark.conf.set("fs.azure.account.key.sarcscdataplatform.dfs.core.windows.net", "<storage-account-key>")
output_container_path = ""abfss://gold-container@sarcscdataplatform.dfs.core.windows.net" % (output_container_name, storage_name)
output_blob_folder = "%s/data_folder" % output_container_path
# write the dataframe as a single CSV file (with header) to blob storage
(dataframe
    .coalesce(1)
    .write
    .mode("overwrite")
    .option("header", "true")
    .format("csv")
    .save(output_blob_folder))
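Note that even with coalesce(1), Spark writes a folder at data_folder containing a single part-*.csv file plus committer marker files, not a file literally named output.csv. If you need exactly one file with that name, you can copy the part file afterwards with dbutils.fs. A minimal sketch, assuming the output_container_path and output_blob_folder variables from above and that the "output.csv" target name is the one you want:

# locate the single part file Spark produced inside the output folder
part_file = [
    f.path for f in dbutils.fs.ls(output_blob_folder)
    if f.name.startswith("part-") and f.name.endswith(".csv")
][0]

# copy it to the exact file name you want, then clean up the temporary folder
final_path = "%s/output.csv" % output_container_path
dbutils.fs.cp(part_file, final_path)
dbutils.fs.rm(output_blob_folder, recurse=True)

This keeps the heavy lifting in Spark and only does a cheap file copy at the end; coalesce(1) is fine for modest result sets, but for very large DataFrames a single-file write funnels everything through one task.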