I am new to Auto Loader and am trying to run the Auto Loader code below in a notebook.
spark.readStream.format("couldFiles")\
.option("cloudFiles.format","csv")\
.load("dbfs:/FileStore/tables/test*.csv") \
.writeStream
But I got the error below.
java.lang.ClassNotFoundException: Failed to find data source: couldFiles. Please find packages at http://spark.apache.org/third-party-projects.html
Can anyone please advise?
java.lang.ClassNotFoundException: Failed to find data source: couldFiles.
The above error happens because the format name is misspelled: "couldFiles" should be "cloudFiles". Use the correct format name and configure the cloudFiles options accordingly, as shown in the code below:
cloudFiles = {
    "cloudFiles.subscriptionId": "<subscription_Id>",
    "cloudFiles.connectionString": "<connectionString_Storage_account>",
    "cloudFiles.format": "csv",
    "cloudFiles.tenantId": "<tenantId>",
    "cloudFiles.clientId": "<client_ID>",
    "cloudFiles.clientSecret": "<Client_Secret>",
    "cloudFiles.resourceGroup": "<Resource_group_name>",
    "cloudFiles.useNotifications": "true"
}
For more information on configuring Auto Loader in Azure Databricks, follow this link; it has a detailed explanation of reading and writing streaming data in Azure Databricks.