I am trying to set up checkpointing for a Spark Streaming application to Azure storage. I was previously using S3 and the code worked fine.
Here is the latest code showing how I set checkpointing to Azure.
sc.hadoopConfiguration
  .set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
sc.hadoopConfiguration
  .set("fs.azure.account.key.[name].blob.core.windows.net", [key])
ssc.checkpoint("https://[name].blob.core.windows.net/[blob]")
Here is the error message I get when the application starts:
Exception in thread "main" java.io.IOException: No FileSystem for scheme: https
See here - it's for Databricks, but it should still apply:
val df = spark.read.parquet("wasbs://<container-name>@<storage-account-name>.blob.core.windows.net/<directory-name>")
==> So, use wasbs instead of https.
Spark resolves the checkpoint path through Hadoop's FileSystem API, and no FileSystem implementation is registered for the https scheme; the hadoop-azure connector registers the wasb/wasbs schemes instead.
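Applied to the checkpoint setup from the question, a minimal (untested) sketch of the corrected code could look like the one below. It keeps the [name], [key], and [blob] placeholders from the question, assumes the usual wasbs://<container>@<account>.blob.core.windows.net/<path> URI layout with a hypothetical "checkpoints" directory, and assumes hadoop-azure and azure-storage are on the classpath so the wasbs scheme resolves.

import org.apache.spark.SparkContext
import org.apache.spark.streaming.{Seconds, StreamingContext}

val sc: SparkContext = ???                       // the application's existing SparkContext
val ssc = new StreamingContext(sc, Seconds(10))  // batch interval is just an example value

// Same Azure configuration as in the question.
sc.hadoopConfiguration
  .set("fs.azure", "org.apache.hadoop.fs.azure.NativeAzureFileSystem")
sc.hadoopConfiguration
  .set("fs.azure.account.key.[name].blob.core.windows.net", "[key]")

// Checkpoint to the blob container through wasbs:// instead of https://.
ssc.checkpoint("wasbs://[blob]@[name].blob.core.windows.net/checkpoints")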