I am trying to read Event Hub data by running a Spark Streaming job locally.
I ran into a problem setting the eventhubs.checkpoint.dir option in the Event Hubs configuration. I tried each of the following values:
wasbs://container_name@storage_name.blob.core.windows.net/
https://container_name@storage_name.blob.core.windows.net/
https://storage_name.blob.core.windows.net/container_name/
Each resulted in an error similar to the following:
ERROR ReceiverTracker: Deregistered receiver for stream 0: Restarting receiver with delay 2000ms: Error handling message; restarting receiver - java.io.IOException: No FileSystem for scheme: https
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2421)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2428)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:88)
at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2467)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2449)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:367)
You can set eventhubs.checkpoint.dir to any string that is a valid WASB folder name, instead of a full URL. For instance, I set it to "/myeventhubspark". The folder is created automatically in the default container of your Spark cluster. Be sure to start the folder name with a forward slash, like this:
"eventhubs.checkpoint.dir" -> "/myeventhubspark"
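
To show where that entry sits, here is a minimal sketch of a full parameter map in Scala. Only the eventhubs.checkpoint.dir line comes from the answer above. The other keys are the ones I recall from the older HDInsight Event Hubs samples, and every value in angle brackets is a placeholder, so treat them as assumptions and check them against your connector version. The object name EventHubsCheckpointConfig is made up for illustration.

```scala
// Sketch only: apart from "eventhubs.checkpoint.dir", the keys and all
// placeholder values are assumptions, not taken from the original post.
object EventHubsCheckpointConfig {
  val ehParams: Map[String, String] = Map(
    "eventhubs.policyname"          -> "<policy name with listen permission>",
    "eventhubs.policykey"           -> "<policy key>",
    "eventhubs.namespace"           -> "<service bus namespace>",
    "eventhubs.name"                -> "<event hub name>",
    "eventhubs.partition.count"     -> "1",
    "eventhubs.consumergroup"       -> "$default",
    // A plain folder path with a leading slash and no scheme
    // (no wasbs:// or https:// prefix).
    "eventhubs.checkpoint.dir"      -> "/myeventhubspark",
    "eventhubs.checkpoint.interval" -> "10"
  )

  def main(args: Array[String]): Unit = {
    val dir = ehParams("eventhubs.checkpoint.dir")
    // Fail fast if the checkpoint directory is not a scheme-less absolute path.
    require(dir.startsWith("/") && !dir.contains("://"),
      s"checkpoint dir should be a plain folder path, got: $dir")
    println(s"eventhubs.checkpoint.dir -> $dir")
  }
}
```

The map would then be passed to whatever stream-creation call your connector version provides. A path without a scheme is resolved against Hadoop's default file system, so on a cluster it lands in the default storage container, while on a local run it will most likely end up on the local disk.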