When I attempt to stream data from Azure Event Hubs with the following query, I get the error:
Stream stopped...
java.lang.IllegalArgumentException: failed to parse 1
My code is as follows:
streamingQuery = (
    df
    .writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", f"{location}/_checkpoints")
    .start(location)
)
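For context, df is read from the Event Hubs source. Roughly along these lines (a minimal sketch; the connection string, namespace, and event hub name are placeholders):

# Sketch of the Event Hubs read side; all names below are placeholders.
connection_string = "Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=<key-name>;SharedAccessKey=<key>;EntityPath=<event-hub-name>"

eh_conf = {
    # The azure-event-hubs-spark connector expects the connection string
    # to be encrypted via EventHubsUtils when used from PySpark.
    "eventhubs.connectionString":
        sc._jvm.org.apache.spark.eventhubs.EventHubsUtils.encrypt(connection_string),
}

df = (
    spark.readStream
    .format("eventhubs")
    .options(**eh_conf)
    .load()
)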
The full error is as follows:
java.lang.IllegalArgumentException: failed to parse 1643393895301 Expected e.g. {"ehName":{"0":23,"1":-1},"ehNameB":{"0":-2}}
at org.apache.spark.sql.eventhubs.JsonUtils$.partitionSeqNos(JsonUtils.scala:98)
at org.apache.spark.sql.eventhubs.EventHubsSourceOffset$.apply(EventHubsSourceOffset.scala:61)
at org.apache.spark.sql.eventhubs.EventHubsSource$$anon$1.deserialize(EventHubsSource.scala:139)
at org.apache.spark.sql.eventhubs.EventHubsSource$$anon$1.deserialize(EventHubsSource.scala:115)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.readBatchFile(HDFSMetadataLog.scala:242)
at org.apache.spark.sql.execution.streaming.HDFSMetadataLog.get(HDFSMetadataLog.scala:232)
at org.apache.spark.sql.eventhubs.EventHubsSource.initialPartitionSeqNos$lzycompute(EventHubsSource.scala:170)
at org.apache.spark.sql.eventhubs.EventHubsSource.initialPartitionSeqNos(EventHubsSource.scala:113)
at org.apache.spark.sql.eventhubs.EventHubsSource.getBatch(EventHubsSource.scala:316)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$3(MicroBatchExecution.scala:492)
at scala.collection.TraversableLike.$anonfun$flatMap$1(TraversableLike.scala:245)
Does anyone have any thoughts on where I might be going wrong?
@Pattterson In my case the issue was resolved by changing the checkpoint location, because I had been reusing a checkpoint location left over from my earlier test runs.
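To illustrate (a minimal sketch; the new directory name is just an example), the fix was to restart the stream against a fresh checkpoint directory that no previous run had written to:

# Use a checkpoint directory that has never been used by another stream/source.
# The old one held offsets written by an earlier test run, which the Event Hubs
# source could not deserialize (the "failed to parse ... Expected e.g." error above).
fresh_checkpoint = f"{location}/_checkpoints_eventhubs"  # example name

streamingQuery = (
    df
    .writeStream
    .format("delta")
    .outputMode("append")
    .option("checkpointLocation", fresh_checkpoint)
    .start(location)
)

# Alternatively, on Databricks the stale checkpoint can be deleted before
# restarting the stream. Note this discards all streaming progress it tracked:
# dbutils.fs.rm(f"{location}/_checkpoints", recurse=True)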