I'm very new to Spark and Python. I'm trying to see any metric in Spark Structured Streaming (for example, processedRowsPerSecond), but I don't know how to do it.
I've read in the "Structured Streaming Programming Guide" that with print(query.lastProgress) you can directly get the current status and metrics of an active query, but when I do that it prints None, and only once. The last part of my code is the following:
query = windowedCountsDF\
.writeStream\
.outputMode('update')\
.option("truncate", "false") \
.format('console') \
.queryName("numbers") \
.start()
print(query.lastProgress)
query.awaitTermination()
Any idea on how to do it will be highly appreciated.
Try with:
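import time

# Poll the running query from the driver
while query.isActive:
    print("\n")
    print(query.status)        # current state of the query
    print(query.lastProgress)  # None until there has been a progress update
    time.sleep(30)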
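query.awaitTermination() blocks the calling thread until the query stops, so nothing placed after it runs while the query is active. In your code query.lastProgress is read only once, right after start(), before there has been any progress update, which is why it prints None.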
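If you only want one metric such as processedRowsPerSecond rather than the whole progress report, something like the following might work. It is a minimal sketch that assumes query.lastProgress is a plain dict (as in PySpark 3.x) and that the key can be missing for some batches. The helper names processed_rows_per_second and poll_metric are made up for illustration and are not part of the Spark API.

```python
import time


def processed_rows_per_second(progress):
    """Return processedRowsPerSecond from a progress dict, or None if unavailable."""
    if progress is None:
        # No progress update has been reported yet
        return None
    # Assumption: the key may be absent, so use .get instead of indexing
    return progress.get("processedRowsPerSecond")


def poll_metric(query, interval_seconds=10):
    """Print the metric periodically while the streaming query is active."""
    while query.isActive:
        rate = processed_rows_per_second(query.lastProgress)
        if rate is not None:
            print(f"processedRowsPerSecond: {rate}")
        time.sleep(interval_seconds)
```

You would call poll_metric(query) in place of the while loop, after start().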