As per the Structured Streaming Programming Guide, queryName("myTableName") is used to define the in-memory table name when the output sink is format("memory"):
aggDF
.writeStream
.queryName("aggregates") // this query name will be the table name
.outputMode("complete")
.format("memory")
.start()
spark.sql("select * from aggregates").show() // interactively query in-memory table
The Spark source code in DataStreamWriter.scala documents queryName() as:
Specifies the name of the [[StreamingQuery]] that can be started with start(). This name must be unique among all the currently active queries in the associated SQLContext.
QUESTION: Are there any other possible usages of the queryName() setting? For example, does it show up in the Spark job logs, or in the details of the query's progress monitoring?
I came across the following three usages of the queryName:
As mentioned by the OP and documented in the Structured Streaming Guide, it is used to define the in-memory table name when the output sink is of format "memory".
The queryName defines the value of event.progress.name, where the event is a QueryProgressEvent within the StreamingQueryListener.
It is also used in the description column of the Spark Web UI (see the screenshot, where I set queryName("StackoverflowTest")).
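To make the second usage concrete, here is a minimal sketch of a StreamingQueryListener that reads the query name from the events. It is not taken from the Spark docs: the names QueryNameDemo and NameRecordingListener, the "rate" source standing in for the OP's aggDF, and the 10 second wait are all assumptions chosen for illustration, and it assumes Spark SQL is on the classpath.

```scala
import java.util.concurrent.ConcurrentHashMap

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.StreamingQueryListener
import org.apache.spark.sql.streaming.StreamingQueryListener.{
  QueryProgressEvent, QueryStartedEvent, QueryTerminatedEvent
}

// Hypothetical listener: records every query name it sees in progress events.
class NameRecordingListener extends StreamingQueryListener {
  // Thread-safe set, because callbacks run on Spark's listener thread.
  val seenNames: java.util.Set[String] = ConcurrentHashMap.newKeySet[String]()

  override def onQueryStarted(event: QueryStartedEvent): Unit =
    println(s"started: name=${event.name}, id=${event.id}")

  override def onQueryProgress(event: QueryProgressEvent): Unit = {
    // progress.name is null when no queryName was set, so wrap it in Option.
    Option(event.progress.name).foreach(seenNames.add)
    println(s"progress: name=${event.progress.name}, batchId=${event.progress.batchId}")
  }

  override def onQueryTerminated(event: QueryTerminatedEvent): Unit =
    println(s"terminated: id=${event.id}")
}

object QueryNameDemo {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]") // assumption: local run for the demo
      .appName("QueryNameDemo")
      .getOrCreate()

    val listener = new NameRecordingListener
    spark.streams.addListener(listener)

    // Stand-in for the OP's aggDF: a running count over the built-in rate source.
    val aggDF = spark.readStream
      .format("rate")
      .option("rowsPerSecond", "5")
      .load()
      .groupBy()
      .count()

    val query = aggDF.writeStream
      .queryName("aggregates") // table name, event.progress.name, and UI description
      .outputMode("complete")
      .format("memory")
      .start()

    query.awaitTermination(10000) // let a few micro-batches run (10 s)

    // The same name is also exposed on the active query handles.
    println(spark.streams.active.map(_.name).mkString(", "))
    println(s"names seen by listener: ${listener.seenNames}")
    spark.sql("select * from aggregates").show()

    query.stop()
    spark.stop()
  }
}
```

Because the name also appears on spark.streams.active, it can be used to pick out one query among several running in the same session, which is presumably why it has to be unique among the active queries.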