apache-spark, apache-spark-dataset

Spark createOrReplaceTempView vs createGlobalTempView


The Spark 2.0 Dataset API provides two functions, createOrReplaceTempView and createGlobalTempView. I cannot understand the basic difference between them.

According to the API documentation:

createOrReplaceTempView: The lifetime of this temporary view is tied to the [[SparkSession]] that was used to create this Dataset.
So when I call sparkSession.close(), the defined view will be destroyed. Is that true?

createGlobalTempView: The lifetime of this temporary view is tied to this Spark application.

When will this type of view be destroyed? Is there an example, such as sparkSession.close()?


Solution

  • df.createOrReplaceTempView("tempViewName")
    df.createGlobalTempView("tempViewName")
    

    createOrReplaceTempView() creates a local temporary view backed by the DataFrame df, replacing any existing view with the same name. The lifetime of this view is tied to the SparkSession that created it, and other sessions cannot see it. To drop the view explicitly:

    spark.catalog.dropTempView("tempViewName")
    

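    or call stop() on the session. Note that stop() also stops the underlying SparkContext, so it ends the whole application, not just this one session: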

    self.ss = SparkSession(sc)
    ...
    self.ss.stop()
    

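    createGlobalTempView() creates a global temporary view backed by the DataFrame df. The lifetime of this view is tied to the Spark application itself, and the view is shared by all sessions of that application. It is registered in the system database global_temp, so it must be referenced by its qualified name, e.g. SELECT * FROM global_temp.tempViewName. To drop it explicitly: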

    spark.catalog.dropGlobalTempView("tempViewName")
    

    or stop the SparkContext, which ends the application and with it every global temporary view:

    sc = SparkContext(conf=conf, ......)
    ...
    sc.stop()