pyspark, azure-synapse, azure-synapse-analytics, azure-notebooks

What is the usage of createGlobalTempView or createOrReplaceGlobalTempView in Synapse notebook?


We know the Spark pool in Synapse does not work like the Databricks cluster model. In Databricks we make use of GlobalTempViews: they are attached to the cluster, so other notebooks can access the views that are defined, and they stay active for the session. However, if we attach two new notebooks to the same Spark pool (limiting the number of executors per notebook), they do not pick up the GlobalTempViews.

How can we access a GlobalTempView from another notebook in Synapse?


Solution

  • Currently, GlobalTempViews are not shared across different Spark sessions or notebooks in Synapse. When you attach two new notebooks to the same Spark pool (limiting the number of executors per notebook), each notebook gets its own Spark session, so they cannot see each other's GlobalTempViews.

    The workaround can be:

    We can call one notebook from another using the command below.

    mssparkutils.notebook.run("notebook path", <timeoutSeconds>, <parameters>)
    

    Global temporary views are scoped to the calling notebook's Spark session. When we call a notebook this way, it runs in the calling notebook's Spark session, so both notebooks share one session and we can access the global temporary views created in the callee.

    Code in Notebook2:

    from notebookutils import mssparkutils

    # run() returns the value passed to exit() in Notebook1: the view name
    returned_GBview = mssparkutils.notebook.run("/Notebook 1")
    # Global temp views live in Spark's reserved global_temp database
    df2 = spark.sql("select * from global_temp.{0}".format(returned_GBview))
    df2.show()
    

    You can see that I am able to show the sample data that was read in Notebook1, using the global temporary view.
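    Because Spark registers global temporary views under the reserved `global_temp` database, the caller has to qualify the returned name before querying it. A minimal sketch of a helper that makes this explicit (the helper name is my own, not part of mssparkutils):

```python
# Build the fully qualified name for a global temporary view.
# Spark registers global temp views under the reserved "global_temp"
# database, so a view named "globleview" must be queried as
# "global_temp.globleview".
def qualified_global_view(view_name: str) -> str:
    return "global_temp.{0}".format(view_name)

# In the calling notebook this would be used as (illustrative; requires
# a Synapse Spark session and mssparkutils):
#   returned_GBview = mssparkutils.notebook.run("/Notebook 1")
#   df2 = spark.sql("select * from " + qualified_global_view(returned_GBview))
print(qualified_global_view("globleview"))  # global_temp.globleview
```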


    Notebook1 code:

    from notebookutils import mssparkutils

    df.createOrReplaceGlobalTempView("globleview")
    mssparkutils.notebook.exit("globleview")  # return the view name to the caller
    

    By returning the name of the global temporary view from the exit function, the calling notebook knows which view to query to get the dataframe.
