Tags: dataframe, pyspark, databricks, databricks-community-edition

Passing a DataFrame from one notebook to another with PySpark


I'm trying to use a DataFrame that I created in notebook1 inside my notebook2 on Databricks Community Edition with PySpark. I tried this code: `dbutils.notebook.run("notebook1", 60, {"dfnumber2"})`, but it fails with this error: `py4j.Py4JException: Method _run([class java.lang.String, class java.lang.Integer, class java.util.HashSet, null, class java.lang.String]) does not exist`

Any help would be appreciated.


Solution

  • The actual problem is that you are passing the last parameter ({"dfnumber2"}) incorrectly - with this syntax it's a set, not a map. You need to use the syntax {"table_name": "dfnumber2"} to pass it as a dict/map (see the corrected call sketched after this answer).

    But if you look at the documentation of dbutils.notebook.run, you will see the following phrase:

    To implement notebook workflows, use the dbutils.notebook.* methods. Unlike %run, the dbutils.notebook.run() method starts a new job to run the notebook.

    But jobs aren't supported on the Community Edition, so it won't work anyway; a possible workaround is sketched below.
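For reference, a minimal sketch of the corrected call, using the notebook path, timeout, and "table_name" key from the question and answer above (on a workspace where jobs are supported; it still won't run on Community Edition):

```python
# Corrected call: the third argument must be a dict of string key/value
# pairs, not a set. dbutils is predefined in Databricks notebooks.
result = dbutils.notebook.run(
    "notebook1",                   # path of the notebook to run
    60,                            # timeout in seconds
    {"table_name": "dfnumber2"},   # arguments passed as a dict/map
)
```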
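On Community Edition, a common workaround is %run, since it executes the other notebook in the same Spark session instead of starting a job. The notebook path and variable name below are illustrative, taken from the question; this is a sketch of one option, not the only one:

```python
# Cell 1 of notebook2 (a %run magic must be alone in its own cell):
# %run ./notebook1

# Cell 2: because %run shares the session, variables defined in
# notebook1 (such as dfnumber2) are now visible here directly.
dfnumber2.show()
```

Alternatively, notebook1 can publish the DataFrame with dfnumber2.createOrReplaceGlobalTempView("dfnumber2"), and notebook2 (attached to the same cluster) can read it back with spark.table("global_temp.dfnumber2").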