Search code examples
databricksazure-databricksaws-databricks

Databricks Magic Sql - Export Data


Is it possible to export the output of a "magic SQL" command cell in Databricks?

I like the fact that one doesn't have to escape the SQL command and it can be easily formatted. But, I cant seem to be able to use the output in other cells. What I would like to do is export the data to a CSV file, but potentially, finish some final manipulation of the dataframe before I write it out.

sql = "select * from calendar"
df = sqlContext.sql(sql)
display(df.limit(10))

vs (DBricks formatted the following code)

%sql
select
  *
from
  calendar

but imagine, once you bring in escaped strings, nested joins, etc. Wondering if there is a better way to work with SQL in databricks.


Solution

  • The simplest solution is the most obvious one that I didn't think of: create a view!

    %sql
    
    CREATE OR REPLACE TEMPORARY VIEW vwCalendar as
    /*
      Comments to make your future self happy!
     */
    select
      c.line1,   -- more comments
      c.line2,   -- more comments
      c.zipcode
    from
      calendar
    where
      c.status <> 'just an example\'s' -- <<imagine escaping this
    

    and now you can use the view vwCalendar in subsequent SQL cells just like any other table.

    and if you want to use it in a python cell:

    df = spark.table("vwCalendar")
    display(df.limit(3))
    

    https://docs.databricks.com/spark/latest/spark-sql/language-manual/sql-ref-syntax-ddl-create-view.html

    https://docs.databricks.com/spark/latest/spark-sql/udf-python.html#user-defined-functions---python