I have a Dataframe, from which a create a temporary view in order to run sql queries. After a couple of sql queries, I'd like to convert the output of sql query to a new Dataframe. The reason I want data back in Dataframe is so that I can save it to blob storage.
So, the question is: what is the proper way to convert sql query output to Dataframe?
Here's the code I have so far:
%scala
//read data from Azure blob
...
var df = spark.read.parquet(some_path)
// create temp view
df.createOrReplaceTempView("data_sample")
%sql
//have some sqlqueries, the one below is just an example
SELECT
date,
count(*) as cnt
FROM
data_sample
GROUP BY
date
//Now I want to have a dataframe that has the above sql output. How to do that?
Preferably the code would be in python or scala.
var df = spark.sql(s"""
SELECT
date,
count(*) as cnt
FROM
data_sample
GROUP BY
date
""")
df = spark.sql(f'''
SELECT
date,
count(*) as cnt
FROM
data_sample
GROUP BY
date
''')