Tags: apache-spark-sql, parquet, azure-databricks

Azure Databricks - Write to parquet file using spark.sql with union and subqueries


Issue:

I'm trying to write to a parquet file using spark.sql, but I run into errors whenever the query contains a union or a subquery. There must be some syntax I can't figure out.

Ex.

%python
    
df = spark.sql("SELECT
      sha2(Code, 256) as COUNTRY_SK,
      Code as COUNTRY_CODE,
      Name as COUNTRY_NAME,
      current_date() as EXTRACT_DATE
    FROM raw.EXTR_COUNTRY

    UNION ALL

    SELECT
      -1 as COUNTRY_SK,
      'Unknown' as COUNTRY_CODE,
      'Unknown' as COUNTRY_NAME,
      current_date() as EXTRACT_DATE")
          
df.write.parquet("dbfs:/mnt/devstorage/landing/companyx/country", 
   mode="overwrite")

When running a simple query I have no issues at all, such as:

%python
    
df = spark.sql("select * from raw.EXTR_COUNTRY")
df.write.parquet("dbfs:/mnt/devstorage/landing/companyx/country/", 
   mode="overwrite")

Solution

  • The quotes were the issue, not the SQL script itself: a regular double-quoted Python string cannot span multiple lines. Using triple quotes (""" or ''') solved it.

     %python
            
        df = spark.sql("""SELECT
                  sha2(Code, 256) as COUNTRY_SK,
                  Code as COUNTRY_CODE,
                  Name as COUNTRY_NAME,
                  current_date() as EXTRACT_DATE
                FROM raw.EXTR_COUNTRY

            UNION ALL

            SELECT
                  -1 as COUNTRY_SK,
                  'Unknown' as COUNTRY_CODE,
                  'Unknown' as COUNTRY_NAME,
                  current_date() as EXTRACT_DATE""")
                  
        df.write.parquet("dbfs:/mnt/devstorage/landing/companyx/country", 
           mode="overwrite")
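
For reference, the failure is a plain Python string-literal rule rather than anything Spark-specific: a string in ordinary quotes must end on the same line, so a multi-line SQL text needs triple quotes. A minimal, Spark-free sketch (none of this touches Spark; the SQL text is just an illustrative string):

```python
# Triple quotes keep the embedded newlines, so multi-line SQL is one string.
sql_text = """SELECT
      Code as COUNTRY_CODE
    FROM raw.EXTR_COUNTRY"""

# The same text in plain quotes is a syntax error, which compile() shows
# without having to run a notebook cell.
try:
    compile('q = "SELECT\n  Code"', "<snippet>", "exec")
    failed = False
except SyntaxError:  # "unterminated string literal" at the line break
    failed = True

print(failed)              # plain quotes cannot cross a line break
print("\n" in sql_text)    # the triple-quoted text really spans lines
```

This is why the simple one-line query worked with ordinary quotes while the multi-line UNION ALL query did not.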