Search code examples
azureazure-synapse

Pyspark+Azure synapse - Need to merge dataframe into azure synapse with delta table


I have a data frame and synapse sql table ,Need to merge the data frame into a synapse table with a delta location . I followed different documents but still not able to merge in synapse

Could you help me with this ?


Solution

  • Yes, you can perform a merge operation in the azure synapse with Pyspark.

    Note: If the columns in your table are different from the columns that you have inserted in your data frame then you will get an error. make sure it should be same performing merge operation.

    Please follow the below code:

    This is my sample synapse table Employees and stored in delta location.

    Before merge operation

    enter image description here

    After that, I created the same schema of another table as a data frame.

    from pyspark.sql import types as f
    df12 = spark.createDataFrame(
        [
            ("vam",21000,4004),  
            ("ba1",21111,4567)
        ],
        f.StructType(  
            [
                f.StructField("Employee", f.StringType(), True),
                f.StructField("Salary", f.IntegerType(), True),
                f.StructField("CurrentRecord", f.IntegerType(), True)
            ]
        ),
    )
    

    enter image description here

    Then, Using the below code we can perform a Merge operation .

    %%pyspark
    
    df12.write.option("mergeSchema","true").format("delta").mode("append").saveAsTable("Employees")
    

    enter image description here

    Output:

    After merge operation

    enter image description here