Tags: sql-server, apache-spark-sql, pyspark, aws-glue-spark

Delete records from table before writing dataframe - pyspark


I'm trying to delete records from my table before writing data into it from a dataframe. It's not working for me. What am I doing wrong?

Goal: run "delete from xx_files_tbl" before writing the new dataframe to the table.
 
query = "(delete from xx_files_tbl)"
df.write.format("jdbc")\
            .option("url", "jdbc:sqlserver://"+server+":1433;databaseName="+db_name)\
            .option("driver", driver_name)\
            .option("dbtable", query)\
            .option("user", user)\
            .option("password", password)\
            .option("truncate", "true")\
            .save()

Thanks.


Solution

  • Instead of deleting the data in the SQL Server table before writing your dataframe, write the dataframe directly with .mode("overwrite") and .option("truncate", "true"), as in the sketch after the link below. With truncate enabled, Spark issues a TRUNCATE TABLE instead of dropping and recreating the table, so the existing schema, indexes, and permissions are preserved.

    https://learn.microsoft.com/en-us/sql/big-data-cluster/spark-mssql-connector?view=sql-server-ver15
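
    A minimal sketch, reusing the connection variables from the question and assuming df is the dataframe you want to write:

    # Overwrite mode with truncate=true empties xx_files_tbl and then
    # writes df into it, replacing the separate delete step.
    df.write.format("jdbc")\
            .mode("overwrite")\
            .option("url", "jdbc:sqlserver://"+server+":1433;databaseName="+db_name)\
            .option("driver", driver_name)\
            .option("dbtable", "xx_files_tbl")\
            .option("user", user)\
            .option("password", password)\
            .option("truncate", "true")\
            .save()

    Note that the "truncate" option only takes effect in overwrite mode; with the default mode it is ignored.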