pyspark, databricks, azure-databricks, delta-lake

Databricks 'DeltaMergeBuilder' object has no attribute 'whenNotMatchedBySourceDelete'


With Delta Lake, I am trying to use the WHEN NOT MATCHED BY SOURCE clause to UPDATE or DELETE records in the target table that have no corresponding records in the source table.

The following code example shows the basic syntax of using this for deletes, overwriting the target table with the contents of the source table and deleting unmatched records in the target table:

(targetDF
  .merge(sourceDF, "source.key = target.key")
  .whenMatchedUpdateAll()
  .whenNotMatchedInsertAll()
  .whenNotMatchedBySourceDelete()
  .execute()
)
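To make the three clauses concrete, here is a minimal plain-Python model of what the merge above is described as doing. It is only a sketch, not Delta Lake code: tables are modeled as dicts keyed by the merge key, and merge_model is a hypothetical name used for illustration.

```python
def merge_model(target: dict, source: dict) -> dict:
    """Model the merge above with tables represented as {key: row} dicts."""
    result = {}
    for key in target:
        if key in source:
            # whenMatchedUpdateAll: the target row takes the source row's values
            result[key] = source[key]
        # Otherwise whenNotMatchedBySourceDelete applies: the row is dropped
    for key, row in source.items():
        if key not in target:
            # whenNotMatchedInsertAll: source-only rows are inserted
            result[key] = row
    return result


target = {1: "a", 2: "b"}
source = {2: "B", 3: "c"}
merged = merge_model(target, source)
```

In this model the merged table always ends up equal to the source, which matches the "overwrite the target with the source" description. A condition on the update clause, as in my code below, would change which matched rows are rewritten.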

I am trying to use the same principle with my merge code, but I'm getting the error:

'DeltaMergeBuilder' object has no attribute 'whenNotMatchedBySourceDelete'

The code is as follows:

(deltadf.alias("t")
  .merge(
    partdf.alias("s"),
    "s.primary_key_hash = t.primary_key_hash")
  .whenMatchedUpdateAll("s.change_key_hash <> t.change_key_hash")
  .whenNotMatchedInsertAll()
  .whenNotMatchedBySourceDelete()
  .execute()
)

Any thoughts on why I'm getting the error, and how to fix it?


Solution

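  • The error means the Delta Lake library on your cluster predates this clause. whenNotMatchedBySourceDelete was introduced in DBR 12.1, so you need at least that runtime version. It is better to switch to DBR 12.2 (the latest LTS at the time of writing).

    Update, 29 March 2023: the upcoming Delta Lake 2.3.0 release will also include this functionality.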
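
If the same notebook has to run on clusters with different runtimes, one option is to check for the clause before calling it rather than letting the AttributeError surface. The sketch below is only an illustration: the helper names are hypothetical, and the thresholds (DBR 12.1, Delta Lake 2.3.0) are simply the versions mentioned above.

```python
def parse_version(version: str) -> tuple:
    """Return the leading numeric components, e.g. '12.2.x-scala2.12' -> (12, 2)."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component such as 'x-scala2'
        parts.append(int(digits))
    return tuple(parts)


def dbr_supports_not_matched_by_source(dbr_version: str) -> bool:
    # Threshold taken from the answer above: introduced in DBR 12.1
    return parse_version(dbr_version) >= (12, 1)


def delta_supports_not_matched_by_source(delta_version: str) -> bool:
    # Threshold taken from the answer above: Delta Lake 2.3.0
    return parse_version(delta_version) >= (2, 3)


def builder_has_clause(builder) -> bool:
    # Works on any merge builder object, regardless of how the version is reported
    return hasattr(builder, "whenNotMatchedBySourceDelete")
```

The hasattr check is the most direct of the three, since it inspects the builder you actually got back from .merge(...). For the version checks, the runtime string has to come from your environment; on Databricks it is usually available through the DATABRICKS_RUNTIME_VERSION environment variable, but treat that as an assumption and verify it on your cluster.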