I have an ADF pipeline that copies data from ADLS to Azure Data Explorer (ADX).
It reads all the data and copies it to ADX successfully.
The problem is that every time the pipeline runs and ingests data, the ADX table ends up with duplicate rows.
The source folder contains log files written by a copy activity that is triggered every two hours with new files. The logs contain the details of the newly copied files.
So, how do I fix the duplication? Is there a way to upsert?
As Nikolai suggested in his third point, you can clear the destination table prior to export: delete all records from the table and then export your latest batch. This is not really a great option, but it could be a quick and dirty solution if your source stores all the data that you need.
Since I only needed a quick fix, I used this approach and implemented it as follows.
I used an Azure Data Explorer Command activity for this, followed by a Copy activity. It works like a charm.
We can run definition commands as well as control commands using the ADX Command activity.
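
For anyone who wants to see the shape of this, here is a rough sketch in Python that builds the pipeline JSON for the two chained activities. It is not my exact pipeline: the activity names, the linked service name (`AdxLinkedService`), the dataset names (`AdlsLogFiles`, `AdxLogTable`), the table name, and the delimited-text source type are all placeholders I made up for illustration. The command used to empty the table is `.clear table <TableName> data`.

```python
import json


def build_pipeline(table_name: str) -> dict:
    """Build an ADF pipeline definition that clears an ADX table, then copies into it.

    All names below (activities, linked service, datasets) are hypothetical
    placeholders; replace them with the ones defined in your own factory.
    """
    clear_activity = {
        "name": "ClearTargetTable",
        "type": "AzureDataExplorerCommand",
        "linkedServiceName": {
            "referenceName": "AdxLinkedService",  # hypothetical linked service
            "type": "LinkedServiceReference",
        },
        "typeProperties": {
            # Management command that removes all rows but keeps the schema.
            "command": f".clear table {table_name} data",
            "commandTimeout": "00:10:00",
        },
    }

    copy_activity = {
        "name": "CopyLogsToAdx",
        "type": "Copy",
        # Run the copy only after the table was cleared successfully.
        "dependsOn": [
            {
                "activity": clear_activity["name"],
                "dependencyConditions": ["Succeeded"],
            }
        ],
        "inputs": [{"referenceName": "AdlsLogFiles", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "AdxLogTable", "type": "DatasetReference"}],
        "typeProperties": {
            # Assumes the log files are delimited text; adjust to your format.
            "source": {"type": "DelimitedTextSource"},
            "sink": {"type": "AzureDataExplorerSink"},
        },
    }

    return {
        "name": "ClearThenCopyToAdx",
        "properties": {"activities": [clear_activity, copy_activity]},
    }


if __name__ == "__main__":
    print(json.dumps(build_pipeline("CopyLogs"), indent=2))
```

The important part is the `dependsOn` entry: the Copy activity only starts once the clear command has succeeded, so each run begins with an empty table. Keep in mind this only works if the source still holds everything you need, since the whole table is reloaded on every run.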