I am trying to ingest data into my Azure Data Explorer table from my Cosmos DB for MongoDB collection. I was able to do this using Azure Data Factory, where I created a pipeline that fetches all records every x hours.
The problem is that when it fetches these records, it inserts them into the Data Explorer table regardless of whether they already exist. Also, if a record is deleted from my MongoDB collection, it will never be deleted from my Azure Data Explorer table.
I tried to create a data flow using Data Explorer as the sink (it has an option to recreate the table before copying the records, which would solve the problem because the table would be dropped and created again on every run), but Cosmos DB for MongoDB is not supported as a source.
Any idea how I can maintain a table in Azure Data Explorer that is refreshed from a MongoDB collection every x hours, without ending up with duplicate records?
Since your requirement is to copy the data without duplicate records, you can use an Azure Data Explorer Command activity to clear the data in the Kusto table, and then use a Copy activity to copy the data from Azure Cosmos DB (MongoDB API) into the Kusto database.
In the Azure Data Explorer Command activity, give the command as:
.clear table <table-name> data
Replace <table-name> with the actual name of the sink table.
This way, the data can be copied without any duplicates.
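For reference, below is a minimal sketch of how the two activities could be chained in the pipeline JSON: the first activity runs the .clear table command against the Kusto table, and the second copies from the Mongo dataset into the Data Explorer dataset. The pipeline, activity, linked service, and dataset names (AdxLinkedService, MongoSourceDataset, AdxSinkDataset) and the table name SalesRecords are placeholders, so substitute your own resources:

```json
{
    "name": "RefreshAdxFromMongo",
    "properties": {
        "activities": [
            {
                "name": "ClearAdxTable",
                "type": "AzureDataExplorerCommand",
                "linkedServiceName": {
                    "referenceName": "AdxLinkedService",
                    "type": "LinkedServiceReference"
                },
                "typeProperties": {
                    "command": ".clear table SalesRecords data"
                }
            },
            {
                "name": "CopyMongoToAdx",
                "type": "Copy",
                "dependsOn": [
                    {
                        "activity": "ClearAdxTable",
                        "dependencyConditions": [ "Succeeded" ]
                    }
                ],
                "inputs": [
                    { "referenceName": "MongoSourceDataset", "type": "DatasetReference" }
                ],
                "outputs": [
                    { "referenceName": "AdxSinkDataset", "type": "DatasetReference" }
                ],
                "typeProperties": {
                    "source": { "type": "CosmosDbMongoDbApiSource" },
                    "sink": { "type": "AzureDataExplorerSink" }
                }
            }
        ]
    }
}
```

The dependsOn block ensures the copy only starts after the table has been cleared successfully, so every scheduled run replaces the table's contents instead of appending to them.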