Can we move data from Cosmos DB to Blob Storage in ADF as separate files?


Update:

Based on the suggestion, I can copy the data and then delete it, but I am unable to find a way to store the data from Cosmos DB in Azure Storage as separate files.

The data is being stored as one big file containing multiple JSON documents. What I did was select "Preserve hierarchy" in the sink settings.

Updated question:

Can we store the documents as separate files in an Azure Storage folder, with the file name set to the ID received from Cosmos DB?

Option: I could probably use a ForEach activity and read one document at a time, but I assume it would take longer and consume more RUs.

Any suggestions?

Thanks

=======================================================

I am trying to archive data from Cosmos DB to Blob Storage using ADF.

  • Tried:

I tried using the Copy activity as well as a data flow. I was able to copy the data and retrieve it back, but I do not see an option to move files. The only option I get is copy.

  • Question 1:

Can we move files (not just copy them) via ADF without losing files in case of failure?

  • Question 2:

Can we move files separately? I do not want all Cosmos DB documents to be dumped into a single blob file, as it would become difficult to retrieve them if needed.


Solution

  • ADF doesn't support a move operation. To achieve the same effect, use a Copy activity to copy the data from the source Cosmos DB to the sink Blob Storage, and then delete the source data using the REST API in a Web activity. Because the delete runs only after the copy, the data will not be lost if the pipeline fails.

    • The overall pipeline to move the data from Cosmos DB to Blob Storage consists of a Copy activity followed by a Web activity.
    • In the Copy activity, set the source to Cosmos DB and the sink to Azure Blob Storage.
    • Then add a Web activity and connect it sequentially, directly after the Copy activity.
    • In the Web activity, set the Request URL in the format shown below and the method to DELETE.

    Request URL format: https://{databaseaccount}.documents.azure.com/dbs/{db-id}/colls/{coll-id}/docs/{doc-id}
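    To make the request concrete, here is a minimal Python sketch of what a client would send for that DELETE call. It is an illustration under stated assumptions, not the exact configuration of the Web activity: the function names, the `2018-12-31` API version, and the placeholder values are hypothetical choices, and the master-key signing follows the scheme described in Microsoft's Cosmos DB REST access-control documentation as I understand it, so verify it against that documentation before relying on it.

```python
import base64
import hashlib
import hmac
import json
import urllib.parse
from email.utils import formatdate


def document_url(databaseaccount: str, db_id: str, coll_id: str, doc_id: str) -> str:
    """Build the Request URL in the format given above."""
    return (
        f"https://{databaseaccount}.documents.azure.com"
        f"/dbs/{db_id}/colls/{coll_id}/docs/{doc_id}"
    )


def delete_headers(master_key_b64, db_id, coll_id, doc_id, partition_key, date=None):
    """Headers for a DELETE on one document (assumed master-key authentication)."""
    # x-ms-date must be an RFC 1123 date in GMT.
    date = date or formatdate(usegmt=True)
    resource_link = f"dbs/{db_id}/colls/{coll_id}/docs/{doc_id}"
    # Assumed signing payload: verb, resource type, resource link, lowercased date, empty line.
    string_to_sign = f"delete\ndocs\n{resource_link}\n{date.lower()}\n\n"
    digest = hmac.new(
        base64.b64decode(master_key_b64),
        string_to_sign.encode("utf-8"),
        hashlib.sha256,
    ).digest()
    signature = base64.b64encode(digest).decode("ascii")
    token = urllib.parse.quote(f"type=master&ver=1.0&sig={signature}", safe="")
    return {
        "Authorization": token,
        "x-ms-date": date,
        "x-ms-version": "2018-12-31",  # assumed API version
        # The partition key value is sent as a JSON array.
        "x-ms-documentdb-partitionkey": json.dumps([partition_key]),
    }


if __name__ == "__main__":
    # Placeholder values for illustration only.
    fake_key = base64.b64encode(b"not-a-real-key").decode("ascii")
    print(document_url("myaccount", "mydb", "mycoll", "42"))
    print(delete_headers(fake_key, "mydb", "mycoll", "42", "42"))
```

    Note that each request deletes a single document, identified by `{doc-id}`.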

    Refer to the Microsoft documentation on deleting a Cosmos DB document using the REST API.
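    The steps above can be sketched as a pipeline definition. The Python snippet below only assembles a dictionary shaped like ADF pipeline JSON, so the ordering is visible: the Web activity depends on the Copy activity having succeeded, which is what keeps the source data in place if the copy fails. The activity names, dataset names, and the source and sink type names are assumptions for illustration; check them against your own factory before use.

```python
import json


def build_move_pipeline(delete_url: str) -> dict:
    """Assemble a Copy-then-delete pipeline definition (illustrative names)."""
    copy_activity = {
        "name": "CopyCosmosToBlob",  # hypothetical activity name
        "type": "Copy",
        "inputs": [{"referenceName": "CosmosSourceDataset", "type": "DatasetReference"}],
        "outputs": [{"referenceName": "BlobSinkDataset", "type": "DatasetReference"}],
        "typeProperties": {
            "source": {"type": "CosmosDbSqlApiSource"},  # assumed source type
            "sink": {"type": "JsonSink"},  # assumed sink type
        },
    }
    delete_activity = {
        "name": "DeleteCosmosDocument",  # hypothetical activity name
        "type": "WebActivity",
        # Run only after the copy succeeded, so a failed copy never deletes source data.
        "dependsOn": [
            {"activity": copy_activity["name"], "dependencyConditions": ["Succeeded"]}
        ],
        "typeProperties": {"url": delete_url, "method": "DELETE"},
    }
    return {
        "name": "MoveCosmosToBlob",  # hypothetical pipeline name
        "properties": {"activities": [copy_activity, delete_activity]},
    }


if __name__ == "__main__":
    url = "https://{databaseaccount}.documents.azure.com/dbs/{db-id}/colls/{coll-id}/docs/{doc-id}"
    print(json.dumps(build_move_pipeline(url), indent=2))
```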