azure, azure-cosmosdb, azure-data-factory

Is there an Azure Data Factory activity to get the list of Cosmos DB collections?


Is there any option in Azure Data Factory to get the list of collections available in a particular database in an Azure Cosmos DB account?

I want to run an activity that returns the list of collections in a Cosmos DB database every time the Azure Data Factory pipeline is executed.

Exact requirement: I want to copy data from all the collections in a Cosmos DB database, but the list of collections may change over time. If a new collection is added to the database, it should be picked up by the copy activity dynamically, without any external intervention.

When I use the Azure Function activity in Azure Data Factory to query the list of Cosmos DB collections, it returns output like the following:

{
"Response": "[{\"source\": {\"collectionName\":\"model1\"},\"destination\": {\"fileName\":\"model1.txt\"}},{\"source\": {\"collectionName\":\"model2\"},\"destination\": {\"fileName\":\"model2.txt\"}}]"
}

Expected output

{
"Response": [
    {
     "source": {"collectionName": "model1"},
     "destination": {"fileName": "model1.txt"}
    },
    {
     "source": {"collectionName": "model2"},
     "destination": {"fileName": "model2.txt"}
    }
 ]
}

Why does the Azure Function activity return the array as a string?

P.S. When I run the Azure Function separately, from VS 2017 or from the Azure portal, it returns the array as an array.


Solution

  • I am not sure whether a direct way exists, but as a workaround you can:

    1. Make an HTTP call to the Get List of Collections REST API. Calling the REST API on every run ensures you always have the latest data.
    2. Parse the JSON response to get the collections, then iterate over them, perhaps with a For-Each activity.
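
    As a rough sketch of the parsing step, the snippet below pulls the collection ids out of a List Collections response body. It assumes the body carries the collections in a `DocumentCollections` array whose entries have an `id` field; the sample payload and the helper name `collection_names` are made up for illustration.

```python
import json
from typing import List


def collection_names(response_body: str) -> List[str]:
    """Extract collection ids from a List Collections response body."""
    payload = json.loads(response_body)
    # Assumption: the collections are listed under "DocumentCollections",
    # each with an "id" field. A missing key yields an empty list.
    return [c["id"] for c in payload.get("DocumentCollections", [])]


# Hypothetical sample payload, trimmed to the fields used here.
sample = (
    '{"_rid": "abc", '
    '"DocumentCollections": [{"id": "model1"}, {"id": "model2"}], '
    '"_count": 2}'
)

for name in collection_names(sample):
    print(name)
```

    Because the list is rebuilt from the live response on every run, a newly added collection shows up without any change to the pipeline.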

    Edit: adding the discussion from the comments to the answer.

    1. Create an HTTP-triggered Azure Function that can be called from a Web Activity.
    2. Obtain an AAD token inside that function.
    3. Call the REST API to get the list of Cosmos DB collections.
    4. Return a JSON array as the response.
    5. Add a For-Each activity in ADF to loop over the list of collections.
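
    The last two steps could look roughly like the sketch below. It is plain Python rather than actual Azure Functions code: the token and REST calls are left out, and the helper names (`build_items`, `response_body`, `decode_response`) are hypothetical. It builds the items in the shape shown in the question, serializes them exactly once, and on the consuming side accepts `Response` either as a real array or as a JSON-encoded string, which is the form the question reports.

```python
import json


def build_items(names):
    """Build one copy descriptor per collection, in the shape from the question."""
    return [
        {"source": {"collectionName": n}, "destination": {"fileName": n + ".txt"}}
        for n in names
    ]


def response_body(names):
    # Serialize exactly once so "Response" stays a JSON array, not a string.
    return json.dumps({"Response": build_items(names)})


def decode_response(output):
    """Return the item list whether "Response" is an array or a JSON string."""
    value = output["Response"]
    return json.loads(value) if isinstance(value, str) else value


body = response_body(["model1", "model2"])
items = decode_response(json.loads(body))

# This loop stands in for the For-Each activity.
for item in items:
    print(item["source"]["collectionName"], "->", item["destination"]["fileName"])
```

    If the activity output still arrives with the array encoded as a string, the ADF expression language has a `json()` conversion function that may serve the same purpose as `decode_response` here.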