Tags: azure-data-factory, azure-cosmosdb-sqlapi, azure-data-lake-gen2

Stuck with the basics of an ADF pipeline


I have just started with ADF development, and I am stuck with the following requirement.

  • source - Cosmos DB collection
  • target - Azure Data Lake Gen2 storage
  • requirement -
    1. Read the Cosmos DB collection.
    2. For each record in the query output, create a separate output file in Data Lake Gen2 storage.
    3. From the same query output, create another single file with fields separated by the pipe character (here I need to transform the JSON query output into a tabular, pipe-delimited data structure).
  • where I am stuck - to read from the Cosmos DB collection I have used a Copy Data activity, but I do not understand what to put in the sink of this activity in order to pass the query output on to the next activity (a ForEach or another Copy activity).

Solution

  • You can use the combination of a Lookup activity to get the collection and a Copy activity (inside a ForEach) to write the files to ADLS.

    In the example below, I have used a field from each record as the iteration parameter, i.e. @string(item().CustomerID), but you can use whichever field fits your collection, and modify the query as needed.
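    The Lookup-then-iterate wiring described above can be sketched as a pipeline fragment; this is an illustrative outline only, and the activity and dataset names (Lookup1, ForEach1, CopyPerRecord, CosmosSourceDataset, AdlsSinkDataset) are placeholders, not names from the original pipeline:

    ```json
    {
      "name": "ForEach1",
      "type": "ForEach",
      "dependsOn": [
        { "activity": "Lookup1", "dependencyConditions": [ "Succeeded" ] }
      ],
      "typeProperties": {
        "items": {
          "value": "@activity('Lookup1').output.value",
          "type": "Expression"
        },
        "activities": [
          {
            "name": "CopyPerRecord",
            "type": "Copy",
            "inputs": [ { "referenceName": "CosmosSourceDataset", "type": "DatasetReference" } ],
            "outputs": [ { "referenceName": "AdlsSinkDataset", "type": "DatasetReference" } ]
          }
        ]
      }
    }
    ```

    Inside the ForEach, the Copy activity's source query and the sink file name can both reference the current record, e.g. via @string(item().CustomerID), so each iteration produces its own file.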

    (requirement no. 1)

    CosmosDB source query:

    @concat('select * from c where c.CustomerID = ',variables('ids'))
    

    Then make sure the right file extension is set in the sink properties and in the corresponding dataset properties as well. I have used a .tsv file with the pipe (|) as the column delimiter, and a .csv file for the other output. (as per your requirements no. 2 & 3)
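    The JSON-to-pipe-delimited reshaping in requirement no. 3 is what the delimited-text sink does for you; as a standalone illustration of the same transformation (with made-up sample records, not data from the original post), it looks like this:

    ```python
    import csv
    import io
    import json

    # Hypothetical sample of what the Cosmos DB query output might look like.
    records_json = '''[
      {"CustomerID": "C001", "Name": "Alice", "City": "Seattle"},
      {"CustomerID": "C002", "Name": "Bob", "City": "Austin"}
    ]'''

    records = json.loads(records_json)

    # Flatten the JSON rows into a pipe-delimited table: one header line,
    # then one line per record, fields separated by '|'.
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()),
                            delimiter='|', lineterminator='\n')
    writer.writeheader()
    writer.writerows(records)

    pipe_delimited = buf.getvalue()
    print(pipe_delimited)
    ```

    This mirrors the single-file output: the dataset's column delimiter setting in ADF plays the role of `delimiter='|'` here.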
