Tags: azure-blob-storage, azure-data-factory, azure-cosmosdb, union, pipeline

Inserting value stored in a pipeline variable as a new row of sink dataset data


In ADF, I generate pipeline variables such as the pipeline run time, rows copied count, copy status, etc. from various activities. I then use a Set Variable activity to create a JSON string consisting of the above values. Now I want to store this string in an existing Cosmos DB container as a new record.
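
For illustration, such a JSON string could be built in the Set Variable activity with an expression along these lines (the string variables pipelineRunTime, copiedRowsCount and copyStatus are placeholder names, not taken from the question):

```
@concat('{"pipelineRunTime":"', variables('pipelineRunTime'),
    '","copiedRowsCount":', variables('copiedRowsCount'),
    ',"copyStatus":"', variables('copyStatus'), '"}')
```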

I cannot use Azure Functions or a Notebook job to insert, as I want it to be possible within ADF without additional dependencies.

I have tried using a Data Flow, but it cannot add a new record from values held in pipeline variables. I also cannot union a separate data source and load it into Cosmos DB, because the values I want to insert are only generated at runtime.


Solution

  • You can use the individual pipeline variables (pipeline run time, copied rows count, copy status) as they are, without creating a JSON document of all the values, and use a Copy activity in ADF to insert them into the Cosmos DB container. Below is the approach, followed by JSON sketches of the relevant activities.

    • Three pipeline variables, pipeline run time, copied rows count and copy status, are declared, and their values are set using Set Variable activities (see the first sketch after this list).


    • Then a Copy activity is added with a dummy source dataset that contains a single row. In the Source tab of the Copy activity, new columns are added by clicking +New under Additional columns, and each of the three columns is given the value of the corresponding pipeline variable.


    • The sink dataset points to the Cosmos DB container, with Insert as the write behavior.


    • In the Mapping section, the schema is imported by supplying sample values for the pipeline variables, and all the additional columns created in the source are mapped to the sink columns (see the Copy activity sketch after this list).


    • When the pipeline is run, the values are inserted into the Cosmos DB container as a new record.
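
A minimal sketch of one Set Variable step, assuming the value comes from a preceding Copy activity named Copy source data and that the pipeline declares a string variable named copiedRowsCount (these names are assumptions, not from the original post):

```json
{
  "name": "Set copiedRowsCount",
  "type": "SetVariable",
  "dependsOn": [
    { "activity": "Copy source data", "dependencyConditions": [ "Succeeded" ] }
  ],
  "typeProperties": {
    "variableName": "copiedRowsCount",
    "value": {
      "value": "@string(activity('Copy source data').output.rowsCopied)",
      "type": "Expression"
    }
  }
}
```

A second Set Variable activity can read the copy status from @activity('Copy source data').output.executionDetails[0].status in the same way, and the run time can be taken from @pipeline().TriggerTime.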
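And a sketch of the Copy activity that writes the variables to Cosmos DB, covering the additional columns, the insert write behavior and the mapping. The dataset names (DummySourceDataset, CosmosDbSinkDataset), the delimited-text source type, and the column and variable names are assumptions; adjust them to your own setup:

```json
{
  "name": "Insert run stats into Cosmos DB",
  "type": "Copy",
  "inputs": [ { "referenceName": "DummySourceDataset", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "CosmosDbSinkDataset", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": {
      "type": "DelimitedTextSource",
      "additionalColumns": [
        { "name": "pipelineRunTime", "value": { "value": "@variables('pipelineRunTime')", "type": "Expression" } },
        { "name": "copiedRowsCount", "value": { "value": "@variables('copiedRowsCount')", "type": "Expression" } },
        { "name": "copyStatus", "value": { "value": "@variables('copyStatus')", "type": "Expression" } }
      ]
    },
    "sink": {
      "type": "CosmosDbSqlApiSink",
      "writeBehavior": "insert"
    },
    "translator": {
      "type": "TabularTranslator",
      "mappings": [
        { "source": { "name": "pipelineRunTime" }, "sink": { "path": "$['pipelineRunTime']" } },
        { "source": { "name": "copiedRowsCount" }, "sink": { "path": "$['copiedRowsCount']" } },
        { "source": { "name": "copyStatus" }, "sink": { "path": "$['copyStatus']" } }
      ]
    }
  }
}
```

Because the dummy source dataset has exactly one row, each pipeline run produces exactly one new document in the Cosmos DB container.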