Search code examples
azureazure-data-factoryazure-synapse

Azure Synapse Data flow, how to drop a field within an array


Looking to drop a field from an array of objects in synapse data flows.

Example data:

"Animals":[{"name":"tiger","colour":"white","location":"zoo"},{"name":"eagle","colour":"brown","location":"wild"}]

Expected Output:

"Animals":[{"name":"tiger","location":"zoo"},{"name":"eagle","location":"wild"}]

Looking for a way to drop the field. Thanks very much!


Solution

    • You can use a dataflow and copy activity to achieve your requirement. I have used a flatten transformation with following configuration:

    enter image description here

    • This would give some resulting data which I am writing to dataflow activity output using sink cache.

    enter image description here

    • In the pipeline, I have taken a copy data activity after the dataflow activity with source file as follows (1 row and 1 column which will be ignored anyway).

    enter image description here

    • Now, add an additional column as shown below:
    {"Animals":@{activity('Data flow1').output.runStatus.output.sink1.Value}}
    

    enter image description here

    • Now, choose sink file as delimited file with following configurations:

    enter image description here

    • In the mapping, keep only the above created additional column and delete the rest.

    enter image description here

    • When you run the pipeline, you would get the data as required. You can create a JSON dataset and use this file as a JSON file.

    enter image description here