Search code examples
azure-data-factory

How to read a CSV into array variable ignoring the auto-assigned headers ("Prop_#), preferably without FE loop


I have a process that read message IDs into an array and later saves them into a .csv file to preserve these IDs for later runs (I need to check incoming files to see if the same ID has already been reported).

Saving the array into the csv is not a problem, however when I read it back from the CSV into the array it comes with the sort of automatic header assigned by ADF seeing my dataset does not have headers:

Resulting CSV file You can see above the contents of the CSV file where the "clean IDs" came from the latest run, meaning they were just saved into the file, but the ones read from the file in the beginning have the "Prop_0: " in them. And this further compounds as the more times you read from the files, the records keep getting "Prop_0"s attached to them.

I wanted a way to just get the ID into the variable, preferably without resorting to an FE loop seeing it might impact the performance.

Currently I am loading them into the array like this: Loading the file via a lookup and a set variable activities

@activity('LookupLog').output.value

I have also tried a simple trick of casting the output.value into a string and using a replace function to get rid of the "Prop_0" part and reverting it back into an array but I was unsuccessful:

The replace attempt

Resulting csv from the replace attempt


Solution

  • I have also tried a simple trick of casting the output.value into a string and using a replace function to get rid of the "Prop_0" part and reverting it back into an array but I was unsuccessful:

    You can use the below expression to get the required array without auto-assigned headers.

    @skip(split(replace(replace(replace(replace(replace(replace(string(activity('Lookup1').output.value),'"',''),'[',''),']',''),'{',''),',',''),'}',''),'Prop_0:'),1)
    

    This is my sample Lookup output array:

     [
            {
                "Prop_0": "ID00034320240513001901125"
            },
            {
                "Prop_0": "ID12345678908763243744788"
            },
            {
                "Prop_0": "ID65811993776814561728290"
            },
            {
                "Prop_0": "ID24262426242624262426246"
            }
    ]
    

    Use the above expression a set variable activity of array type.

    enter image description here

    Required array:

    enter image description here

    You can store the above array into a csv file as per your requirement.