Search code examples
azure-data-factory

How to create json object from set of items


In Azure Data Factory, I have a Pipeline (made of flowlets, but that's a technicality) that is more or less the following flow:

  1. Get a set of items from a Data Set (let's say : I get 5 cars, each car has its "columns" -- id, color, model, ...)

  2. turn that set into an array : I do it with an Aggregate block which contains a "collect" script function.

What I want :

I would like step 2 to create an object, not an array.

If this was json, this is what I would like:

//NOT THIS
[
 { "id":"1", "model":"", "color":"red" }, 
 { "id":"2", "model":"", "color":"blue" }, 
]


//THIS
{
 "1": { "model":"", "color":"red" }, 
 "2": { "model":"", "color":"blue" }, 
}

I've tried working with the objects as strings and then using a bunch of "replace" to turn [ ] into { } ... but that's just too much grinding -- and more importantly there's too high a risk that I make a mistake with character escape.

How would you turn a set of items into one object instead of an array?

Note: the end goal is to later be able to work with the cars as a dictionary rather than a collection, in programming terms. I just added this for anyone who might be googling this without knowing exactly what they're looking for.


Solution

  • The other answer says that the question is nonsensical. But it's not entirely true.

    If your goal is really to create a dictionary, which is to say that you really want to have one column per Id, i.e. one output column per input row then you can investigate the following Azure Data Factory field : "rows to columns", aka pivot. for example : Rows to Columns in ADF

    Pay close attention to the explanations on "Allow schema drift" to manage the evolution of the schema as the columns get transformed.

    Note : of course you want to be sure that there will be a reasonably low number of input Ids -- you don't want to create one giant dictionary. I don't know how ADF would react to that, performance-wise.

    Note: In this answer I'm discussing only the Ids transformation, but of course there's still the matter of turning the other fields/columns into the dictionary object meant to match the key. That would be done the same way as in the other answer (look at the part dealing with "ALL OTHER FIELDS").