Search code examples
azure-cosmosdbazure-data-factory

How to use ADF map() function


I need to perform a data transformation in ADF and I'm having difficulty understanding how to use the map() function in DataFlow.

In the input container, I have items with a format similar to the example below:

{
    "companyName": "ExampleName",
    "departmentManager": [
        {
            "supervisor": "Jhon",
            "department": "Marketing",
            "salary": "$135000"
        },
        {
            "supervisor": "Emily",
            "department": "Human Resources",
            "salary": "$135000"
        }
    ],
    "id": "123456798"
}

I have a container in Cosmos that I retrieve several items and apply derived column and map() to transform the departmentManager column before sending it to another container. The output format will be as follows:

{
    "companyName": "ExampleName",
    "departmentManager": [
        {
            "supervisor": "Jhon",
            "department": "Marketing",
            "salary": 135000
        },
        {
            "supervisor": "Emily",
            "department": "Human Resources",
            "salary": 135000
        }
    ],
    "id": "123456798"
}

As you can see, I am changing the type from string to long. How can I achieve this goal using the map() function?

I was trying to use the following approach:

map(item -> 
    {
        "supervisor": item.supervisor,
        "department": item.department,
        "salary": toLong(replace(item.salary, "$", ""))
    })

But I am getting this error:

enter image description here

I was looking at the documentation, but the example is not very helpful. https://learn.microsoft.com/en-us/azure/data-factory/data-flow-expressions-usage#map


Solution

  • There are two different approaches I have tried converting the salary value to long.

    Approach1 using Copy Data/Lookup Activity of ADF and Cosmos DB query.

    Configuration for Copy Data/Lookup Activity enter image description here

    Cosmos DB Query

    SELECT f.companyName,ARRAY(
    select c.supervisor,
    c.department,
    udf.convertToInt(replace(c.salary,"$","")) as salary FROM c
    in f.departmentManager)as departmentManager, f.id from f
    

    Code for udf.convertToInt

    `function convertToInt(strNum) {return parseInt(strNum);}`;
    

    Result

    enter image description here

    Approach2 using dataflow Here is the flow I created enter image description here

    1. Source1 is creating source connection with Cosmos DB account.
    2. flatten1 using this activity I have flatten the departmentManager structure to tabular format using configuration for Unroll by and Unroll root.

    Configuration enter image description here Result-/Preview enter image description here

    1. derivedColumn1 added a column departmentManager with expression. Here I have written the expression to convert string to long conversion.

    Code for expression @(supervisor=departmentManager.supervisor, department=departmentManager.department, salary=toLong(replace(departmentManager.salary,"$","")))

    Configurationenter image description here

    Result-/Preview Here you can see datatype of salary is showing as long.enter image description here

    1. aggregate1 used this to convert flatten structure again to Json structure. Used companyName in for Group by and under Aggregates tab used collect(departmentManager) expression for departmentManager column.

    Configuration

    enter image description here

    enter image description here Result-/Preview enter image description here