Search code examples
muledataweavemulesoftmule4

Dataweave transfromation from nested JSON to CSV format


I have huge nested JSON that is being fed as a n input to the API. The input JSON looks like:

[{
  "ID": "123456",
  "accountNumber": "876545",
  "contractType": "ABCD",
  "linedata": 
    {
      "productSKU": "1234",
      "data": {
        "Line_Number": 1726428,
        "Reservation_Number": "1726428",
        "Quantity": 1,
        "UnitPrice": 16,
      }
    }
  },
  {
  "ID": "345678",
  "accountNumber": "445566",
  "contractType": "ABCD",
  "linedata": 
    {
      "productSKU": "2345",
      "data": {
        "Line_Number": 3213445565,
        "Reservation_Number": "123456789",
        "Quantity": 1,
        "UnitPrice": 25,
      }
    }
  }] 

I need to convert this into a CSV flat fromat retaining the JSON elements for linedata attribute which looks like:

Data, ID, AccountNumber, Contract_Type
{"productSKU":"1234","data":{"Line_Number":1726428, "Reservation_Number":"1726428", "Quantity":1,  "UnitPrice":15}}, 123456, 876545, ABCD
{"productSKU":"1234","data":{"Line_Number":1726428, "Reservation_Number":"1726428", "Quantity":1,  "UnitPrice":15}}, 345678, 445566, ABCD

The problem is that when I try to concatenate the linedata attribute as string, I am left with backslashes in the output. My datweave code looks like

inputJSON map ((item) -> {
  Data: write(item.linedata, "application/json"),
  ID: item.ID,
  AccountNumber: item.accountNumber,
  Contract_Type: item.contractType
})

and the output looks like:

Data,ID,AccountNumber,Contract_Type
{\
  \"productSKU\": \"1234\"\,\
  \"data\": {\
    \"Line_Number\": 1726428\,\
    \"Reservation_Number\": \"1726428\"\,\
    \"Quantity\": 1\,\
    \"UnitPrice\": 16\
  }\
},123456,876545,ABCD
{\
  \"productSKU\": \"2345\"\,\
  \"data\": {\
    \"Line_Number\": 3213445565\,\
    \"Reservation_Number\": \"123456789\"\,\
    \"Quantity\": 1\,\
    \"UnitPrice\": 25\
  }\
},345678,445566,ABCD

Seems like I am missing something basic here. What is the best way to achieve this ?

I further updated my dataweave function to be

%dw 2.0
output application/csv quoteValues=true, escape=''

var inputJSON = payload

// First step: Create manually concatenated JSON strings for 'linedata'
var jsonStringArray = inputJSON map ((item) -> {
  Data: "{\"productSKU\":\"" ++ item.linedata.productSKU ++ "\",\"data\":{\"Line_Number\":" ++ item.linedata.data.Line_Number as String ++ ",\"Reservation_Number\":\"" ++ item.linedata.data.Reservation_Number ++ "\",\"Quantity\":" ++ item.linedata.data.Quantity as String ++ ",\"UnitPrice\":" ++ item.linedata.data.UnitPrice as String ++ "}}",
  ID: item.ID,
  AccountNumber: item.accountNumber,
  Contract_Type: item.contractType
})

// Second step: Format the CSV output and remove backslashes
---
jsonStringArray map ((item) -> {
  Data: item.Data,
  ID: item.ID,
  AccountNumber: item.AccountNumber,
  Contract_Type: item.Contract_Type
})

this removes the backslashes but my csv doesn't understand that I intended only 4 columns and breaks it at the first comma inside the Data Column right after "productSKU":"1234"

The dataweave output looks like -

Data,ID,AccountNumber,Contract_Type
"{"productSKU":"1234","data":{"Line_Number":1726428,"Reservation_Number":"1726428","Quantity":1,"UnitPrice":16}}","123456","876545","ABCD"
"{"productSKU":"2345","data":{"Line_Number":3213445565,"Reservation_Number":"123456789","Quantity":1,"UnitPrice":25}}","345678","445566","ABCD"

Solution

  • The problem is that the linedata field contains characters that by default are used in the CSV format as field separator (comma) and quoting (quotes). They need to be escaped in the output.

    One alternative is to change those characters. Since the example doesn't use quotes for delimiting strings it may be ok but changing the separator character for example to a pipe (|) will change the expected output.

    The issue of having newlines and indenting can be resolved by using the indent output property configuration on write().

    %dw 2.0
    output csv quote="'", separator="|" 
    ---
    payload map {
        Data: write($.linedata, "application/json", {indent:false}),
        ID: $.ID,
        AccountNumber: $.accountNumber,
        Contract_Type: $.contractType
    }
    

    Output:

    Data|ID|AccountNumber|Contract_Type
    {"productSKU": "1234","data": {"Line_Number": 1726428,"Reservation_Number": "1726428","Quantity": 1,"UnitPrice": 16}}|123456|876545|ABCD
    {"productSKU": "2345","data": {"Line_Number": 3213445565,"Reservation_Number": "123456789","Quantity": 1,"UnitPrice": 25}}|345678|445566|ABCD
    

    You can change only one of the properties or both.