
Recursively traverse and flatten JSON object in DataWeave


I want to traverse and flatten a big JSON file with the following structure, which represents a product hierarchy (think of it as the navigation in an online shop):

productGroups: [
    {
        "key": "child 1",
        ...,
        "childrenProductGroups": [
            {
                "key": "child 1.1",
                ...,
                "childrenProductGroups": []
            },
            {
                "key": "child 1.2",
                ...,
                "childrenProductGroups": [
                    {
                        "key": "child 1.2.1",
                        ...,
                        "childrenProductGroups": [
                            {
                                "key": "child 1.2.1.1",
                                ...,
                                "childrenProductGroups": [
                                    ...
                                ]
                            }
                        ]
                    },
                    {
                        "key": "child 1.2.2",
                        ...,
                        "childrenProductGroups": []
                    }
                ]
                
            },
            {
                "key": "child 1.3",
                ...,
                "childrenProductGroups": [
                    ...
                ]
            }
        ]
    }, 
    {
        "key": "child 2",
        ...,
        "childrenProductGroups": [
            ...
        ]
    },
    {
        "key": "child 3",
        ...,
        "childrenProductGroups": [
            ...
        ]
    }
]

And I want to flatten them in a format like this:

{
    "hierarchieSet": [
        {
            "Nodeid": "00000001", # Number in this json
            "Nodename": "child 1",
            "Tlevel": "01", # First child of product group
            "Parentid": "00000000", # Parent is null
            "Childid": "00000002", # Child node number
            "Nextid": "00000008" # Node number on the same level (child 2)
        }, 
        {
            "Nodeid": "00000002",
            "Nodename": "child 1.1",
            "Tlevel": "02",
            "Parentid": "00000001",
            "Childid": "00000003",
            "Nextid": "00000003"
        }, 
        {
            "Nodeid": "00000003",
            "Nodename": "child 1.2",
            "Tlevel": "02",
            "Parentid": "00000002",
            "Childid": "00000005",
            "Nextid": "00000007"
        }, 
        {
            "Nodeid": "00000004",
            "Nodename": "child 1.2.1",
            "Tlevel": "03",
            "Parentid": "00000003",
            "Childid": "00000005",
            "Nextid": "00000006"
        }, 
        {
            "Nodeid": "00000005",
            "Nodename": "child 1.2.1.1",
            "Tlevel": "04",
            "Parentid": "00000004",
            "Childid": "00000000", # No more children
            "Nextid": "00000000"
        }, 
        {
            "Nodeid": "00000006",
            "Nodename": "child 1.2.2",
            "Tlevel": "03",
            "Parentid": "00000003",
            "Childid": "00000000",
            "Nextid": "00000000"
        }, 
        {
            "Nodeid": "00000007",
            "Nodename": "child 1.3",
            "Tlevel": "02",
            "Parentid": "00000001",
            "Childid": "00000000",
            "Nextid": "00000000"
        }, 
        {
            "Nodeid": "00000008",
            "Nodename": "child 2",
            "Tlevel": "01",
            "Parentid": "00000000",
            "Childid": "00000009", # 00000009 not shown
            "Nextid": "00000014"
        }, 
        ...
        {
            "Nodeid": "00000014",
            "Nodename": "child 3",
            "Tlevel": "01",
            "Parentid": "00000000",
            "Childid": "00000015",
            "Nextid": "00000000" # 00000010 does not exist
        }
    ]
}

Thus I have identified some main concerns:

  • Recursion of the tree structure
  • Transforming the elements
  • Flattening the structure
  • Keeping track of parents, siblings and children
  • Keeping track of recursion level
  • Formatting numbers
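For the last two concerns (recursion level and number formatting), the zero-padded strings in the desired output can be produced with plain String.format. This is only a minimal sketch: the class name IdFormat and the method names nodeId and level are made up for illustration, and the widths (8 digits for ids, 2 digits for levels) are read off the sample output above.

```java
import java.util.Locale;

// Hypothetical helper, not part of the original code.
class IdFormat {
    // Zero-pads a node number to 8 digits, matching fields like Nodeid and Parentid.
    static String nodeId(int id) {
        return String.format(Locale.ROOT, "%08d", id);
    }

    // Zero-pads a recursion level to 2 digits, matching the Tlevel field.
    static String level(int level) {
        return String.format(Locale.ROOT, "%02d", level);
    }

    public static void main(String[] args) {
        System.out.println(nodeId(14));
        System.out.println(level(3));
    }
}
```

Locale.ROOT is passed so the digits do not depend on the default locale of the machine running the flow.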

I tried to solve this with two different approaches:

  • Use DataWeave to transform all elements
  • Use Java to traverse the structure

As I'm fairly new to functional programming, I put more focus on the Java implementation, but I ran into a number of issues.

Java approach
Java Flow
Read json > Init Tree var and assign the Java instance > for each element in top-level array invoke traverse(data, level) in Tree.java.
Tree.java:

import java.util.ArrayList;
import java.util.List;

import org.json.JSONObject;

public class Tree {
    private int id = 0;
    private List<Node> nodes = new ArrayList<Node>();
    
    public Tree() {
        nodes.add(new Node("01", "00000001", "HOME", "01", "00000000", "00000002", "00000000"));
    }
    
    public void traverse(String data, int level) {
        System.out.println(data);
        // TODO parse json
    }
    
    private void visit(JSONObject parent, JSONObject node, int level) {
        id++;
        nodes.add(new Node("01", String.valueOf(id), node.getString("key"), String.valueOf(level), "", "", ""));
    }
    
    public List<Node> getNodes() {
        return nodes;
    }

    private static class Node {
        private String zshop, nodename, parentid, childid, nextid, nodeid, tlevel;
        
        public Node(String zshop, String nodeid, String nodename, String tlevel, String parentid, String childid, String nextid) {
            this.zshop = zshop;
            this.nodeid = nodeid;
            this.nodename = nodename;
            this.tlevel = tlevel;
            this.parentid = parentid;
            this.childid = childid;
            this.nextid = nextid;
        }
    }
}

When calling the invoke action I use this payload:

%dw 2.0
output application/java
---
{
    data: vars.rootMessage.payload as String,
    level: 1
}

But this yields the following error:

"Cannot coerce Object { encoding: UTF-8, mediaType: text/json; charset=UTF-8, mimeType: text/json, raw: org.mule.weave.v2.el.SeekableCursorStream@50ecee52 } (org.mule.weave.v2.el.MuleTypedValue@511ba9cc) to String

5| data: vars.rootMessage.payload as String, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Trace: at main (line: 5, column: 7)" evaluating expression: "%dw 2.0 output application/java --- { data: vars.rootMessage.payload as String, level: 1 }".

I tried a number of things:

  • Cast it to a ProductGroup object I wrote in Java
  • Try to cast the object retrieved to org.json.JSONObject
  • Try to buffer and read vars.rootMessage.payload (Binary)

But I wasn't able to solve it with any of these attempts.

DataWeave approach
My .dw script:

%dw 2.0
fun append
(item: Object, acc: Object = {
}) = acc ++ item

fun mapper(item: Object) = 
{
    Zshop: "01",
    Nodeid: "00000000",
    Nodename: item.key as Number as String {format: ""},
    Tlevel: "02",
    Parentid: "00000000",
    Childid: "00000000",
    Nextid: "00000000"
}
    
fun traverse(a: Array, level: Number) = 
    a map $ flatMap(value, index) -> value
    
output application/json
---
{
    test: payload.productGroups reduce (item, acc) -> append(mapper(item), acc)
}

Here I tried to solve some of the problems. mapper(item) should create JSON objects that I can append to the final output with append(item, acc). Recursion has been sketched, but it is not my main concern yet.

This yields the following result:

(original payload),
"Zshop": "01",
"Nodeid": "00000000",
"Nodename": "800",
"Tlevel": "02",
"Parentid": "00000000",
"Childid": "00000000",
"Nextid": "00000000",
"Zshop": "01",
"Nodeid": "00000000",
"Nodename": "110",
"Tlevel": "02",
"Parentid": "00000000",
"Childid": "00000000",
"Nextid": "00000000",
"Zshop": "01",
"Nodeid": "00000000",
"Nodename": "720",
"Tlevel": "02",
"Parentid": "00000000",
"Childid": "00000000",
"Nextid": "00000000",
"Zshop": "01",
"Nodeid": "00000000",
"Nodename": "710",
"Tlevel": "02",
"Parentid": "00000000",
"Childid": "00000000",
"Nextid": "00000000",
...

I wonder why I'm getting a flat result without any object structure.

My questions:

  • Java: Why can't I cast the String, or how is it done properly?
  • DataWeave: Is there an easy solution I don't see?
  • Why is it a flat result and not an object?
  • Are the usages of the reduce and flatMap functions correct for this purpose?

Any help and / or feedback is welcome.


Solution

  • Java: Why can't I cast the String or how is it done properly?

    JSON is not a String. Use write(payload, 'application/json') to get a String.

  • DataWeave: Is there an easy solution I don't see?

    Just pass the object; it is a Map in Java. Since it is a tree, each branch is another Map inside this Map.

  • Why is it a flat result and not an object?

    It is ALWAYS an Object; there is nothing else in the Java world. The result only looks flat because append uses ++, which concatenates the key/value pairs of both objects into a single object, so the fields of every mapped item end up side by side instead of in an array of objects.

  • Are the usages of the reduce and flatMap functions correct for this purpose?

    No. mapObject and recursion should be a good approach.
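To illustrate the "just pass the object" suggestion, here is a rough Java sketch of a recursive pre-order traversal over the Map/List structure. It assumes the payload arrives in Java as nested Map and List instances, which is what output application/java normally produces for objects and arrays. The class name HierarchyFlattener and its method names are invented for this example, and the meaning of the fields is my reading of the question: Childid is the id of the first child, Nextid is the id of the next sibling, and "00000000" means "none".

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical class; field semantics are assumptions based on the question.
class HierarchyFlattener {
    private static final String NONE = "00000000";
    private final List<Map<String, String>> rows = new ArrayList<>();

    private static String id(int n) {
        return String.format(Locale.ROOT, "%08d", n);
    }

    /** Flattens the top-level groups and all descendants in pre-order. */
    static List<Map<String, String>> flatten(List<Map<String, Object>> groups) {
        HierarchyFlattener flattener = new HierarchyFlattener();
        flattener.visitSiblings(groups, NONE, 1);
        return flattener.rows;
    }

    @SuppressWarnings("unchecked")
    private void visitSiblings(List<Map<String, Object>> siblings, String parentId, int level) {
        Map<String, String> previous = null;
        for (Map<String, Object> group : siblings) {
            // Ids are assigned in visiting order, so the next free id is rows.size() + 1.
            String nodeId = id(rows.size() + 1);
            if (previous != null) {
                previous.put("Nextid", nodeId); // link the previous sibling to this one
            }
            List<Map<String, Object>> children =
                    (List<Map<String, Object>>) group.get("childrenProductGroups");
            boolean hasChildren = children != null && !children.isEmpty();

            Map<String, String> row = new LinkedHashMap<>();
            row.put("Nodeid", nodeId);
            row.put("Nodename", String.valueOf(group.get("key")));
            row.put("Tlevel", String.format(Locale.ROOT, "%02d", level));
            row.put("Parentid", parentId);
            // The first child, if any, is visited right after this node.
            row.put("Childid", hasChildren ? id(rows.size() + 2) : NONE);
            row.put("Nextid", NONE); // overwritten if a later sibling exists
            rows.add(row);

            if (hasChildren) {
                visitSiblings(children, nodeId, level + 1);
            }
            previous = row;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> leaf = Map.of("key", "child 1.1", "childrenProductGroups", List.of());
        Map<String, Object> root = Map.of("key", "child 1", "childrenProductGroups", List.of(leaf));
        flatten(List.of(root)).forEach(System.out::println);
    }
}
```

Because ids are handed out in visiting order, a single counter (here the size of the result list) is enough to keep track of parents, first children and next siblings; no second pass over the tree is needed. The result could then be wrapped as hierarchieSet on the DataWeave side.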