node.js gremlin amazon-neptune

Project to Tree (looking like JSON) using gremlin


I'm trying to project a JSON-like tree from the data in my TinkerPop 3.5 database (running in a local container; AWS Neptune is used when deployed). I have tried using .tree() after my repeat, but with Gremlin for Node.js the resulting structure is difficult to work with: it adds @type / @value as keys, and everything is wrapped in arrays (including the values and key names).

What I want to achieve looks like the code below, but I'd like to "repeat" the __.out steps until there are no more steps to take (the entire tree). I have tried using repeat, but I somehow need to reference the repeated value and put it in the "children" projection.

        const v = await this.g.V(partnerId)
            .out('uses_process').project('title', 'children').by('title')
            .by(
                // This is what I want to "repeat" instead of this hard-coded depth
                __.out('uses_process').project('title', 'children').by('title').by(
                    __.out('uses_process').project('title', 'children').by('title').by(
                        __.out('uses_process').project('title').by('title'),
                    ).fold(),
                ).fold(),
            )
            .next();

Expected output (and the output I get from the above code, although hard-coded to 3 levels deep):

    {
      "title": "1st Level",
      "children": [
        {
          "title": "2nd level",
          "children": [
            {
              "title": "3rd level",
              "children": {
                "title": "4th level"
              }
            }
          ]
        }
      ]
    }

Here is some sample data:

    g.addV('Process').property('title', '1st level').as('root').
      addV('Process').property('title', '2nd level').as('2nd').addE('uses_process').from('root').to('2nd').
      addV('Process').property('title', '3rd level').as('3rd').addE('uses_process').from('2nd').to('3rd').
      addV('Process').property('title', '4th level').as('4th').addE('uses_process').from('3rd').to('4th')

Would it be possible to get the expected output without using .tree() (parsing its output would be tricky given what the current response looks like)?


Solution

  • The Node.js Gremlin client does not have a built-in Tree object, so what you see there is essentially the GraphSON representation of a tree. This is also true for other clients, such as the Python one. Only the Java (JVM) based clients have a native Tree object that the result can be deserialized into.

    Using Node.js, your best approach is most likely to use some combination of group and project steps to build a data structure similar in shape to the one that you need.

    If you can add some sample data to the question (the addV and addE steps to build a sample graph), it will help provide a more in-depth answer.
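
If parsing the raw result is still of interest, a minimal sketch of a converter follows. It assumes the response has the GraphSON 3 `g:Tree` shape (an object with `@type` and `@value`, where `@value` is an array of `{ key, value }` entries and each `value` is another tree), and that the traversal ends in `.tree().by('title')` so that keys are plain strings. The function name `graphsonTreeToJson` is hypothetical, and the exact shape your client hands back may differ, so inspect a real response first.

```javascript
// Assumed input shape: { '@type': 'g:Tree', '@value': [ { key, value }, ... ] },
// where every `value` is itself a tree of the same shape.
// graphsonTreeToJson is a hypothetical helper, not part of the Gremlin client.
function graphsonTreeToJson(tree) {
  const entries = (tree && tree['@value']) || [];
  return entries.map(({ key, value }) => ({
    title: key,                          // assumes .tree().by('title'), so key is a string
    children: graphsonTreeToJson(value), // recurse into the nested tree
  }));
}

// Hand-built sample in the assumed shape (not captured from a real server).
const raw = {
  '@type': 'g:Tree',
  '@value': [
    {
      key: '1st level',
      value: {
        '@type': 'g:Tree',
        '@value': [
          { key: '2nd level', value: { '@type': 'g:Tree', '@value': [] } },
        ],
      },
    },
  ],
};

const roots = graphsonTreeToJson(raw);
console.log(JSON.stringify(roots, null, 2));
```

Unlike the expected output in the question, every node here gets a `children` array, including leaves (where it is empty), which tends to be easier to consume.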

    UPDATED 2023-02-28 based on discussion in comments.

    When the group step is given a label (x in this case), it behaves as a side effect, passing through whatever flowed into it. This can be handy for situations like this. We can group by the depth of the repeat, obtained from loops, to build a sort of tree-like structure. Using your sample data:

    g.V().has('title','1st level').
      group('x').by(constant(-1)).by('title').
      repeat(outE().inV().group('x').by(loops()).by('title')).until(not(out())).
      cap('x')
    

    Which produces:

    {-1: ['1st level'], 0: ['2nd level'], 1: ['3rd level'], 2: ['4th level']}
    

    I used -1 for the root level because loops starts at zero, and we need to group before the repeat starts. If the group were placed only inside the repeat body, we would lose either the first or the last node, depending on where it is placed.
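
Grouping by depth does not record which parent each node belongs to, so it only maps cleanly onto the nested shape from the question when the graph is a single chain. As an alternative technique (not the group-based approach above), a sketch under stated assumptions: fetch root-to-leaf paths and merge them on the client. The helper `pathsToTree` is hypothetical, and it assumes sibling titles are unique; if they are not, merge on vertex ids instead. The traversal in the comment is untested against a live server and assumes the usual `g` and `__` from the gremlin package.

```javascript
// Assumed way to obtain the paths with gremlin-javascript (not executed here):
//
//   const paths = await g.V().has('title', '1st level')
//     .repeat(__.out('uses_process'))
//     .until(__.not(__.out('uses_process')))
//     .path().by('title')
//     .toList();
//   const tree = pathsToTree(paths.map((p) => p.objects));

// Hypothetical helper: merges root-to-leaf paths (arrays of titles)
// into nested { title, children } nodes.
function pathsToTree(paths) {
  const roots = [];
  for (const path of paths) {
    let siblings = roots; // start each path at the top level
    for (const title of path) {
      // Reuse an existing node when paths share a prefix.
      let node = siblings.find((n) => n.title === title);
      if (!node) {
        node = { title, children: [] };
        siblings.push(node);
      }
      siblings = node.children; // descend one level
    }
  }
  return roots;
}

// The single path that the sample data in the question would yield.
const tree = pathsToTree([
  ['1st level', '2nd level', '3rd level', '4th level'],
]);
console.log(JSON.stringify(tree, null, 2));
```

Because shared prefixes are merged, branching graphs work too: the paths `['a', 'b']` and `['a', 'c']` produce one root `a` with two children.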