Map gremlin projection results


Imagine a post can be tagged by connecting unique tag vertices to it, with the twist that tags come from different sources. The source is recorded on the connecting edge, and there is at most one edge per source between a given post and tag. This query returns posts and their tags, with each tag carrying its source and score:

  g.V().hasLabel('post').
    project('title', 'tags').
      by('title').
      by(inE().as('rel').
        hasLabel('tags').project('value', '_rel').
          by(outV().hasLabel('tag').values('value')).
          by(project('score', 'source').
              by('score').by('source')).
        group().
          by(values('value')))

The result is in the following shape:

[
  {
    "title": "x",
    "tags": {
      "dog": [
        {
          "value": "dog",
          "_rel": {
            "score": 5,
            "source": "user"
          }
        }
      ],
      "cat": [
        {
          "value": "cat",
          "_rel": {
            "score": 3,
            "source": "user"
          }
        },
        {
          "value": "cat",
          "_rel": {
            "score": 2,
            "source": "google"
          }
        }
      ]
    }
  }
]

The question is: how can this shape be transformed into the more compact form below?

{
  "title": "x",
  "tags": [
    {
      "value": "dog",
      "_rel": [
        {
          "score": 5,
          "source": "user"
        }
      ]
    },
    {
      "value": "cat",
      "_rel": [
        {
          "score": 2,
          "source": "google"
        },
        {
          "score": 3,
          "source": "user"
        }
      ]
    }
  ]
}

This is a TypeScript (and Neptune) implementation, so some limitations apply.

A minimal sample graph is available at https://gremlify.com/khvizyzeqg and below:

  g.addV('tag').as('1').
    property(single, 'value', 'cat').
    addV('tag').as('2').
    property(single, 'value', 'dog').
    addV('post').as('3').
    property(single, 'title', 'x').
    addE('tags').from('1').to('3').
    property('score', 3).property('source', 'user').property('value', 'cat').
    addE('tags').from('1').to('3').
    property('score', 2).property('source', 'google').property('value', 'cat').
    addE('tags').from('2').to('3').
    property('score', 5).property('source', 'user').property('value', 'dog')

Solution

  • This first answer isn't quite what you asked for, because the format doesn't exactly match the example you presented. I think it's still worth considering, though, since it offers a more compact data structure with the same content and the Gremlin is quite readable:

    gremlin> g.V().hasLabel('post').
    ......1>   project('title', 'tags').
    ......2>     by('title').
    ......3>     by(inE().hasLabel('tags').
    ......4>        group().
    ......5>          by('value').
    ......6>          by(project('score','source').
    ......7>               by('score').by('source').
    ......8>             fold()))
    ==>[title:x,tags:[cat:[[score:2,source:google],[score:3,source:mobius]],dog:[[score:5,source:mobius]]]]
    

    This second answer produces output that should match what you requested, but as you can see it forces you to deconstruct the Map above just to rebuild it in the form you wanted:

    gremlin> g.V().hasLabel('post').
    ......1>   project('title', 'tags').
    ......2>     by('title').
    ......3>     by(inE().hasLabel('tags').
    ......4>        group().
    ......5>          by('value').
    ......6>          by(project('score','source').
    ......7>               by('score').by('source').
    ......8>             fold()).
    ......9>        unfold().
    .....10>        project('value','_rel').
    .....11>          by(keys).
    .....12>          by(values).
    .....13>        fold())
    ==>[title:x,tags:[[value:cat,_rel:[[score:2,source:google],[score:3,source:mobius]]],[value:dog,_rel:[[score:5,source:mobius]]]]]
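
Since the question mentions a TypeScript client, another option is to keep the first, simpler query and reshape its map on the client. The following is only a minimal sketch: it assumes the `tags` value arrives as either a `Map` or a plain object (which one you get depends on the driver and serializer), and `Rel`, `TagEntry`, `GroupedTags` and `toTagList` are hypothetical names, not part of any library.

```typescript
// One edge projection, as produced by project('score', 'source').
interface Rel {
  score: number;
  source: string;
}

// Requested output shape for a single tag.
interface TagEntry {
  value: string;
  _rel: Rel[];
}

// Assumption: the grouped result may be a Map or a plain object,
// depending on the driver and serializer in use.
type GroupedTags = Map<string, Rel[]> | Record<string, Rel[]>;

// Hypothetical helper: turns { cat: [...], dog: [...] } into
// [{ value: 'cat', _rel: [...] }, { value: 'dog', _rel: [...] }].
function toTagList(grouped: GroupedTags): TagEntry[] {
  const result: TagEntry[] = [];
  if (grouped instanceof Map) {
    grouped.forEach((rels, value) => result.push({ value, _rel: rels }));
  } else {
    for (const value of Object.keys(grouped)) {
      result.push({ value, _rel: grouped[value] });
    }
  }
  return result;
}

// Usage with data mirroring the sample graph from the question.
const grouped = new Map<string, Rel[]>([
  ['cat', [{ score: 2, source: 'google' }, { score: 3, source: 'user' }]],
  ['dog', [{ score: 5, source: 'user' }]],
]);
const tagList = toTagList(grouped);
```

Here `tagList` holds one entry per tag in the same `value`/`_rel` form as the second query's output, in the order the keys appear in the input. Whether this is preferable to the server-side `unfold()`/`project()` step mostly depends on where you would rather keep the shaping logic.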