Tags: neo4j, gremlin, amazon-neptune, gremlin-server

How can this type of Neo4j Cypher query be written as a Gremlin query?


MATCH (m:movie)-[has_directed_by:has_directed]->(director:Director)
OPTIONAL MATCH (director)-[has_name:has_name]->(directorName:DirectorName)
OPTIONAL MATCH (director)-[has_type:has_type]->(directionType:DirectionType)
OPTIONAL MATCH (director)-[has_language:has_language]->(directionLanguage:DirectionLangugaue)
OPTIONAL MATCH (director)-[has_script:has_script]->(script:Script)
OPTIONAL MATCH (m)-[has_songs:has_songs]->(songs:Songs)
WITH m, has_directed_by, director, has_name, directorName, has_type, directionType,
     has_language, directionLanguage, has_script, script, has_songs, songs
ORDER BY m.id ASC
RETURN DISTINCT m, collect(has_directed_by), collect(director), collect(has_name),
       collect(directorName), collect(has_type), collect(directionType),
       collect(has_language), collect(directionLanguage), collect(has_script),
       collect(script), collect(has_songs), collect(songs);

How can a complex Cypher query like this be written in Gremlin? In particular, how can 'm' and 'director' be referred to again and again later in the Gremlin traversal?

Example:

g.V().
  hasLabel('movie').as('m').
  outE('has_directed').
  inV().
  hasLabel('Director').as('director')

However, I need to refer back to 'm' and 'director' in the later steps of the traversal.


Solution

  • So, no pun intended, for conversions like this it's best to take it one step at a time. The first line is roughly equivalent to:

    g.V().
      hasLabel('movie').as('m').
      outE('has_directed').
      inV().
      hasLabel('Director').as('director')
    

    However, it's often possible to use more idiomatic Gremlin that avoids a lot of as steps using an approach built around a step such as project. You only need the outE and inV steps if you need to retrieve properties from the edge. Otherwise just out will suffice.
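
    As a minimal sketch of both points, using the labels from the question (untested against your data): a label set with as can be read back later with select, and out replaces the outE/inV pair when the edge itself is not needed.

    ```groovy
    // Return the labelled movie and director together as a map.
    g.V().
      hasLabel('movie').as('m').
      out('has_directed').
      hasLabel('Director').as('director').
      select('m','director')

    // Jump back to the labelled movie and continue the traversal from there.
    g.V().
      hasLabel('movie').as('m').
      out('has_directed').
      hasLabel('Director').as('director').
      select('m').
      out('has_songs')
    ```

    The second traversal only returns song vertices for movies that have at least one director, because every traverser has to pass through the has_directed edge first.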

    Gremlin also has a match step that would allow you to convert the Cypher query in a similar, declarative style, but I prefer to use other Gremlin steps and avoid match in most cases.
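
    For comparison, a rough sketch of the first line of the Cypher query written with match might look like this. It assumes the same labels as the question and is shown only to illustrate the declarative style.

    ```groovy
    // Each pattern starts from a labelled variable, much like a Cypher MATCH clause.
    g.V().
      match(
        __.as('m').hasLabel('movie'),
        __.as('m').out('has_directed').as('director'),
        __.as('director').hasLabel('Director')).
      select('m','director')
    ```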

    Gremlin has an optional step, and there are also the coalesce and choose steps, which allow you to write an IF...THEN...ELSE style of query that can play the role of OPTIONAL MATCH.
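
    A small sketch of how optional and choose behave, assuming the Director label and has_name edge from the question (the 'no name' default is just an illustrative placeholder):

    ```groovy
    // optional: emits the name vertex when one exists,
    // otherwise passes the director vertex through unchanged.
    g.V().
      hasLabel('Director').
      optional(out('has_name'))

    // choose: IF the director has a has_name edge THEN follow it,
    // ELSE emit a constant default value.
    g.V().
      hasLabel('Director').
      choose(out('has_name'),
             out('has_name'),
             constant('no name'))
    ```

    Note that optional never produces a null the way OPTIONAL MATCH does. When nothing matches you get the incoming element back, so the results need to be interpreted accordingly.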

    The Gremlin equivalent of DISTINCT is the dedup step.

    One way in Gremlin to combine a selection of optional things together is to use a union step.

    The fold step in Gremlin can be used in a similar way to the collect function in Cypher.
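
    The dedup, union and fold steps can be combined. The sketch below assumes the edge labels from the question and gathers whatever optional neighbors each director has into one list per director:

    ```groovy
    g.V().
      hasLabel('Director').
      local(
        union(out('has_name'),      // each branch simply yields nothing
              out('has_type'),      // when the edge is absent
              out('has_language'),
              out('has_script')).
        dedup().                    // like DISTINCT
        fold())                     // like collect(); an empty list if no branch matched
    ```

    The local step is there so that the fold happens once per director rather than once for the whole traversal.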

    As I do not have your data set, I used the air-routes data set to demonstrate the equivalent of an OPTIONAL MATCH essentially using a fold step to generate an empty list when there are no results.

    gremlin>   g.V('0','44').
    ......1>     project('start','neighbors').
    ......2>       by(id).
    ......3>       by(out().fold())
    
    ==>[start:0,neighbors:[]]
    ==>[start:44,neighbors:[v[8],v[13],v[20],v[31]]]  
    

    As mentioned above the fold step is similar to the Cypher collect function. In the example above, when there is no match, the neighbors are represented by an empty list. You can have as many sub parts (key name and by modulator) for a project step as needed to represent all of your optional cases.

    There are of course many ways to write the query depending on how you wish to represent the case where nothing was found. If you prefer to have a default value rather than an empty list you can use a coalesce step.

    gremlin>  g.V('0','44').
    ......1>     project('start','neighbors').
    ......2>       by(id).
    ......3>       by(coalesce(out(),constant('None')))
    
    ==>[start:0,neighbors:None]
    ==>[start:44,neighbors:v[8]] 
    

    Using a similar pattern with your data set, the basic building blocks might be:

    g.V().
      hasLabel('movie').
      project('movie','director').
        by(valueMap().with(WithOptions.tokens)).
        by(out('has_directed').valueMap().with(WithOptions.tokens).fold())
    

    The valueMap step will return all of the properties for a vertex (or edge). I added that to my example to simulate what Cypher will do. By default (without a step like valueMap) Gremlin will just return the vertex without its properties in many cases. The with tells Gremlin to also include the ID and label in the results. This is equivalent to the deprecated form valueMap(true).
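
    Extending that pattern, one possible shape for the whole query is sketched below. It is untested against your data and rests on a few assumptions: the project keys ('directors', 'name', 'songs' and so on) are names I made up, the vertex labels at the far end of each optional edge are not checked, 'id' is treated as an ordinary property on the movie, and the edges themselves are not returned (you would need outE and inV for that).

    ```groovy
    g.V().
      hasLabel('movie').
      // The first MATCH is not optional, so keep only movies with a director.
      where(out('has_directed').hasLabel('Director')).
      // Assumes 'id' is a property; use by(id) to sort on the element ID instead.
      order().by('id').
      project('movie','directors','songs').
        by(valueMap().with(WithOptions.tokens)).
        by(out('has_directed').hasLabel('Director').
             // One nested map per director; each optional part folds to a list.
             project('director','name','type','language','script').
               by(valueMap().with(WithOptions.tokens)).
               by(out('has_name').valueMap().with(WithOptions.tokens).fold()).
               by(out('has_type').valueMap().with(WithOptions.tokens).fold()).
               by(out('has_language').valueMap().with(WithOptions.tokens).fold()).
               by(out('has_script').valueMap().with(WithOptions.tokens).fold()).
             fold()).
        by(out('has_songs').valueMap().with(WithOptions.tokens).fold())
    ```

    Because each by modulator starts from the current movie (or the current director in the nested project), there is no need for as and select here. The result is one map per movie with the directors nested inside it, which is a different shape from the flat collect lists the Cypher query returns.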