How to match unlinked vertices in Gremlin in a connected graph


I have a simple graph with people nodes (say, over 10K of them) and rule nodes (a handful of rules) that are NOT yet linked: no edges exist between these two types of nodes. What I want to do is create edges between them by matching property values in Gremlin.

People nodes have four properties: name, age, state, registered. Rule nodes have three properties: age, state, registered. Different rule nodes have different property values, such as (registered=true, state=WA, age > 22).

How do I write a Gremlin query that links all the people nodes that have matching property values with each of the rule nodes?


Solution

  • A sample graph, given the additional information in the comments, could look like this:

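    // In-memory sample: three people and two rules. Each rule lists several
    // states and a minimum age; the registered property is left out here.
    g = TinkerGraph.open().traversal()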
    g.addV('person').
        property('name','daniel').
        property('age',37).
        property('state','AZ').
      addV('person').
        property('name','howell').
        property('age',25).
        property('state','WA').
      addV('person').
        property('name','john').
        property('age',19).
        property('state','NV').
      addV('rule').
        property('state','WA').
        property('state','CA').
        property('state','OR').
        property('minimumAge',22).
      addV('rule').
        property('state','AZ').
        property('state','FL').
        property('state','TX').
        property('minimumAge',19).
      iterate()
    

    I'm not sure whether the age > 22 condition was intentional. If you are only looking for equality matches on the model from the question (people vertices and rule vertices sharing the age, state, and registered keys), your query would be:

    g.V().hasLabel('people').as('person').
      V().hasLabel('rule').
        where(eq('person')).
          by(values('age','state','registered').fold()).
      addE('hasRule').
        from('person')
    

    To match the person and rule vertices in the sample graph above, where each rule has a minimum age and a list of states, you would do something similar to this (adding more rule conditions as you need them):

    g.V().hasLabel('person').as('person').
      V().hasLabel('rule').
        where(lte('person')).by('minimumAge').by('age').
        filter(values('state').where(eq('person')).by().by('state')).
      addE('hasRule').
        from('person').iterate()
    

    This query would add two edges in the sample graph. There's no rule for john: he is too young for the first rule, and his state, NV, is not listed by either rule:

    gremlin> g.V().outE().inV().path().by('name').by(label).by(valueMap())
    ==>[daniel,hasRule,[minimumAge:[19],state:[AZ,FL,TX]]]
    ==>[howell,hasRule,[minimumAge:[22],state:[WA,CA,OR]]]
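
As a sanity check on the result above, the two filters of the second traversal can be modeled in plain Python. This is only a sketch of the matching logic over the sample data; it does not use Gremlin or any TinkerPop driver, and the names `people`, `rules`, `matches`, and `edges` are illustrative assumptions rather than anything from the original answer.

```python
# Plain-Python model of the sample graph; no Gremlin server or driver is used.
people = [
    {"name": "daniel", "age": 37, "state": "AZ"},
    {"name": "howell", "age": 25, "state": "WA"},
    {"name": "john", "age": 19, "state": "NV"},
]
rules = [
    {"states": ["WA", "CA", "OR"], "minimumAge": 22},
    {"states": ["AZ", "FL", "TX"], "minimumAge": 19},
]


def matches(person, rule):
    """Mirror the traversal's two filters: age threshold and state membership."""
    # Corresponds to where(lte('person')).by('minimumAge').by('age')
    old_enough = rule["minimumAge"] <= person["age"]
    # Corresponds to filter(values('state').where(eq('person')).by().by('state'))
    state_listed = person["state"] in rule["states"]
    return old_enough and state_listed


# One (person name, rule index) pair per edge the traversal would create.
edges = [
    (person["name"], index)
    for person in people
    for index, rule in enumerate(rules)
    if matches(person, rule)
]
print(edges)
```

Adding another condition to `matches` (for example, the registered flag from the question) corresponds to adding another filter step to the traversal.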