gremlin amazon-neptune

Neptune Gremlin query to recommend users based on common AND positive rated content


The Gremlin query below returns a given user's friends in descending order of how many of the same movies we have both rated. I would like to count only the movies where rated.score > 5 (on a 10-point scale), so that the results are sorted by how many of the same movies we have both rated positively. Thanks in advance!

  g.V('a2661f57-8aa7-4e5c-9c89-55cf9bxxxxx').as('self').
  sideEffect(out('rated').store('movies')). 
  out('friended').
  group(). 
    by(). 
    by(out('rated').where(within('movies')).count()). 
  order(local). 
    by(values,desc). 
    unfold().
  select(keys).
  project('id','label','username').
    by(id).
    by(label).
    by('username')

Solution

  • I may be missing some context about the "score". Is it just the count of movies you have both rated, or is it a property on the 'rated' edge?

    If it is the count, add a where() step to your query so that only friends with a count greater than 5 are returned:

    where(select(values).where(is(gt(5)))).

    g.V('a2661f57-8aa7-4e5c-9c89-55cf9bxxxxx').as('self').
      sideEffect(out('rated').store('movies')). 
      out('friended').
      group(). 
        by(). 
        by(out('rated').where(within('movies')).count()). 
      order(local). 
        by(values,desc). 
        unfold().
      where(select(values).where(is(gt(5)))).   //added filter
      select(keys).
      project('id','label','username').
        by(id).
        by(label).
        by('username')
    

    If you want to filter on a score stored on the 'rated' edge, the filter has to go earlier in the query, inside the by() modulator that counts the common movies, so that only edges with a score above 5 are followed:

    outE('rated').where(values('score').is(gt(5))).inV()

    g.V('a2661f57-8aa7-4e5c-9c89-55cf9bxxxxx').as('self').
      sideEffect(out('rated').store('movies')). 
      out('friended').
      group(). 
        by(). 
        by(outE('rated').
             where(values('score').is(gt(5))).   // filter on score
           inV().
             where(within('movies')).count()). 
      order(local). 
        by(values,desc). 
        unfold().
      select(keys).
      project('id','label','username').
        by(id).
        by(label).
        by('username')
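
    Note that the query above applies the score filter only to each friend's ratings; 'movies' still holds every movie the starting user rated, whatever the score. If both sides need to be positive, one possible sketch applies the same filter inside the sideEffect() as well. It assumes 'score' is a numeric property on the 'rated' edge (as rated.score in the question suggests) and has not been run against Neptune:

    g.V('a2661f57-8aa7-4e5c-9c89-55cf9bxxxxx').as('self').
      sideEffect(outE('rated').has('score', gt(5)).   // assumption: numeric 'score' on the edge
                 inV().store('movies')).              // only movies 'self' rated above 5
      out('friended').
      group().
        by().
        by(outE('rated').has('score', gt(5)).         // only the friend's ratings above 5
           inV().where(within('movies')).count()).
      order(local).
        by(values,desc).
      unfold().
      select(keys).
      project('id','label','username').
        by(id).
        by(label).
        by('username')

    Here has('score', gt(5)) is shorthand for where(values('score').is(gt(5))); either form should give the same result.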