Tags: titan, gremlin, tinkerpop, tinkerpop3, ibm-graph

How can I retrieve and paginate a user's feed in IBM Graph (TitanDB) using Gremlin/TinkerPop?


I have a very basic news feed modelled in IBM Graph (TitanDB backed by Cassandra) as shown below:

[Image: diagram of the news feed graph model]

I am trying to write a query that does the following:

  1. Start at vertex USER: John.Smith
  2. Get the 15 most recent posts from the user's friends, combined with his own.
  3. Check whether USER: John.Smith likes each of those posts, and return the result as a simple is_liked boolean property on each post.

There are a couple of additional requirements for this query:

  • In each returned post, the properties of the posting USER should also be returned. For the sake of this question, only the avatar property is required.
  • I need to be able to paginate these results, i.e. once I have retrieved the top 15 posts, I need to be able to return the next 15, then the next, and so on.

I have no problem getting the user's friends and their LATEST_POST vertices:

g.V().hasLabel("USER").has("userid", "John.Smith").both("FRIEND").out("LATEST_POST");

I have read the TinkerPop documentation but am still lost as to how to build on this query to meet my requirements.

Also, any commentary on this approach in terms of performance, data modelling, schema or indexing would be extremely helpful. In particular, should I expect this approach to retrieve feeds in real time at scale?

Thanks in advance.


Solution

  • For the given graph schema, the query would be something like this:

    // page (zero-based) and pageSize (e.g. 15) are assumed to be bound beforehand
    g.V().has("USER", "userid", "John.Smith").as("john").
      union(identity(), both("FRIEND")).as("user").
      out("LATEST_POST").
      flatMap(emit().repeat(out("PREVIOUS_POST")).range(page * pageSize, (page + 1) * pageSize)).as("post").
      choose(__.in("LIKED").where(eq("john")), constant(true), constant(false)).as("likedByJohn").
      select("user", "post", "likedByJohn")
    

    Note, however, that Alaa has already pointed out that this approach won't scale, and explained how you could improve your graph schema.
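
One caveat worth noting about the query above: because range() sits inside flatMap(), it is applied to each user's post chain separately, so a page can contain up to pageSize posts per user rather than pageSize posts overall. Below is a minimal sketch of a variant that pages the merged feed instead. It assumes that each post vertex carries a sortable timestamp property (a hypothetical name; the question does not show the post properties), that every user vertex has an avatar property, that g is the traversal source used above, and that page and pageSize are bound by the caller.

```groovy
// Sketch only, not a definitive implementation.
// Assumptions: "timestamp" is a hypothetical property on post vertices;
// page is zero-based; every USER vertex has an "avatar" property.
page = 0
pageSize = 15

g.V().has("USER", "userid", "John.Smith").as("john").
  union(identity(), both("FRIEND")).as("user").           // John plus his friends
  out("LATEST_POST").
  flatMap(emit().repeat(out("PREVIOUS_POST"))).as("post"). // walk each post chain fully
  order().by("timestamp", decr).                           // newest first across all users
  range(page * pageSize, (page + 1) * pageSize).           // slice the merged feed
  choose(__.in("LIKED").where(eq("john")), constant(true), constant(false)).as("is_liked").
  select("user", "post", "is_liked").
    by("avatar").                                          // only the poster's avatar
    by(valueMap()).                                        // all properties of the post
    by()                                                   // the boolean as-is
```

This variant still walks every post of the user and of all friends before sorting, so the scaling concern mentioned above applies to it as well. On newer TinkerPop releases, decr is spelled desc.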