I have a very basic news feed modelled in IBM Graph (TitanDB backed by Cassandra) as shown below:
I am trying to write a query that does the following:
- Start at USER: John.Smith.
- Get the posts of his FRIENDS combined with his own.
- Determine whether USER: John.Smith likes any of those posts and return this as a simple is_liked boolean property for each post.
There are a couple of pre-requisites for this query:
- The USER should also be returned. For the sake of this question, only the avatar property is required.
I have no problem getting the user's friends and their LATEST_POSTS:
g.V().hasLabel("USER").has("userid", "John.Smith").both("FRIEND").out("LATEST_POST");
I have read the TinkerPop documentation but am still lost as to how to build on this query to meet my requirements.
Also, any commentary on this approach in terms of performance, data modelling, schema or indexing would be extremely helpful. For example, should I expect this approach to be able to retrieve feeds in real time at scale?
Thanks in advance.
For the given graph schema, the query would be something like this (page and pageSize are variables you need to supply):
g.V().has("USER", "userid", "John.Smith").as("john").
union(identity(), both("FRIEND")).as("user").
out("LATEST_POST").
flatMap(emit().repeat(out("PREVIOUS_POST")).range(page * pageSize, (page + 1) * pageSize)).as("post").
choose(__.in("LIKED").where(eq("john")), constant(true), constant(false)).as("likedByJohn").
select("user", "post", "likedByJohn")
But Alaa already pointed out that this approach won't scale, and showed how you could improve your graph schema.
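To make it easier to see what the traversal above returns, here is a small plain-Python model of it. This is not Gremlin and says nothing about how IBM Graph executes the query; the Post and User classes, the feed function, and the sample data are all made up for illustration. One thing it makes visible: as I read the traversal, range() sits inside flatMap(), so the page window is applied to each user's chain of posts separately rather than to the combined feed.

```python
from dataclasses import dataclass, field
from itertools import islice
from typing import Optional


@dataclass
class Post:
    post_id: str
    previous: Optional["Post"] = None            # stands in for the PREVIOUS_POST edge
    liked_by: set = field(default_factory=set)   # userids with a LIKED edge to this post


@dataclass
class User:
    userid: str
    avatar: str
    latest_post: Optional[Post] = None           # stands in for the LATEST_POST edge
    friends: list = field(default_factory=list)  # stands in for FRIEND edges


def post_chain(user):
    """Yield a user's posts newest first, like emit().repeat(out("PREVIOUS_POST"))."""
    post = user.latest_post
    while post is not None:
        yield post
        post = post.previous


def feed(viewer, page, page_size):
    """Return (userid, avatar, post_id, is_liked) rows for the viewer and their friends."""
    rows = []
    # union(identity(), both("FRIEND")): the viewer first, then each friend
    for user in [viewer] + viewer.friends:
        # range(page * pageSize, (page + 1) * pageSize), applied per user
        window = islice(post_chain(user), page * page_size, (page + 1) * page_size)
        for post in window:
            # choose(__.in("LIKED").where(eq("john")), true, false)
            is_liked = viewer.userid in post.liked_by
            rows.append((user.userid, user.avatar, post.post_id, is_liked))
    return rows


# Hypothetical sample data: John has two posts, his friend Alice has three.
j1 = Post("john-1")
j2 = Post("john-2", previous=j1)
a1 = Post("alice-1", liked_by={"John.Smith"})
a2 = Post("alice-2", previous=a1)
a3 = Post("alice-3", previous=a2, liked_by={"John.Smith"})
alice = User("Alice", "alice.png", latest_post=a3)
john = User("John.Smith", "john.png", latest_post=j2, friends=[alice])

for row in feed(john, page=0, page_size=2):
    print(row)
```

With this sample data, feed(john, 0, 2) gives John's two newest posts followed by Alice's two newest, and feed(john, 1, 2) gives only alice-1, because John's chain is already exhausted on the first page.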