I am using TitanDB (IBM Graph) rather than Neo4j, but because I am new to graph databases I have, for now, modelled a basic news feed using the schema suggested in the Neo4j documentation:
http://neo4j.com/docs/snapshot/cypher-cookbook-newsfeed.html
Having fully read all the documentation, I am aware of several key differences between the way these databases operate.
In the model described in the link, each of a user's posts is stored as a vertex, and the post vertices are connected to each other by edges, forming a long list of status updates emanating out from each user vertex.
While this makes sense given Neo4j's capabilities, I am aware that TitanDB has vertex-centric indexing abilities, described in detail here:
http://s3.thinkaurelius.com/docs/titan/1.0.0/indexes.html
Right now I am trying to ensure that querying for a given user's feed is optimal for a large graph with lots of users and lots of permanently kept posts or status updates. Therefore, I would like to avoid having to traverse all the posts of all of a user's friends, then order and limit them, just to get the first 15 items of a user's feed.
As such, I am unsure whether the model described in the Neo4j documentation is really the best one to use with TitanDB, so my question is as follows: should I instead connect each `post` vertex directly to the `user` who posted it, and use a vertex-centric index on the `time` property of each `posted` edge?
I'm really after some general advice on modelling, indexing and retrieving a basic news feed in TitanDB. Thanks in advance.
The basic schema doesn't seem like a bad approach, though it's difficult to make a good judgement based on this one use case.
The simplest approach to solving your indexing problem is probably to denormalize a bit - store the user id as a property on the `post` vertex and create an index on the `[user, timestamp]` pair.
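To illustrate what an index on the `[user, timestamp]` pair buys you, here is a minimal in-memory sketch in plain Python. It is not Titan or IBM Graph API code: `PostIndex` and `feed` are hypothetical names, and the dictionary of sorted lists only stands in for the index. The point is that each friend's posts can be read newest-first, so the feed is a lazy merge that stops after 15 items instead of a full scan followed by a sort.

```python
import bisect
import heapq
from collections import defaultdict
from itertools import islice


class PostIndex:
    """Toy stand-in for an index on the (user, timestamp) pair."""

    def __init__(self):
        # user_id -> list of (timestamp, post_id), kept sorted ascending
        self._by_user = defaultdict(list)

    def add_post(self, user_id, timestamp, post_id):
        # Insert while preserving sort order, as an index would on write.
        bisect.insort(self._by_user[user_id], (timestamp, post_id))

    def newest_first(self, user_id):
        # Lazily walk one user's posts from newest to oldest.
        return reversed(self._by_user.get(user_id, []))


def feed(index, friend_ids, limit=15):
    # Merge the per-friend streams (each already newest-first) and
    # stop as soon as `limit` posts have been produced.
    streams = [index.newest_first(friend) for friend in friend_ids]
    merged = heapq.merge(*streams, reverse=True)
    return [post_id for _, post_id in islice(merged, limit)]


index = PostIndex()
index.add_post("alice", 100, "a1")
index.add_post("alice", 300, "a2")
index.add_post("bob", 200, "b1")
print(feed(index, ["alice", "bob"], limit=2))  # ['a2', 'b1']
```

The merge only pulls as many entries from each friend as it needs, which is the behaviour you would want from the real index when every user keeps a long history of posts.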
Vertex-centric indexes might help you, but not in the proposed model - you'd need to model `post` as an edge, not a vertex, which may make other traversals rather awkward. Furthermore, IBM Graph does not support vertex-centric indexes as of its current release.