I'm in the middle of prototyping a social network (using RoR 3) and decided to check out Neo4j. It looks great, but I have a question about scaling and performance in terms of design.
I've researched how Etsy puts together an activity feed (see http://www.slideshare.net/danmckinley/etsy-activity-feeds-architecture), and I understand how message queues can fan out activities (such as sharing a picture and making this activity available to your 500 or so friends in their news feeds). I also understand how news feeds can be cached (memcache) and how lookups can be performed against Redis.
All in all, it seems that to build a high-performance activity feed (and a social network in general) that scales well, the common pattern is to use sharding, horizontal scaling, memcache, RabbitMQ, Redis, MongoDB, InnoDB (MySQL), etc., all in an attempt to compensate for high volumes, disk reads, and so on. But this is quite a bit of overhead in terms of design.
Can Neo4j eliminate the need, at least early on, for such an arrangement? I mean, is it so fast that I don't need to set up a message queue for fan-outs and messaging, don't need to set up an "activities" cache for every action a user performs, and can use it to handle both ordering and storing messages? Can a news feed like Facebook's be created with such a system, or is the high-performance activity feed limited to basic status updates?
If those questions are too broad, let me ask it a different way: could I write Facebook or Twitter using Neo4j and eliminate the need for message queuing to fan out updates (instead I want to get a live stream of updates on the fly), memcache for news feeds, and cached activity feed objects? Or will I find myself doing the same thing, or even more, to handle hundreds of requests per second?
I ask because it would save quite a bit of time to use Neo4j if it can indeed handle high volumes without having to use the tricks Etsy, Twitter, and Facebook employ to maintain high performance.
Yes. In fact, it's been done already by Rene Pickhardt.
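To make the "live stream of updates on the fly" idea from the question concrete, here is a minimal sketch of building a feed at read time instead of fanning out on write. It is not the implementation referenced above and it does not use Neo4j at all: the `friends` and `activities` dictionaries are hypothetical in-memory stand-ins for a graph store, and `news_feed` is a name made up for this illustration. It assumes each user's activities are already kept newest-first, so the feed is just a lazy k-way merge of the friends' lists.

```python
import heapq
from itertools import islice

# Hypothetical stand-in for the graph: who follows whom.
friends = {
    "alice": ["bob", "carol"],
}

# Hypothetical per-user activity lists, each kept sorted newest-first.
# Every entry is (timestamp, author, text).
activities = {
    "bob": [(105, "bob", "shared a picture"), (101, "bob", "joined")],
    "carol": [(104, "carol", "posted a status"), (102, "carol", "liked a photo")],
    "dave": [(110, "dave", "is not a friend of alice")],
}


def news_feed(user, k):
    """Return the k newest activities from the user's friends, built at read time."""
    streams = [activities.get(friend, []) for friend in friends.get(user, [])]
    # heapq.merge is lazy; with reverse=True it expects every input to be
    # sorted in descending order of the key, which the lists above are.
    merged = heapq.merge(*streams, key=lambda entry: entry[0], reverse=True)
    # Only the first k merged entries are ever pulled from the streams.
    return list(islice(merged, k))


if __name__ == "__main__":
    for entry in news_feed("alice", 3):
        print(entry)
```

Nothing is written per follower when someone posts; the cost moves to the read, which touches one list per friend and stops after `k` entries. Whether that is fast enough at hundreds of requests per second depends on how cheaply the store can walk a user's friends and their most recent activities, so treat this as an illustration of the access pattern rather than a benchmark.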