database, database-design, complex-networks

Are there Database Systems more suited to Social Networks?


This question is inspired by the article "Why are Facebook, Digg, and Twitter so hard to scale?" on highscalability.com.

So what database systems (however obscure) are out there that would be able to handle this type of data better?


Solution

  • Having a database system whose data model is tailored to the data structure you are trying to represent is often advantageous. Social networks lend themselves very well to graph databases, such as AllegroGraph, Neo4j, etc.

    There is a good article at the Neo4j blog on how to represent social networks in a graph database, with the examples using Neo4j.

    The benefit of graph databases is that data is stored so that traversing the connections between entities is a very fast operation, which lets you traverse complex networks quickly. In current implementations of relational databases, these operations would typically be (at best) expensive joins.

    As with relational databases, graph databases still have a slight problem with scaling out to multiple hardware nodes. However, for social-network kinds of data the need for multiple hardware nodes should be much smaller with a graph database than with a relational database: a few billion nodes on a single machine is no problem.

    Scaling out to multiple hardware nodes is where key-value stores shine, since entities in a key-value store are completely isolated from each other. The problem is that nothing is isolated in a social network, so emulating the connections requires multiple queries to the database, one for each entity. This will be slow, especially for friend-of-a-friend kinds of queries, where each query discovers only one level of friends.

    Disclaimer: I am a member of the Neo4j team.
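
To make the friend-of-a-friend point above concrete, here is a rough Python sketch. It is only an illustration, not how Neo4j or any particular key-value store is implemented: the `FRIENDS` data, the `KeyValueStore` class, and both function names are hypothetical. The first function emulates the query against a key-value store and counts one query per entity looked up; the second walks an in-process adjacency structure, which is roughly the access pattern a graph traversal relies on.

```python
from collections import deque

# Hypothetical toy data: user -> set of direct friends.
FRIENDS = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice", "dave", "erin"},
    "dave": {"bob", "carol"},
    "erin": {"carol"},
}


class KeyValueStore:
    """Stand-in for a remote key-value store (an assumption for this sketch).

    Every get() is counted as one query, i.e. one network round trip.
    """

    def __init__(self, data):
        self._data = data
        self.queries = 0

    def get(self, key):
        self.queries += 1
        return self._data.get(key, set())


def friends_of_friends_kv(store, user):
    """Emulate friend-of-a-friend with one lookup per entity."""
    direct = store.get(user)
    result = set()
    for friend in direct:
        # Each friend's list is a separate query against the store.
        result |= store.get(friend)
    # Exclude the user and people who are already direct friends.
    return result - direct - {user}


def friends_at_depth(graph, start, depth):
    """Breadth-first traversal over adjacency data, returning the nodes
    whose shortest distance from start is exactly `depth`."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == depth:
            continue  # do not expand past the requested depth
        for neighbour in graph.get(node, ()):
            if neighbour not in distance:
                distance[neighbour] = distance[node] + 1
                queue.append(neighbour)
    return {n for n, d in distance.items() if d == depth}


store = KeyValueStore(FRIENDS)
kv_result = friends_of_friends_kv(store, "alice")
graph_result = friends_at_depth(FRIENDS, "alice", 2)
print(sorted(kv_result), "using", store.queries, "queries")
print(sorted(graph_result))
```

In this toy case both approaches return the same people, but the key-value version needs one query for the user plus one per direct friend, and that count grows with every additional level of friends you want to reach.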