I'm trying to design a system similar to Twitter's timeline, but I can't wrap my head around how to get updates from so many followers while remaining efficient. Let's say I'm following 1000 people on Twitter. When I go to my feed, how does it know which tweets to show me? This is what I'm thinking, but it seems extremely inefficient and unlikely:
You have 10,000 friends.
In a for loop, loop through each friend, getting their latest
status updates since their last update.
But that just seems ridiculous to loop through 10,000 friends. I can't imagine how else they'd do it though. Or would it be something like:
Someone I am following posted a tweet. That tweet is inserted in
an array containing the tweets of all people I am following.
But then that would seem weird, if I followed someone new who has 20,000 tweets, then 20,000 tweets would be inserted in my array, and if that person has millions of followers, then there are a million X 20,000 copies of the same set of tweets. So that also seems unlikely.
Anyone have any ideas how they could possibly do it?
I advice you to check the twissandra project they have implemented all the basic functionality of twitter using cassandra , a nosql database. It is said twitter is no longer using it for tweets .
The old implementation can be consulted here