Search code examples
phptwitter

PHP fetch all Twitter Followers and compare them to friends


I am looking for scalable way to do the following:

  • User login
  • Fetch all Friends from Twitter
  • Fetch all Followers from Twitter
  • Display all Friends which aren't Followers

The Problem: How can this be done in a scalable way? An user can have up to 2 million friends or followers. Currently I'm storing both inside an SQLite table and compare them through a loop. When the user comes back the table is cleared and process starts again.

This works fine on 100 - 1000 Friends, but will be tricky with 500000 Friends. I can't cache the lists because they can change every moment.

Does anyone know a good way to handle such big amount of data?


Solution

  • I don't know what your database looks like, but this is how I would set it up.

    CREATE TABLE twitter_users (
        user_id INTEGER PRIMARY KEY NOT NULL,
        screen_name VARCHAR(20) NOT NULL
    );
    
    CREATE TABLE friends (
        friend_id INTEGER PRIMARY KEY NOT NULL
    );
    
    CREATE TABLE followers (
        follower_id INTEGER PRIMARY KEY NOT NULL
    );
    

    Then you can use this SQL to get the friends who are not followers.

    SELECT friend_id, screen_name
    FROM friends
    LEFT JOIN followers ON follower_id = friend_id
    LEFT JOIN twitter_users ON user_id = friend_id
    WHERE follower_id IS NULL
    

    If the screen name is NULL it means they are not in your twitter_users table. You can look up the missing users and store them for later. Screen names can change so you might need to update the table periodically.

    Use the friends/ids and followers/ids APIs to get a list of friend and follower ids 5,000 at a time. Use the users/lookup API to get up to 100 screen names. If a user has 2,000,000 friends it will take 400 api calls to get the list of ids so you should still cache the list at least for popular users.