c# twitter linq-to-twitter

Most efficient way to find inactive Twitter followers using LINQ to Twitter


The number of queries allowed by the Twitter API is limited. On the other hand, the definition of "inactive user" may call for different algorithms, each with a different impact on the number of requests.

I'm looking for the most efficient way, in terms of both the number of queries and the quality of the "inactivity" measure, to find inactive followers using LINQ to Twitter.


Solution

  • As you must have learned by now, rate-limits and count restrictions prevent a lot of operations on the Twitter API. With these constraints, most answers will be less than adequate, but here's a general approach I would use:

    1. Get the list of all follower IDs, using the Listing Followers query. Make sure you max out Count at 5000 to reduce the number of queries. If you have users with hundreds of thousands (or even millions) of followers, this isn't optimal, but is still the most efficient option.
    2. With that list, you can do Querying User Details queries. The situation here is even worse because the max number of comma-separated user IDs is 100. Here you might consider keeping track of UserIDs to classify them by activity/date of last scan to avoid re-visiting users that you already know are inactive.
    3. That last query will give you User entities. Each User entity has a Status property for the user's most recent tweet. One idea might be to examine the CreatedAt date to determine whether to query that user any further. For example, if that last tweet was N months ago, the user is probably inactive.
    4. Use ApplicationOnlyAuthorizer when you can because it can give you higher rate limits for some queries.
    5. Your rate limit windows are 15 minutes. Create pipelines: run one query type until it reaches its limit, then queue the results for the next task in the chain. Let the next task use its limit and keep going from there.
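
The bookkeeping in steps 2, 3, and 5 can be sketched without touching the network. This is a minimal sketch, not LINQ to Twitter's API: `FollowerSnapshot`, `InactiveFollowerSketch`, and all method names are hypothetical, and it assumes .NET 6 or later (for `Chunk`). It also assumes that a follower with no visible last tweet counts as inactive, which you may want to treat as "unknown" instead.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the two User fields used here: the user ID and
// the CreatedAt date of the Status (most recent tweet), if there is one.
public record FollowerSnapshot(ulong UserId, DateTime? LastTweetUtc);

public static class InactiveFollowerSketch
{
    // Step 2: group follower IDs into comma-separated strings of at most
    // batchSize IDs, one string per user-details query.
    public static List<string> ToLookupBatches(IEnumerable<ulong> followerIds, int batchSize = 100) =>
        followerIds.Chunk(batchSize)
                   .Select(batch => string.Join(",", batch))
                   .ToList();

    // Step 3: treat a follower as inactive when the last tweet is older than
    // the cutoff. Assumption: no visible last tweet also counts as inactive.
    public static bool IsInactive(FollowerSnapshot follower, DateTime nowUtc, int months) =>
        follower.LastTweetUtc is null || follower.LastTweetUtc < nowUtc.AddMonths(-months);

    // Step 5: number of 15-minute windows one query type needs, given how many
    // items fit in one request and how many requests one window allows.
    public static int WindowsNeeded(int itemCount, int itemsPerRequest, int requestsPerWindow)
    {
        int requests = (itemCount + itemsPerRequest - 1) / itemsPerRequest; // ceiling division
        return (requests + requestsPerWindow - 1) / requestsPerWindow;      // ceiling division
    }
}
```

As a rough budget for 250,000 followers: `WindowsNeeded(250000, 5000, 15)` returns 4 for the ID listing and `WindowsNeeded(250000, 100, 300)` returns 9 for the lookups. The per-window limits of 15 and 300 are placeholder values; check the current limits for your authorization type before relying on them.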

    One thing to settle is how you define "active" and "inactive", because there are edge cases. For example, what if you have folks who don't tweet much but do DM, favorite, or RT? You'll have to query a user's activity to pull out that extra data. Hopefully, this will either validate what you already know or maybe add an idea or two that could be useful.
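
One way to fold such extra signals into the definition is to take the most recent timestamp across whatever activity you were able to retrieve, then apply the same cutoff to it. A minimal sketch: `ActivitySignals` and `LatestActivity` are hypothetical names, and how each timestamp is obtained is deliberately left out.

```csharp
using System;

public static class ActivitySignals
{
    // Returns the most recent of the supplied timestamps (last tweet, last
    // favorite, last retweet, ...), or null when none of them is known.
    public static DateTime? LatestActivity(params DateTime?[] signals)
    {
        DateTime? latest = null;
        foreach (DateTime? signal in signals)
        {
            if (signal.HasValue && (latest is null || signal.Value > latest.Value))
                latest = signal;
        }
        return latest;
    }
}
```

A null result means no signal was available, which is worth keeping distinct from "inactive" if the extra queries were skipped to save requests.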

    Note: Consider Gnip if you're willing to pay and avoid the rate limits.