Search code examples
mysqlredisfeed

Using Redis as a Key/Value store for activity stream


I am in the process of creating a simple activity stream for my app.

The current technology layer and logic is as follows:

** All data relating to an activity is stored in MYSQL and an array of all activity id's are kept in Redis for every user.**

  1. User performs action and activity is directly stored in an 'activities' table in MYSQL and a unique 'activity_id' is returned.
  2. An array of this user's 'followers' is retrieved from the database and for each follower I push this new activity_id into their list in Redis.

When a user views their stream I retrieve the array of activity id's from redis based on their userid. I then perform a simple MYSQL WHERE IN($ids) query to get the actual activity data for all these activity id's.

This kind of setup should I believe be quite scaleable as the queries will always be very simple IN queries. However it presents several problems.

  1. Removing a Follower - If a user stops following someone we need to remove all activity_id's that correspond with that user from their Redis list. This requires looping through all ID's in the Redis list and removing the ones that correspond to the removed user. This strikes me as quite unelegant, is there a better way of managing this?
  2. 'archiving' - I would like to keep the Redis lists to a length of say 1000 activity_id's as a maximum as well as frequently prune old data from the MYSQL activities table to prevent it from growing to an unmanageable size. Obviously this can be achieved by removing old id's from the users stream list when we add a new one. However, I am unsure how to go about archiving this data so that users can view very old activity data should they choose to. What would be the best way to do this? Or am I simply better off enforcing this limit completely and preventing users from viewing very old activity data?

To summarise: what I would really like to know is if my current setup/logic is a good/bad idea. Do I need a total rethink? If so what are your recommended models? If you feel all is okay, how should I go about addressing the two issues above? I realise this question is quite broad and all answers will be opinion based, but that is exactly what I am looking for. Well formed opinions.

Many thanks in advance.


Solution

  • 1 doesn't seem so difficult to perform (no looping):

    delete Redis from Redis
     join activities on Redis.activity_id = activities.id
                    and activities.user_id = 2 
                    and Redis.user_id = 1
    ;
    

    2 I'm not really sure about archiving. You could create archive tables every period and move old activities from the main table to an archive table periodically. Seems like a single properly normalized activity table ought to be able to get pretty big though. (make sure any "large" activity stores the activity data in a separate table, the main activity table should be "narrow" since it's expected to have a lot of entries)