architecture geolocation social-networking feed

Location-Based Segmentation of a Social Network


I'm building a location-based social networking app (mobile app frontend, Django/Python backend) with a very narrow focus (not trying to recreate FB). I'm using the Google Places API for location searching. Once a user finds a location via Google Places, I allow them to perform an action on the location, which is then saved to our server.

I've implemented the searching, user actions, and location storage successfully to this point. I've also implemented a news feed. I want that news feed to be populated with stories/actions that are geographically proximal to the user, noting that the user's location can change (but will probably stay within the same city) with each use of the app. I'm looking for help dynamically segmenting my social network based on location. Here is what I've thought of/come across so far:

  1. The obvious but very expensive way to do this would be sorting all the stories in a global news feed by distance from the user's current location and then taking the nearest ones off the top.

  2. Or how about creating regions and splitting the regions as they grow (more actions, users, active locations). Then when the user polls for a news feed they get the feed of the region they are closest to. The splitting of regions would be something that could happen on a scheduled basis via cron job and would only happen once a region was active enough to be separated. Splitting might also be expensive though if we have to change the region references for each action/location/story each time a region is split.

  3. A spin-off of number 1 would be sorting relevant locations based on the user's current coordinates but then caching the order. That way, the next time a user within a reasonable distance of those same coordinates wants to generate a feed, it's a far less expensive process.

  4. Keep it simple: just define hard regions and allow the user to select and change their region. So, for example, allow the user to select Chicago, IL as a region and only see actions/stories inside that region. My concern here would be that they would miss out on relevant stories from just outside their region (e.g. Gary, IN).
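
To make options 1 and 3 concrete, here is a minimal sketch in plain Python. It assumes stories are dicts with `lat` and `lon` keys, and the function names, the 0.05 degree cell size, and the in-process dict cache are all illustrative choices, not part of any existing API.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_stories(stories, lat, lon, limit=20):
    """Option 1: sort every story by distance to the user on each request."""
    by_distance = sorted(
        stories, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"])
    )
    return by_distance[:limit]


def grid_cell(lat, lon, cell_deg=0.05):
    """Snap coordinates to a coarse grid cell so nearby users share a key."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


# Hypothetical in-process cache; a shared cache would be needed in production.
_feed_cache = {}


def cached_feed(stories, lat, lon, limit=20):
    """Option 3: compute the sorted feed once per grid cell and reuse it."""
    key = grid_cell(lat, lon)
    if key not in _feed_cache:
        _feed_cache[key] = nearest_stories(stories, lat, lon, limit)
    return _feed_cache[key]
```

Note that the cached ordering is computed from the first requester's exact position within the cell, and this sketch never invalidates entries when new stories arrive, so a real version would need an expiry policy.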

How do apps like Secret or Whisper or even Facebook solve this problem? For those of you with experience building location-aware social apps, what approaches have you taken? Please feel free to link any relevant or helpful answers.


Solution

  • Your app checks the user's location and fetches the feed for their current locality. Here is the solution I am proposing:

    1. Use reverse geocoding to find the user's locality. Every time they log in or use your app, fetch the locality or city they are in and offer them the feed items that belong to that same locality or city. For this to work, you need to save the city in your database for every feed item.
    2. Reverse geocode and save the locality every time a new feed item is inserted into your database. You can then group feed items by location, sort them in ascending order by time, and fetch them.
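
The two steps above could be sketched roughly as follows. This is an in-memory illustration only: `LocalityFeed` and `fake_reverse_geocode` are hypothetical names, and the stub stands in for whatever real reverse geocoding service you call. In a Django app the locality would be a column on the story model instead of a dict key.

```python
from collections import defaultdict


def fake_reverse_geocode(lat, lon):
    """Stand-in for a real geocoding call; splits on a longitude line for demo only."""
    return "Chicago" if lon < -87.5 else "Gary"


class LocalityFeed:
    """Stores each story under the locality it was created in."""

    def __init__(self, reverse_geocode):
        # reverse_geocode: assumed callable (lat, lon) -> locality name
        self._reverse_geocode = reverse_geocode
        self._by_locality = defaultdict(list)

    def add_story(self, text, lat, lon, created_at):
        """Step 2: resolve and save the locality when the story is inserted."""
        locality = self._reverse_geocode(lat, lon)
        self._by_locality[locality].append(
            {"text": text, "locality": locality, "created_at": created_at}
        )
        return locality

    def feed_for(self, lat, lon, newest_first=False):
        """Step 1: resolve the user's locality and return its stories by time."""
        locality = self._reverse_geocode(lat, lon)
        stories = self._by_locality.get(locality, [])
        return sorted(stories, key=lambda s: s["created_at"], reverse=newest_first)
```

Sorting is ascending by default to match the description above; pass `newest_first=True` if the feed should show the most recent items at the top.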

    So in this scenario, saving the locality or city with each feed item and then comparing it to the user's current city is a good approach.

    For relevant news outside the locality, you can surface trending items on a trending page (on mobile), or use AJAX to embed a trending pagelet (on desktops and laptops). You can use memcache for this: the trending feed will be accessed frequently, so caching it pays off.
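
As a rough illustration of caching the trending feed, the sketch below uses a small in-process cache with expiry as a stand-in for memcache. The `TTLCache` class, the `trending_stories` function, and the definition of "trending" as "most actions" are all assumptions made for the example.

```python
import time
from collections import Counter


class TTLCache:
    """Tiny in-process stand-in for memcache: values expire after a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]  # expired: drop it and report a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._items[key] = (value, self._clock() + ttl_seconds)


def trending_stories(actions, cache, top_n=3, ttl_seconds=60):
    """Return the story ids with the most actions, recomputing only on a cache miss."""
    cached = cache.get("trending")
    if cached is not None:
        return cached
    counts = Counter(a["story_id"] for a in actions)
    result = [story_id for story_id, _ in counts.most_common(top_n)]
    cache.set("trending", result, ttl_seconds)
    return result
```

With this cache-aside pattern, most requests for the trending page are served from the cache and the expensive count runs at most once per TTL window.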

    As for how Facebook handles optimization for this kind of problem, there are a few things they do:

    1. They optimize their table layout for recency and archive older data.
    2. They don't use a centralized database, and for global queries they use memcache. (In your case you can also cache the latest feed items and remove items older than 24 hours.)
    3. They don't use SQL joins.
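
The suggestion to cache the latest feed items and drop anything older than 24 hours could look something like this. It is a minimal sketch: `RecentFeedCache` is a hypothetical name, and it assumes stories are added in chronological order so the oldest item is always at the left end of the deque.

```python
from collections import deque
from datetime import datetime, timedelta


class RecentFeedCache:
    """Keeps only stories newer than max_age (24 hours by default)."""

    def __init__(self, max_age=timedelta(hours=24)):
        self._max_age = max_age
        self._stories = deque()  # oldest on the left, assuming in-order inserts

    def add(self, story, created_at):
        self._stories.append((created_at, story))

    def _evict(self, now):
        """Drop stories older than the cutoff from the old end of the deque."""
        cutoff = now - self._max_age
        while self._stories and self._stories[0][0] < cutoff:
            self._stories.popleft()

    def latest(self, now, limit=20):
        """Return up to `limit` stories, newest first, after evicting stale ones."""
        self._evict(now)
        newest_first = [story for _, story in reversed(self._stories)]
        return newest_first[:limit]
```

Older stories would still live in the database (or an archive table); only the hot, recent slice is kept in memory.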

    For further understanding you can use these sources:

    1. How to optimize queries

    2. Facebook architecture video by Aditya Agarwal

    Scaling SQL has its limits, but all big companies try to overcome these limits through architecture (e.g. master-slave architecture) and caching. They identify the nature of their data and then choose an effective architecture and caching scheme to tackle scalability.
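
Since the question mentions a Django backend, one way a master-slave setup might be wired in is through a database router. The sketch below follows Django's documented router interface (`db_for_read`, `db_for_write`, `allow_relation`, `allow_migrate`); the database aliases `primary`, `replica1`, and `replica2` are assumptions and would have to match entries in your `DATABASES` setting.

```python
import random


class PrimaryReplicaRouter:
    """Send writes to the primary and spread reads across replicas."""

    primary = "primary"                  # assumed alias in settings.DATABASES
    replicas = ["replica1", "replica2"]  # assumed aliases for read replicas

    def db_for_read(self, model, **hints):
        # Reads can go to any replica; pick one at random to spread load.
        return random.choice(self.replicas)

    def db_for_write(self, model, **hints):
        # All writes go to the single primary.
        return self.primary

    def allow_relation(self, obj1, obj2, **hints):
        # Every alias holds the same data, so relations are always allowed.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Run migrations only against the primary; replicas receive them via replication.
        return db == self.primary
```

The router is enabled by listing its dotted path in the `DATABASE_ROUTERS` setting. Keep in mind that replicas can lag behind the primary, so a user may briefly not see an action they just posted if the read lands on a replica.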