Please bear with me while I provide a short background:
- My application retrieves user data from Facebook and LinkedIn.
- Both have very strict terms of use. Specifically, they do not
allow saving user data in my application (well, kind of:
Facebook allows caching; LinkedIn does not allow even caching).
- The naive solution is to call Facebook/LinkedIn whenever I need the
data from them. The problem is that this becomes too slow if I need
lots of data (e.g. profiles of 100 users). Batch calls make things
better, but they have limits and I'm not sure this approach can
scale.
So the question is how to make my application run fast while using data from Facebook/LinkedIn?
If you can share from your experience, or have an example for a site that uses lots of data from Facebook/LinkedIn I'd love to hear.
When you talk about making your application "fast", please note "fast" can mean either "high throughput" or "low latency", and there is a big difference between the two. It would be good to set performance goals for both latency (how quickly each individual user should be served) and throughput (how many users you should be able to serve per unit time).
If getting data from FB/LinkedIn is a bottleneck for throughput,
- Use batched requests.
- If the same data can be used repeatedly, cache as much of it as you can and for as long as you can.
- Make sure your application is able to issue many requests in parallel. Since sending a query over the network is a high-latency operation, you can gain a LOT of throughput this way.
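As a rough illustration of the last point, here is a minimal sketch of issuing many requests in parallel with a thread pool. `fetch_profile` is a hypothetical stand-in for a real FB/LinkedIn API call; the names are mine, not from any SDK:

```python
# Sketch: fetch many profiles concurrently instead of one at a time.
# fetch_profile() is a hypothetical placeholder for a real API request.
from concurrent.futures import ThreadPoolExecutor

def fetch_profile(user_id):
    # In a real app this would be an HTTPS call to the Graph API, etc.
    return {"id": user_id, "name": f"user-{user_id}"}

def fetch_profiles_parallel(user_ids, max_workers=10):
    # Each request mostly waits on the network, so overlapping them
    # with threads multiplies throughput almost for free.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fetch_profile, user_ids))

profiles = fetch_profiles_parallel(range(100))
```

With 10 workers and network-bound requests, 100 fetches take roughly the time of 10 sequential ones instead of 100.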
If getting data from FB/LinkedIn is a bottleneck for latency,
- Again, cache as much data as you can locally.
- If it is possible to "guess" which data will be needed soon, pre-fetch it. (For example: if your user has to fill out a big form and then submit, but there is a critical field in that form which identifies the data you will need, you could use AJAX to make the page send that field back as soon as it is filled in, rather than waiting for the whole form to be submitted!)
- If you need multiple pieces of data to service a single user, DON'T fetch them with multiple sequential requests -- either use a batched request, or multiple parallel requests (perhaps both).
- When it is necessary to make the user wait, "distract" them -- do whatever you can to make the wait seem shorter.
- If there are certain parts of a page which can be rendered without FB/LinkedIn data, send those parts back first, and use AJAX calls to fill in the other parts when the data is ready.
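Since caching shows up in both lists, here is one possible shape for it: a tiny TTL (time-to-live) cache, so entries expire and you never serve data older than the terms of service allow. `fetch_from_api` is a hypothetical loader, not a real SDK call:

```python
# Sketch of a TTL cache: values expire after ttl_seconds, forcing a refetch.
import time

class TTLCache:
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expiry_time, value)

    def get(self, key, loader):
        now = time.monotonic()
        entry = self._store.get(key)
        if entry and entry[0] > now:
            return entry[1]            # still fresh: serve from cache
        value = loader(key)            # missing or stale: refetch
        self._store[key] = (now + self.ttl, value)
        return value

calls = []
def fetch_from_api(key):               # hypothetical API loader
    calls.append(key)
    return {"id": key}

cache = TTLCache(ttl_seconds=60)
first = cache.get("alice", fetch_from_api)
second = cache.get("alice", fetch_from_api)  # served from cache, no API call
```

Set the TTL to whatever each provider's terms permit (e.g. Facebook's caching window), and to zero, i.e. no cache at all, where caching is forbidden.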
If you absolutely must have data XYZ from FB/LinkedIn to serve a user, and the latency of a single API request is N seconds, and your target maximum time to serve each user is < N seconds, the only possible way you can reach your goal is by prefetching data. Maybe when you see the very first page request come in for a user (say for the home page), you can start loading all the data which will be needed for that user into the cache (if it's not already there).
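That prefetch idea might be sketched like this: when the first page request arrives, kick off the expensive fetch in a background thread so it is (hopefully) cached by the time the user needs it. All names here are hypothetical:

```python
# Sketch: warm the cache in the background on the user's first request,
# so later requests find the FB/LinkedIn data already loaded.
import threading

cache = {}

def fetch_all_user_data(user_id):
    # Hypothetical: would issue the batched FB/LinkedIn calls here.
    return {"profile": f"user-{user_id}"}

def on_first_page_request(user_id):
    # Start the fetch without blocking the home-page response.
    def warm():
        cache[user_id] = fetch_all_user_data(user_id)
    t = threading.Thread(target=warm)
    t.start()
    return t

t = on_first_page_request(42)
t.join()  # join only for this sketch; the real app would not block here
```

The later request that actually needs the data then checks the cache first and only waits on the API if the prefetch has not finished.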
Whatever you do, I recommend that you encapsulate your FB/LinkedIn data access code inside a "data access layer". Caching should happen strictly inside the data access layer -- the application code doesn't need to know about the cache. Whether you use batched calls, and whether you issue multiple calls in parallel, are also implementation details which should be kept strictly inside the data access layer.
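To make that concrete, here is one possible shape for such a layer, assuming a hypothetical API client with a batch endpoint (all class and method names are made up for illustration). The application only ever calls `get_profiles`; caching and batching stay hidden inside:

```python
# Sketch of a data access layer: callers never learn whether a profile
# came from the cache or from a batched API call.
class ProfileDataAccessLayer:
    def __init__(self, api_client, cache):
        self._api = api_client
        self._cache = cache

    def get_profiles(self, user_ids):
        missing = [u for u in user_ids if u not in self._cache]
        if missing:
            # One batched call covers everything not already cached.
            for user_id, profile in self._api.batch_get(missing).items():
                self._cache[user_id] = profile
        return [self._cache[u] for u in user_ids]

class FakeApiClient:
    """Stand-in for a real FB/LinkedIn client, for demonstration only."""
    def __init__(self):
        self.batch_calls = 0
    def batch_get(self, user_ids):
        self.batch_calls += 1
        return {u: {"id": u} for u in user_ids}

api = FakeApiClient()
dal = ProfileDataAccessLayer(api, cache={})
batch1 = dal.get_profiles([1, 2, 3])
batch2 = dal.get_profiles([2, 3, 4])  # only user 4 triggers an API call
```

Because the cache and batching live behind one interface, you can later swap in a TTL cache, parallel requests, or a different provider without touching application code.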