javascript node.js rest api-design restify

Node.js API - Duplicate data requests from multiple clients


I have a Node.js API built with Restify that is basically an aggregator for several other APIs, for use inside a Google Sheet.

When the Google Sheet is first opened, it makes several requests to my server for a bunch of data from various APIs, which my server then dutifully looks up and returns.

I have implemented rudimentary memory-based caching: if a request for the same data comes in, it is served from memory until an expiry time is reached (I'm open to moving to Redis for this soon).

My issue is that, quite regularly, a second request for the same data comes in while the first request is still being looked up, parsed, or served, meaning I end up requesting the same data (several gigabytes) in parallel.

How can I effectively pause the second request and have it wait until the data from the first request is available? I don't mind setting a high timeout and waiting for the first request to finish before the second starts; alternatively, some sort of "back off and try again in a minute" logic would be doable.

I imagine promises, or saving callbacks somewhere, would be the way to do this, but I'm not sure what the best practices or recommended approaches are.

It is not possible for me to regularly request and cache the data server-side ahead of time, as the range of values that clients can request is fairly large.


Solution

  • Keep a cache of promises rather than a cache of resolved values. If a request for the same data is already in progress, the cache holds its still-pending promise; the second request can await that same promise, and both requests are answered when the data is ready.

    The rest of your code will need to be refactored to always await the value in the cache, whether or not its promise has settled, and to add the entry to the cache when a request is initiated rather than when it completes.

    If a promise in the cache is already resolved, await only delays the calling async function by a single tick (the continuation is queued as a microtask), so you get the ability to share pending requests basically for free.
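
The approach above could be sketched roughly as follows. This is a minimal illustration rather than a drop-in implementation: `getOrLoad`, `cache`, `slowFetch`, the key `'report:2024'`, and the 60-second default expiry are hypothetical names and values chosen for the example. It assumes a failed lookup should not be cached, and that the expiry clock should start only once the data has arrived.

```javascript
// Hypothetical promise cache: key -> { promise, expiresAt }.
const cache = new Map();

// Returns the cached promise for `key` (pending or settled), or calls
// `loader()` and caches its promise right away so that overlapping
// callers share one upstream request.
function getOrLoad(key, loader, ttlMs = 60000) {
  const existing = cache.get(key);
  if (existing && existing.expiresAt > Date.now()) {
    return existing.promise;
  }

  // A pending entry never expires; the expiry is set once the data arrives.
  const entry = { promise: null, expiresAt: Infinity };
  entry.promise = new Promise((resolve) => resolve(loader())).then(
    (value) => {
      entry.expiresAt = Date.now() + ttlMs;
      return value;
    },
    (err) => {
      // Assumption: failures are not cached, so the next request retries.
      if (cache.get(key) === entry) cache.delete(key);
      throw err;
    }
  );
  cache.set(key, entry);
  return entry.promise;
}

// Demo: two overlapping requests for the same key trigger one upstream call.
let upstreamCalls = 0;
const slowFetch = () => {
  upstreamCalls += 1;
  return new Promise((resolve) => setTimeout(() => resolve({ rows: 3 }), 50));
};

const first = getOrLoad('report:2024', slowFetch);
const second = getOrLoad('report:2024', slowFetch);

Promise.all([first, second]).then(([a, b]) => {
  console.log(upstreamCalls, a === b); // prints "1 true"
});
```

In a Restify handler, the same helper might be called as `getOrLoad(key, () => fetchUpstream(key)).then((data) => res.send(data))`, where `fetchUpstream` stands in for whatever function performs the real lookup. Because every caller receives the same promise, a rejection reaches all waiting requests, so each handler still needs its own error handling.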