I'm writing a node.js app which needs to get some data from a list of pages from a provider:
var list = [
{ url: 'http://www.example.com/1' },
{ url: 'http://www.example.com/2' },
...
{ url: 'http://www.example.com/N' },
];
Currently I'm using async.each, which works nicely:
async.each(
  list, // 1st param is the array of items
  function(elem, callback) { // 2nd param is the function that each item is passed to
    request(elem.url, function (error, response, body) {
      if (!error && response.statusCode == 200) {
        console.log(body);
      }
      callback(); // signal that this item is done, so the final callback can fire
    });
  },
  function(err) { // 3rd param is the function to call when everything's done
    if (err) {
      console.error('Error in the final async callback:', err);
    }
  }
);
The only problem is that the site's server sometimes (understandably) responds with a 403 (Forbidden) status code, because of too many requests from the same IP within a short time.
I see that async provides a whilst() method, too, whose example is:
var count = 0;
async.whilst(
function () { return count < 5; },
function (callback) {
count++;
setTimeout(callback, 1000);
},
function (err) {
// 5 seconds have passed
}
);
But I don't see how to use it with a list, or how to combine it with async.each... :-(
So the question is: How do I limit (throttle) a list of async requests in node.js?
P.S.: To be clearer, I don't want (if possible) to queue the requests, since a single request could take a long time to complete. I just want the requests to be initiated at defined time intervals (say, 5 to 10 seconds between each request).
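To make the P.S. concrete, here is a minimal sketch of starting one request per list item at a fixed interval, without waiting for earlier requests to finish. It uses only plain timers. The names startAtIntervals, startTask and schedule are hypothetical and are not part of async or request; the schedule parameter exists only so that a different timer function can be substituted.

```javascript
// Start one task per item, spaced `intervalMs` apart.
// Tasks are only *started* on schedule; nothing waits for them to finish.
// `startAtIntervals`, `startTask` and `schedule` are hypothetical names.
function startAtIntervals(items, intervalMs, startTask, schedule) {
  schedule = schedule || setTimeout; // defaults to the built-in timer
  items.forEach(function (item, index) {
    // Item 0 starts after 0 ms, item 1 after intervalMs, item 2 after 2 * intervalMs, ...
    schedule(function () {
      startTask(item);
    }, index * intervalMs);
  });
}
```

With the list above, this could be called as startAtIntervals(list, 5000, function (elem) { request(elem.url, handleResponse); }), where handleResponse is a placeholder for the response callback shown earlier. One trade-off of this approach is that there is no single "all done" callback unless you count completions yourself.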
UPDATE:
After alireza david's comment, I tried using async.eachLimit, which looked very promising to me. This is an example of its usage, from the module's GitHub site:
async.eachLimit(
  obj.files,
  limit,
  function (file, complete) {
    complete();
  },
  function (err) {
  }
);
But the limit usage is not documented, and it's not clear to me... If anybody has any clue...
Most of the time, a 403 means you should limit your requests, because the web server thinks you are performing a DDoS attack.
In this situation you should use async.eachLimit():
async.eachLimit(obj.files, 1000,
function (file, complete) {
complete();
},
function (err) {
});
UPDATE
I think I've got it: the limit option is the number of concurrent requests.
You should decrease this number (my suggestion is 2 or 3, just as a test).
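To illustrate what "number of concurrent requests" means, here is a hand-rolled sketch of a concurrency limit in plain JavaScript. It is not the async library's implementation, and eachWithLimit is a hypothetical name. It only shows the idea: at most limit calls are in flight at once, and a new item starts whenever one finishes.

```javascript
// Hand-rolled illustration of a concurrency limit (not the async library's code).
// At most `limit` calls to `iteratee` are in flight at any moment.
function eachWithLimit(items, limit, iteratee, done) {
  var next = 0;         // index of the next item to start
  var running = 0;      // number of iteratee calls currently in flight
  var finished = false; // ensures `done` is called only once

  function launch() {
    if (finished) return;
    if (next >= items.length && running === 0) {
      finished = true;
      return done(null);
    }
    // Fill the free slots, up to `limit` items in flight.
    while (!finished && running < limit && next < items.length) {
      running++;
      iteratee(items[next++], function (err) {
        running--;
        if (finished) return;
        if (err) {
          finished = true;
          return done(err);
        }
        launch(); // a slot freed up: start the next item, or finish
      });
    }
  }

  launch();
}
```

With a limit of 2, the third item does not start until one of the first two has called its callback. Note that a concurrency limit alone does not space requests out in time; if the server counts requests per time window, delaying the call to the callback (for example with setTimeout) is one way to add a pause between items.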