I want to write a crawler as a Chrome extension.
One of its functions visits many web pages. This is my plan:
visit page, load page, crawl, visit next page, ...
I want the program to run sequentially, but 'load page' is asynchronous, so the code below does not work:
function load_and_download(checked_list) {
    for (var item of checked_list) {
        visit_page(item); // async function
        crawl_page(item);
    }
}
I found a solution like this:
function load_and_download(checked_list, now_index) {
    visit_page(checked_list[now_index], function () { // load finished
        crawl_page(checked_list[now_index]);
        if (checked_list.length > now_index + 1)
            load_and_download(checked_list, now_index + 1);
    });
}
The code above works for me, but it is a recursive function. If checked_list is very long, is it still safe? I know recursive code can overflow the stack. Is JavaScript safe from that?
What you have here is not a recursive function, provided visit_page calls back asynchronously. I like to call this pattern pseudorecursion: because of the async callback, every call to load_and_download happens in a separate task, so the call stack only ever contains one call to that function (per task). Once the async action is scheduled in visit_page, the call stack unwinds and the next task is processed. So although it looks like recursion, no recursion is actually going on.
Here's a simplified example illustrating this:
function asynchronous(callback) {
    // Here an asynchronous action gets started and processed somewhere
    setTimeout(callback, 1000);
    // Execution continues synchronously and the callstack unwinds from here on
}

function pseudoRecursive() {
    asynchronous(function () {
        console.log("As we're in a new task, the callstack is basically empty:\n", (new Error()).stack);
        pseudoRecursive();
    });
    // Here the callstack unwinds again
}

pseudoRecursive();
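If the self-calling shape still bothers you, the same sequential flow can also be written as a plain loop with async/await. This is only a sketch under assumptions: visit_page and crawl_page below are hypothetical stand-ins (the real ones from the question aren't shown), visitPageAsync is a helper name I made up, and I'm assuming visit_page takes a callback that it invokes asynchronously once the page has loaded.

```javascript
// Hypothetical stand-in for the question's visit_page:
// it "loads" the page and then calls back asynchronously.
function visit_page(item, callback) {
    setTimeout(callback, 0);
}

// Hypothetical stand-in for crawl_page: just records what was crawled.
const crawled = [];
function crawl_page(item) {
    crawled.push(item);
}

// Helper (assumed name): wraps the callback-style visit_page in a Promise
// so that it can be awaited.
function visitPageAsync(item) {
    return new Promise(function (resolve) {
        visit_page(item, resolve);
    });
}

// Each iteration waits until the page has loaded before crawling it,
// so the pages are processed strictly one after another.
async function load_and_download(checked_list) {
    for (const item of checked_list) {
        await visitPageAsync(item);
        crawl_page(item);
    }
}
```

Calling load_and_download(["a", "b", "c"]) returns a promise that resolves once every page has been visited and crawled in list order. As with the callback version, each continuation after an await runs with a fresh call stack, so the length of checked_list does not affect stack depth.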