I am crawling a rather unstable website that sometimes returns a 503 error, which goes away only when the page is refreshed. So I wrote this code to make my crawler retry the 503 page until the content has been assigned to a variable:
repeat{
info = NA
info = read_html(url2)
if(is.na(info) == F) {
break
}
}
info
But for some reason this does not work. R still throws this error, which it should not:
Error in open.connection(x, "rb") : HTTP error 503.
> info
[1] NA
Sometimes it even gives me these warnings, although in that case the content is assigned to the variable info with no problem:
Warning messages:
1: In for (i in seq_along(cenv$extra)) { :
closing unused connection 6 (url)
2: In for (i in seq_along(cenv$extra)) { :
closing unused connection 5 (url)
How can I write code that retries the 503 pages?
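You need to capture the error: when read_html() hits a 503 it throws an error, which aborts the repeat loop before the is.na() check is ever reached. This should work: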
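library(rvest)  # re-exports xml2::read_html()

counter = 0
repeat {
  counter = counter + 1
  info = tryCatch(
    read_html(url2),
    # if you want to capture warnings as well
    warning = function(w) {
      Sys.sleep(30)
      NA
    },
    error = function(e) {
      # wait before the next attempt, then signal failure with NA
      Sys.sleep(30)
      NA
    }
  )
  # identical() always returns a single TRUE/FALSE; is.na() is vectorised
  # and may return more than one value for a parsed document, which if()
  # does not accept
  if (!identical(info, NA) || counter >= 10) {
    break
  }
}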
This is also the gist of what purrr::insistently does.
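For comparison, here is a minimal sketch of the same retry logic built on purrr::insistently with a fixed delay between attempts. The helper name retry_fetch and its defaults (10 attempts, 30 seconds) are assumptions chosen to mirror the loop above; they are not part of purrr. Note that insistently retries on errors only, not on warnings, and that it raises an error itself once the attempts are used up instead of returning NA.

```r
library(purrr)

# Hypothetical helper: wrap any fetch function so that it is called again
# whenever it throws an error, waiting `pause` seconds between attempts
# and giving up after `max_times` attempts.
retry_fetch <- function(fetch, max_times = 10, pause = 30) {
  insistently(
    fetch,
    rate = rate_delay(pause = pause, max_times = max_times),
    quiet = TRUE  # do not print a message for every failed attempt
  )
}

# Usage with the question's url2 (needs rvest or xml2 installed):
# read_html_retry <- retry_fetch(rvest::read_html)
# info <- read_html_retry(url2)
```

If you would rather wait longer after each failure, rate_backoff() can be passed as the rate instead of rate_delay().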