I am new to web scraping and I am trying to scrape movie budget data from IMDb. Here is my code:
budget=vector()
for(i in 1:50){
remDr$navigate('http://www.imdb.com/search/title?sort=moviemeter,asc&start=1&title_type=feature&year=2011,2011')
webElems=remDr$findElements('css selector','.wlb_lite+ a')
webElems[[i]]$clickElement()
b=remDr$findElements('css selector','.txt-block:nth-child(11)')
b_text=unlist(lapply(b, function(x){x$getElementText()}))
if(is.null(b_text)){
budget=c(budget,'NULL')
} else {
budget=c(budget,b_text)
}
print(b_text)
}
Each results page lists 50 movies. I want to click each link in turn and collect the corresponding budget. When I run the steps outside a loop, the code works well, but inside the loop it always returns 'NULL'. I suspect this is because the pages have not finished loading when the elements are looked up. I tried the 'setTimeout' and 'setImplicitWaitTimeout' commands, but they did not help. Can anybody help me out?
Try calling
Sys.sleep(time in seconds)
in each iteration of the loop instead of setTimeout.
That has solved problems like yours for me.
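Applied to the loop from the question, a fixed pause after each page change might look like the sketch below. It assumes `remDr` is an already-open RSelenium remote driver. The wrapper name `scrape_budgets` and the `pause` argument are made up for illustration, and the 3-second default is a guess that will likely need tuning for your connection.

```r
# Sketch only: assumes `remDr` is an open RSelenium remoteDriver.
# `scrape_budgets` and `pause` are illustrative names, not part of RSelenium.
scrape_budgets <- function(remDr, n = 50, pause = 3) {
  url <- 'http://www.imdb.com/search/title?sort=moviemeter,asc&start=1&title_type=feature&year=2011,2011'
  budget <- character(0)
  for (i in seq_len(n)) {
    remDr$navigate(url)
    Sys.sleep(pause)  # give the results page time to load
    webElems <- remDr$findElements('css selector', '.wlb_lite+ a')
    if (length(webElems) < i) {
      # The i-th link was not found: record a placeholder and move on
      budget <- c(budget, NA_character_)
      next
    }
    webElems[[i]]$clickElement()
    Sys.sleep(pause)  # give the movie page time to load
    b <- remDr$findElements('css selector', '.txt-block:nth-child(11)')
    b_text <- unlist(lapply(b, function(x) x$getElementText()))
    # Store NA when the budget block is missing, otherwise its text
    budget <- c(budget,
                if (is.null(b_text)) NA_character_ else paste(b_text, collapse = ' '))
  }
  budget
}
```

Calling `budget <- scrape_budgets(remDr)` would then return one entry per movie, with `NA` where no budget block was found. A fixed pause is simple but slow. If it works, you can shorten `pause` until lookups start failing again.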