This is the web page where I'm trying to get the information I need: https://www.immobiliare.it/ricerca-mappa/Torino,TO/#/linkZona_/latitudine_45.04463/longitudine_7.68199/idContratto_1/idCategoria_23/zoom_16/pag_1
and this is the XPath of the node I'm interested in:
//*[@id="box-listing"]/div[1]
While using
out %>%html_node(xpath = '//*[@id="box-listing"]/div[1]')
I get the following error:
{xml_missing}
<NA>
To solve your problem, I suggest using RSelenium.
There are two big families of web sites: static and dynamic.
A static site has the information we need directly in the HTML source (for example, a Wikipedia page). A dynamic site does not have the information in the source it sends; JavaScript builds it in the browser each time the page loads (for example, TripAdvisor). This is why html_node returns {xml_missing} here: the node does not exist in the raw source that rvest reads.
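The difference can be reproduced offline. The sketch below does not use the real page: both HTML strings are hypothetical stand-ins, one for the raw source a dynamic site might send (empty container) and one for the same page after the browser has run its JavaScript. It only assumes that the rvest package is installed.

```r
library(rvest)

# Hypothetical raw source of a dynamic page: the listing container is
# empty because JavaScript has not filled it yet.
raw_html <- '<html><body><div id="box-listing"></div></body></html>'
page <- read_html(raw_html)

# Same XPath as in the question: there is no child div to match.
node <- html_node(page, xpath = '//*[@id="box-listing"]/div[1]')
is_missing <- inherits(node, "xml_missing")

# Hypothetical markup after the browser has rendered the page.
rendered_html <- '<html><body><div id="box-listing"><div>Immobile</div></div></body></html>'
rendered_node <- html_node(read_html(rendered_html),
                           xpath = '//*[@id="box-listing"]/div[1]')
rendered_text <- html_text(rendered_node)
```

Here is_missing is TRUE for the raw source, while the same XPath finds the node once the content is actually present in the HTML.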
Thanks to the RSelenium library, we are able to scrape information from a dynamic web site.
What is Selenium?
RSelenium is an R library, but Selenium is also available for Python, Java and other languages. It drives a real browser, so it is able to emulate human behaviour.
The main use of Selenium is automated testing of web applications, but that is not our case here.
Selenium is a very big world (see here to go deeper).
For RSelenium, I suggest checking these links:
Below is a small RSelenium example for your question:
library(RSelenium)
# Start the RSelenium environment (Selenium server plus a Firefox client)
driver <- rsDriver(browser = "firefox", port = 4445L)
remote_driver <- driver[["client"]]
# Send the URL to the Firefox browser
remote_driver$navigate("https://www.immobiliare.it/ricerca-mappa/Torino,TO/#/linkZona_/latitudine_45.04462/longitudine_7.68199/idContratto_1/idCategoria_23/zoom_16/pag_1")
Below are some examples of what RSelenium can do:
# Get the text of the first listing
text_1 <- remote_driver$findElement(using = "css selector", '#box-listing > div:nth-child(1) > div:nth-child(1)')$getElementText()
print(text_1)
[[1]]
[1] "PREMIUM\nImmobile\n€ 150.000\n60 m² • 2 locali"
# Click the element
remote_driver$findElement(using = "css selector", '#box-listing > div:nth-child(1) > div:nth-child(1)')$clickElement()
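Once the browser has rendered the page, the original rvest code can also be reused on the rendered source. The sketch below is only one possible way to do it: extract_first_listing is a hypothetical helper (not part of rvest or RSelenium) that takes the page HTML as a single string, and the commented lines assume the remote_driver and driver objects created above are still alive.

```r
library(rvest)

# Hypothetical helper: parse rendered HTML (one string) and return the
# text of the first listing, or NA if the node is not there.
extract_first_listing <- function(page_source) {
  node <- html_node(read_html(page_source),
                    xpath = '//*[@id="box-listing"]/div[1]')
  html_text(node, trim = TRUE)
}

# With the session from above still open, usage would look like:
# html <- remote_driver$getPageSource()[[1]]
# extract_first_listing(html)
#
# When finished, close the browser and stop the Selenium server:
# remote_driver$close()
# driver$server$stop()
```

Keeping the parsing in a plain function that takes a string makes it possible to try the XPath on saved HTML without starting a browser each time.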