I want to scrape the product links (href) from this webpage: https://www.artfinder.com/editors-picks/theme/amazing-techniques/blurred-lines/#/
I am working in R and cannot figure out the right selector to pass to html_nodes(). I tried ".fit-in" and "a.af-place.fit-in", but neither returns the links.
Could you help me please?
The structure of this page does not work with selectors in rvest, presumably because the product list is filled in by script after the initial HTML arrives. If you use something like Chrome's developer tools, you can examine the resources the page requests, and it turns out there is an API that returns the data in JSON format.
So, one way of getting the data you need (which will need a little more tidying up) would be...
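library(jsonlite)
# API endpoint found via the browser's developer tools (network tab)
prod_url <- "https://www.artfinder.com/api/theme/amazing-techniques/blurred-lines/products/?page=1&paginate=1000&sort=best_match&limit=1000"
# fromJSON() downloads and parses the JSON; the products are in the "results" element
prods <- fromJSON(prod_url)$results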
This returns a data frame with lots of information, including a column containing the URLs.
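As a rough sketch of the tidying-up step, the snippet below shows one way to locate the link column and turn its values into full URLs. The helper `find_url_columns()`, the toy data frame, and its column names are all invented for illustration. The real column name in `prods` should be checked with `names(prods)`, and whether the API returns relative or absolute links is an assumption here.

```r
# Hypothetical helper: names of columns that look like they hold URLs.
find_url_columns <- function(df) {
  grep("url", names(df), ignore.case = TRUE, value = TRUE)
}

# Toy stand-in for `prods`; column names and values are made up.
toy_prods <- data.frame(
  name = c("Print A", "Print B"),
  product_url = c("/product/print-a/", "/product/print-b/"),
  stringsAsFactors = FALSE
)

url_cols <- find_url_columns(toy_prods)
links <- toy_prods[[url_cols[1]]]

# Assumption: links may be relative paths, so prefix the site root
# only where a link does not already start with http:// or https://.
full_links <- ifelse(
  grepl("^https?://", links),
  links,
  paste0("https://www.artfinder.com", links)
)
```

With the real data, replacing `toy_prods` with `prods` should work the same way, provided the link column's name contains "url".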