Tags: r, web-scraping, rvest, httr

Click button using R + httr


I'm trying to scrape randomly generated names from a website.

library(httr)
library(rvest)
url <- "https://letsmakeagame.net/tools/PlanetNameGenerator/"
mywebsite <- read_html(url) %>%
    html_nodes(xpath="//div[contains(@id,'title')]")

However, that does not work. I'm assuming I have to «click» the «generate» button before extracting the content. Is there a simple way (without RSelenium) to achieve that?

Something similar to:

POST(url,
 body = list("EntryPoint.generate()" = T), 
 encode = "form") -> res
res_t <- content(res, as="text")

Thanks!


Solution

  • rvest isn't much help here: the planet names are not requested from a remote service but generated locally with JavaScript, which is what the EntryPoint.generate() call does. A relatively simple way is to use chromote, though closing its session and browser process seems somewhat messy at the moment:

    library(chromote)
    b <- ChromoteSession$new()
    {
      b$Page$navigate("https://letsmakeagame.net/tools/PlanetNameGenerator")
      b$Page$loadEventFired()
    }
    
    # call EntryPoint.generate(), read result from <p id="title"></p> element,
    # replicate 10x
    replicate(10, b$Runtime$evaluate('EntryPoint.generate();document.getElementById("title").innerText')$result$value)
    #>  [1] "Torade"      "Ukiri"       "Giconerth"   "Dunia"       "Brihoria"   
    #>  [6] "Tiulaliv"    "Giahiri"     "Zuthewei 4A" "Elov"        "Brachomia"
    
    b$close()
    #> [1] TRUE
    b$parent$close()
    #> Error in self$send_command(msg, callback = callback_, error = error_, : Chromote object is closed.
    b$parent$get_browser()$close()
    #> [1] TRUE
    

    Created on 2023-01-25 with reprex v2.0.2
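
Since the cleanup is the awkward part, the same steps could be wrapped in a function that closes the session on exit. This is only a sketch: generate_names() and its arguments are hypothetical names, not part of chromote, and it assumes that EntryPoint.generate() and the element with id "title" behave as in the reprex above. It closes only the session, not the browser process that other sessions may share.

```r
# Hypothetical helper (not part of chromote): returns n generated names.
generate_names <- function(n = 10,
                           url = "https://letsmakeagame.net/tools/PlanetNameGenerator") {
  b <- chromote::ChromoteSession$new()
  # Close the session when the function exits, even if an error occurs.
  on.exit(b$close(), add = TRUE)

  # Navigate and wait for the page's load event; inside a function body
  # these two calls run back to back, like the braced block above.
  b$Page$navigate(url)
  b$Page$loadEventFired()

  # Run the page's generator, then read the text of the "title" element.
  js <- 'EntryPoint.generate();document.getElementById("title").innerText'
  vapply(
    seq_len(n),
    function(i) b$Runtime$evaluate(js)$result$value,
    character(1)
  )
}

# Example call (requires chromote, a local Chrome/Chromium and network access):
# planet_names <- generate_names(5)
```

Using vapply() with character(1) instead of replicate() makes the call fail early if the page ever returns something other than a single string.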