I'm trying to pull in data using read_html, which takes a URL. I build the URL string by concatenation and store it in 'x'. Since the concatenation wraps the URL in literal quotation marks ('"..."'), I have to use print with quote = FALSE to get rid of the backslashes (see the console output below).
Once I plug x into the rest of the read_html command, I get an error. Is there a better way to go about this?
Update:
> x<-paste0('"https://www.govtrack.us/congress/bills/', bills[15821,4],"/",bills[15821,1],'"')
> x
[1] "\"https://www.govtrack.us/congress/bills/118/HR8774\""
> print(x, quote = FALSE)
[1] "https://www.govtrack.us/congress/bills/118/HR8774"
> read_html(print(x, quote=FALSE)%>% html_nodes("#UserPositionModal+ p") %>% html_text())
[1] "https://www.govtrack.us/congress/bills/118/HR8774"
Error in UseMethod("xml_find_all") :
no applicable method for 'xml_find_all' applied to an object of class "character"
> read_html("https://www.govtrack.us/congress/bills/118/hr8774")%>% html_nodes("#UserPositionModal+ p") %>% html_text()
[1] "Making appropriations for the Department of Defense for the fiscal year ending September 30, 2025, and for other purposes."
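You don't need an extra set of quotes when the variable is already a string. The quotes R shows when it prints a character vector are only part of the display, not part of the value. Wrapping the URL in literal '"' characters puts those characters into the string itself, and print(x, quote = FALSE) only changes how the value is shown, not the value that gets passed on. Note also that in your call the pipe sits inside read_html(...), so html_nodes() receives the character string rather than a parsed document, which is what the xml_find_all error is reporting.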
x <- paste0("https://www.govtrack.us/congress/bills/", "118", "/", "hr8774")
x
#[1] "https://www.govtrack.us/congress/bills/118/hr8774"
rvest::read_html(x) |>
rvest::html_nodes("#UserPositionModal+ p") |>
rvest::html_text()
#[1] "Making appropriations for the Department of Defense for the fiscal
#year ending September 30, 2025, and for other purposes."
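If it helps to see the difference directly, here is a small sketch that needs no network access. The values of congress and bill_id are stand-ins I made up for bills[15821, 4] and bills[15821, 1]; the variable names quoted, plain and returned are only for illustration.

```r
# Hypothetical stand-ins for bills[15821, 4] and bills[15821, 1]
congress <- "118"
bill_id  <- "HR8774"

# The question's version: literal quote characters become part of the string
quoted <- paste0('"https://www.govtrack.us/congress/bills/', congress, "/", bill_id, '"')

# The version without the extra quotes
plain <- paste0("https://www.govtrack.us/congress/bills/", congress, "/", bill_id)

# cat() writes the raw characters, so embedded quotes are visible as-is
cat(quoted, "\n")
cat(plain, "\n")

# The quoted version is longer by exactly the two literal quote characters
nchar(quoted) - nchar(plain)

# TRUE: the first character of the quoted version is a quote, not "h"
startsWith(quoted, "\"")

# print() hands back its argument unchanged; quote = FALSE only affects display
returned <- print(quoted, quote = FALSE)
identical(returned, quoted)
```

So whatever is passed on from print(x, quote = FALSE) still contains the quote characters, which is why building the URL without them is the simpler fix.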