I am trying to scrape the table on this page: WEB TABLE
I have tried copying the XPath, but it does not return anything:
library(rvest)
url = "https://www.barchart.com/options/stocks-by-sector?page=1"
pg = read_html(url)
# the id value must be quoted inside the XPath predicate
pg %>% html_nodes(xpath = "//*[@id='main-content-column']/div/div[4]/div/div[2]/div")
I found the following link and feel I am getting closer....
Using the same process, I found the updated URL by watching the XHR requests:
url = paste0("https://www.barchart.com?access_token=",token,"/proxies/core-api/v1/quotes/",
Where the token is found within the scope:
token = "eyJpdiI6IjJZMDZNOGYwUDk4dE1OcVc4ekdnUGc9PSIsInZhbHVlIjoib2lYcWtzRi9VN3ovbzdER2NhQlg0KzJQL1ZId2ZOeWpwSTF5YThlclN1SW9YSEtJbG9kR0FLbmRmWmtNcmd1eCIsIm1hYyI6ImU4ODA3YzZkZGUwZjFhNmM1NTE4ZjEzNmZkNThmZDY4ODE1NmM0YTM1Yjc2Y2E2OWVkNjZiZTE3ZDcxOGFlZjMifQ"
I do not know whether I am placing the token correctly in the URL, but when I ran:
fixture <- jsonlite::read_json(url,simplifyVector = TRUE)
I received the following error:
Error in parse_con(txt, bigint_as_char) :
lexical error: invalid char in json text.
<!doctype html> <html itemscope
(right here) ------^
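The token needs to be sent as a request header named x-xsrf-token,
not passed as a URL parameter. The HTML in your error message shows that the server returned a regular web page rather than JSON.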
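As a minimal sketch of the difference, assuming the httr package is installed: the token goes into a header object that is passed alongside the URL, rather than being pasted into the URL string. The token value and the api_url name below are placeholders for illustration, not values taken from the site.

```r
library(httr)

# Placeholder value (an assumption); the real token comes from the
# site's XSRF-TOKEN cookie.
token <- "example-token"

# Build a request config that carries the token as a header
# instead of appending it to the URL as a query parameter.
xsrf_header <- add_headers(`x-xsrf-token` = token)

# Hypothetical usage, where api_url is the endpoint seen in the XHR tab:
# resp <- GET(api_url, xsrf_header)
# stop_for_status(resp)
```

The header name must be wrapped in backticks because it contains hyphens, which are not valid in a bare R argument name.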
Also, the token value may change between sessions, so you need to read it from the cookie. After that, convert the data to a data frame to get the result:
pg <- html_session("https://www.barchart.com/options/stocks-by-sector?page=1")
cookies <- pg$response$cookies
token <- URLdecode(dplyr::recode("XSRF-TOKEN", !!!setNames(cookies$value, cookies$name)))
pg <-
pg %>% rvest:::request_GET(
config = httr::add_headers(`x-xsrf-token` = token)
data_raw <- httr::content(pg$response)
data <-
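The cookie lookup in the code above can also be written with base R subsetting, which some readers may find easier to follow than dplyr::recode. This is only a sketch: the cookie table below is made up for illustration, shaped like the name and value columns of pg$response$cookies, and the cookie names and values are hypothetical.

```r
# Hypothetical cookie table with the same name/value columns
# as pg$response$cookies; the values are invented for this example.
cookies <- data.frame(
  name  = c("example_session", "XSRF-TOKEN"),
  value = c("abc123", "eyJpdiI6%3D%3D"),
  stringsAsFactors = FALSE
)

# Select the XSRF-TOKEN row and undo the percent-encoding
# that is applied to cookie values.
token <- utils::URLdecode(cookies$value[cookies$name == "XSRF-TOKEN"])
```

With the sample values above, the encoded %3D sequences decode to = characters, so token ends up as "eyJpdiI6==".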