I am trying to scrape the table in : WEB TABLE
I have tried copying the xpath but it does not return anything:
require("rvest")
url = "https://www.barchart.com/options/stocks-by-sector?page=1"
pg = read_html(url)
pg %>% html_nodes(xpath="//*[@id=main-content-column]/div/div[4]/div/div[2]/div")
EDIT
I found the following link and feel I am getting closer....
So by using the same process I found the updated link by watching the XHR updates:
url = paste0("https://www.barchart.com?access_token=",token,"/proxies/core-api/v1/quotes/",
"get?lists=stocks.optionable.by_sector.all.us&fields=symbol%2CsymbolName",
"%2ClastPrice%2CpriceChange%2CpercentChange%2ChighPrice%2ClowPrice%2Cvolume",
"%2CtradeTime%2CsymbolCode%2CsymbolType%2ChasOptions&orderBy=symbol&orderDir=",
"asc&meta=field.shortName%2Cfield.type%2Cfield.description&hasOptions=true&page=1&limit=100&raw=1")
Where the token is found within the scope:
token = "eyJpdiI6IjJZMDZNOGYwUDk4dE1OcVc4ekdnUGc9PSIsInZhbHVlIjoib2lYcWtzRi9VN3ovbzdER2NhQlg0KzJQL1ZId2ZOeWpwSTF5YThlclN1SW9YSEtJbG9kR0FLbmRmWmtNcmd1eCIsIm1hYyI6ImU4ODA3YzZkZGUwZjFhNmM1NTE4ZjEzNmZkNThmZDY4ODE1NmM0YTM1Yjc2Y2E2OWVkNjZiZTE3ZDcxOGFlZjMifQ"
However, I do not know if I am placing the token where I should in the URL, but when I ran:
fixture <- jsonlite::read_json(url,simplifyVector = TRUE)
I received the following error:
Error in parse_con(txt, bigint_as_char) :
lexical error: invalid char in json text.
<!doctype html> <html itemscope
(right here) ------^
The token needs to be sent as a request header named x-xsrf-token
not by pass to the parameters:
Also, the token value might change over sessions so you need to get it in the cookie. After that, convert the data to a data frame and get the result:
library(rvest)
pg <- html_session("https://www.barchart.com/options/stocks-by-sector?page=1")
cookies <- pg$response$cookies
token <- URLdecode(dplyr::recode("XSRF-TOKEN", !!!setNames(cookies$value, cookies$name)))
pg <-
pg %>% rvest:::request_GET(
"https://www.barchart.com/proxies/core-api/v1/quotes/get?lists=stocks.optionable.by_sector.all.us&fields=symbol%2CsymbolName%2ClastPrice%2CpriceChange%2CpercentChange%2ChighPrice%2ClowPrice%2Cvolume%2CtradeTime%2CsymbolCode%2CsymbolType%2ChasOptions&orderBy=symbol&orderDir=asc&meta=field.shortName%2Cfield.type%2Cfield.description&hasOptions=true&page=1&limit=1000000&raw=1",
config = httr::add_headers(`x-xsrf-token` = token)
)
data_raw <- httr::content(pg$response)
data <-
purrr::map_dfr(
data_raw$data,
function(x){
as.data.frame(x$raw)
}
)