I am trying to scrape the fixture list from this website:
https://www.nrl.com/draw/?competition=111&round=1&season=2024
The output should be:
Sea Eagles, Rabbitohs
Roosters, Broncos
Knights, Raiders, etc.
I have written the following code:
library(rvest)
url <- "https://www.nrl.com/draw/?competition=111&round=1&season=2024"
page <- read_html(url)
contentnodes <- page %>%
  html_nodes("div.u-spacing-mt-24.pre-quench") %>%
  html_attr("q-data") %>%
  jsonlite::fromJSON()
but I am getting the following error:
lexical error: invalid char in json text NA
Reading online, some suggest the data is HTML rather than JSON, but I have scraped a different page on the same website with similar code, so I am not sure what has gone wrong here.
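The error text ends in NA, which suggests that html_attr() found a matching node but no q-data attribute on it, so fromJSON() was handed NA instead of a JSON string. A minimal offline sketch of that situation, assuming a made-up HTML snippet (not the real NRL page) where the div exists without the attribute:

```r
library(rvest)

# Hypothetical stand-in for the page: the div matches the selector
# but carries no q-data attribute.
page <- minimal_html('<div class="u-spacing-mt-24 pre-quench"></div>')

raw <- page %>%
  html_nodes("div.u-spacing-mt-24.pre-quench") %>%
  html_attr("q-data")

print(raw)         # NA, because the attribute is missing
print(is.na(raw))  # TRUE

# Passing that NA on to the JSON parser is expected to fail, as in the question.
# tryCatch() keeps the script running and just reports the message.
msg <- tryCatch(
  jsonlite::fromJSON(raw),
  error = function(e) conditionMessage(e)
)
print(msg)
```

Checking is.na() on the attribute before parsing makes it clear whether the problem is the selector, the attribute, or the JSON itself.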
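The draw data also appears to be served as JSON from the page's data endpoint, so it can be requested directly instead of parsing the HTML:
library(tidyverse)
library(httr2)
"https://www.nrl.com/draw//data?competition=111&season=2024" %>%
  request() %>%
  req_perform() %>%
  # parse the JSON body, simplifying arrays into data frames where possible
  resp_body_json(simplifyVector = TRUE) %>%
  pluck("fixtures") %>%
  # flatten the nested team columns into homeTeam_* and awayTeam_*
  unnest(c(homeTeam, awayTeam), names_sep = "_") %>%
  select(contains("nickName"), contains("odds"))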
# A tibble: 8 × 4
  homeTeam_nickName awayTeam_nickName homeTeam_odds awayTeam_odds
  <chr>             <chr>             <chr>         <chr>
1 Sea Eagles        Rabbitohs         2.17          1.69
2 Roosters          Broncos           2.51          1.53
3 Knights           Raiders           1.42          2.87
4 Warriors          Sharks            1.60          2.34
5 Storm             Panthers          2.24          1.65
6 Eels              Bulldogs          1.47          2.70
7 Titans            Dragons           1.49          2.64
8 Dolphins          Cowboys           2.67          1.48
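To get the "Home, Away" lines asked for in the question, the two nickname columns can be pasted together. A minimal sketch, assuming a tibble shaped like the result above; the first three rows are typed in by hand here (the name fixtures is just for illustration) so the example runs without a network call:

```r
library(tibble)

# First three rows of the result above, entered manually for illustration.
fixtures <- tribble(
  ~homeTeam_nickName, ~awayTeam_nickName,
  "Sea Eagles",       "Rabbitohs",
  "Roosters",         "Broncos",
  "Knights",          "Raiders"
)

# Combine home and away nicknames into one "Home, Away" string per fixture.
pairs <- paste(fixtures$homeTeam_nickName, fixtures$awayTeam_nickName, sep = ", ")

writeLines(pairs)
# Sea Eagles, Rabbitohs
# Roosters, Broncos
# Knights, Raiders
```

The same paste() call should work on the full tibble returned by the pipeline if it is assigned to a variable first.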