I am trying to use the rvest package to scrape the release calendar from this website: https://www.cso.ie/en/csolatestnews/releasecalendar/
By default, the page shows the release dates for the coming 7 days, which is just what I'm looking for. However, when I use the read_html function from rvest, it doesn't appear to pick up those default entries as text, so I'm finding it difficult to extract the information.
Any help here would be great.
library(rvest)
library(dplyr)
library(xml2)

# Parse the calendar page, then select every table cell
page <- read_html("https://www.cso.ie/en/csolatestnews/releasecalendar")
test <- page %>% html_nodes("td")
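library(tidyverse)
library(httr2)

# The release calendar data is available as a JSON file, so request that file
# directly and keep its "releases" element as a tibble
"https://cdn.cso.ie/static/data/ReleaseCalendar.json" %>%
  request() %>%
  req_perform() %>%
  resp_body_json(simplifyVector = TRUE) %>%
  pluck("releases") %>%
  as_tibble()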
# A tibble: 220 × 10
dateindex releasedate dayname title refpe…¹ status sector subse…² subse…³ comment
<int> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
1 5325 20/03/2023 Monday "Fuel Excise… Januar… Confi… Envir… Energy https:… ""
2 5185 21/03/2023 Tuesday "Transport B… March … Confi… Busin… Transp… http:/… ""
3 5066 22/03/2023 Wednesday "Wholesale P… Februa… Confi… Econo… Prices http:/… ""
4 5475 22/03/2023 Wednesday "Environment… 2020 Confi… Envir… Enviro… http:/… ""
5 5475 22/03/2023 Wednesday "Environment… 2020 Confi… Envir… Enviro… https:… ""
6 5476 24/03/2023 Friday "COVID-19 Va… Series… Confi… Peopl… Health http:/… ""
7 5150 24/03/2023 Friday "Vital Stati… Quarte… Confi… Peopl… Births… http:/… ""
8 5151 28/03/2023 Tuesday "Livestock S… Februa… Confi… Busin… Agricu… http:/… ""
9 5068 28/03/2023 Tuesday "Retail Sale… Februa… Confi… Busin… Servic… http:/… ""
10 5152 29/03/2023 Wednesday "Crops and L… 2022 Confi… Busin… Agricu… http:/… ""
# … with 210 more rows, and abbreviated variable names ¹refperiod, ²subsector, ³subsectorURL
# ℹ Use `print(n = ...)` to see more rows
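
Since the question asks only for the coming 7 days, the tibble above could then be narrowed by date. The following is a minimal sketch in base R, assuming `releasedate` is always a `dd/mm/yyyy` string as it appears in the output; `upcoming_releases()` and `sample_releases` are hypothetical names used only for illustration, and the window is taken here to be `from` up to, but not including, `from + days`.

```r
# Hypothetical helper: keep rows whose release date falls in [from, from + days).
upcoming_releases <- function(releases, from = Sys.Date(), days = 7) {
  # Assumption: releasedate is a character column in dd/mm/yyyy form
  release_day <- as.Date(releases$releasedate, format = "%d/%m/%Y")
  keep <- !is.na(release_day) &
    release_day >= from &
    release_day < from + days
  releases[keep, , drop = FALSE]
}

# Small stand-in for the real data, using dates and day names from the output above
sample_releases <- data.frame(
  releasedate = c("20/03/2023", "24/03/2023", "28/03/2023"),
  dayname = c("Monday", "Friday", "Tuesday"),
  stringsAsFactors = FALSE
)

# With a fixed start date, only the first two rows fall inside the 7-day window
next_week <- upcoming_releases(sample_releases, from = as.Date("2023-03-20"))
print(next_week)
```

On the real data, the same call would take the tibble returned by the pipeline above, with `from` left at its default of `Sys.Date()`.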