I am trying to download data in R from a webpage (https://www.portaldaindustria.com.br/estatisticas/indicadores-industriais/), but the file is only available by clicking a button.
Basically, I just need to click the Download button next to "Série Recente".
I tried taking the URL of the xlsx file and passing it to download.file(), but the URL changes every month. I also tried a POST request with httr:
library(httr)
res <- POST("https://www.portaldaindustria.com.br/estatisticas/indicadores-industriais/",
            body = list(`DOWNLOAD` = ""),
            encode = "form",
            write_disk("Ind.xlsx"))
When I try this it returns an empty Excel file. What am I doing wrong?
Why not simply parse the source page and gather the required URL dynamically? You can use a CSS attribute selector with the *= (contains) operator for this, targeting the link whose href contains the substring serie-recente. You can then pass that URL to download.file().
library(rvest)

url <- "https://www.portaldaindustria.com.br/estatisticas/indicadores-industriais/"
download_folder <- "<your_download_folder_path>/"

# Find the first element whose href contains "serie-recente" and take its URL
download_link <- read_html(url) |>
  html_element("[href*=serie-recente]") |>
  html_attr("href")

# Save under the original file name; mode = "wb" keeps the binary xlsx intact
download.file(download_link,
              paste0(download_folder, basename(download_link)),
              mode = "wb")
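If the href ever turns out to be relative, or the link disappears after a page redesign, the call to download.file() may fail with an unhelpful error. Below is a minimal offline sketch of how one might guard against both cases, using xml2::url_absolute() to resolve the link against the page URL. The inline HTML and its file paths are made up for illustration, so the real page's markup may well differ.

```r
library(rvest)
library(xml2)

# Hypothetical stand-in for the page source; the real markup may differ.
page <- read_html('
  <html><body>
    <a href="/data/2024/05/serie-historica.xlsx">Download</a>
    <a href="/data/2024/05/serie-recente.xlsx">Download</a>
  </body></html>')

base_url <- "https://www.portaldaindustria.com.br/estatisticas/indicadores-industriais/"

# Same attribute selector as above: first element whose href contains the substring
href <- page |>
  html_element("[href*='serie-recente']") |>
  html_attr("href")

# html_attr() returns NA when nothing matched, so stop with a clear message
if (is.na(href)) stop("No link containing 'serie-recente' was found")

# Resolve a relative href against the page URL; absolute hrefs pass through unchanged
download_link <- url_absolute(href, base_url)

# Build the destination path from the file name at the end of the link
dest <- file.path(tempdir(), basename(download_link))
```

With the real page you would replace the inline HTML with read_html(base_url) and finish with download.file(download_link, dest, mode = "wb").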