I signed in to a website using R 3.5.2, and this seems to work with both rvest_0.3.4 and httr_1.4.0, but then I get stuck on a redirect page which, in the browser (Chrome), is shown for only a few seconds after I click the "Login!" button.
The problematic step seems to be a <form method="post"> containing <input type="hidden"> fields, which I can't manage to submit from R.

    signin <- "https://www.cdp.net/en/users/sign_in"

    # Attempt with rvest
    library(rvest)
    user.email <- "my_email"
    user.password <- "my_password"
    signin.session <- html_session(signin)
    signin.form <- html_form(signin.session)[[1]]
    filled.signin <- set_values(signin.form,
                                `user[email]` = user.email,
                                `user[password]` = user.password)
    signed.in <- submit_form(signin.session, filled.signin)
    # The page returned after signing in still contains a form
    read_html(signed.in) %>% html_node("form")


    # Attempt with httr
    library(httr)
    login <- list(
      `user[email]` = "my_email",
      `user[password]` = "my_password",
      submit = "Login!")
    signed.in.post <- POST(signin, body = login, encode = "form", verbose())
    http_status(signed.in.post)
    content(signed.in.post, as = "parsed")
    # Note: read_html() on a URL issues a fresh request, without the cookies set by POST()
    read_html(signed.in.post$url) %>% html_node("form")

My goal is to access my account and browse the website, but I don't know how to get past the redirect page from R.
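
To see what such an intermediate page expects, it can help to list the inputs of its form. The following is only a minimal offline sketch: the HTML string, its field names and the example.com URL are made up for illustration, and the real page's markup may differ. On the live site, html_form(signed.in)[[1]] would take the place of the parsed string.

```r
library(rvest)

# Hypothetical stand-in for the intermediate page; the real markup may differ.
redirect.html <- '<html><body>
  <form method="post" action="https://www.example.com/callback">
    <input type="hidden" name="token" value="abc123">
    <input type="hidden" name="state" value="xyz">
  </form>
</body></html>'

# Parse the first form on the page, as done for the sign-in form above.
redirect.form <- html_form(read_html(redirect.html))[[1]]

# One row per input: its name, type and current value.
field.table <- data.frame(
  name  = vapply(redirect.form$fields, function(f) f$name, character(1)),
  type  = vapply(redirect.form$fields, function(f) f$type, character(1)),
  value = vapply(redirect.form$fields, function(f) f$value, character(1)),
  stringsAsFactors = FALSE
)
print(field.table)
```

If every row has type "hidden", the form needs no user input and can be submitted as it is.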
SOLVED!
The solution turned out to be quite easy and intuitive: I just needed to submit the form (method="post", with type="hidden" inputs) of the redirect page, i.e. the one encountered in the signed.in session.
I solved it with rvest, but I think that httr would be equally easy. Here is the code I used:

    library(rvest)
    # Sign in as before
    signin.session <- html_session(signin)
    signin.form <- html_form(signin.session)[[1]]
    filled.signin <- set_values(signin.form,
                                `user[email]` = user.email,
                                `user[password]` = user.password)
    signed.in <- submit_form(signin.session, filled.signin)
    # Submit the hidden form of the redirect page, unchanged
    redirect.form <- html_form(signed.in)[[1]]
    redirected <- submit_form(signed.in, redirect.form)

This last object, redirected, is a session-class object: essentially the page that can be browsed normally after signing in to the website.
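
For comparison, here is how the same two-step idea might look with httr and xml2. This is an untested sketch, not something I ran against the site: form_inputs(), form_target() and sign_in_with_httr() are hypothetical helper names, and it assumes the sign-in page and the redirect page each contain the relevant form first. Unlike a browser, it sends every named input of the form, including any checkboxes or buttons.

```r
library(httr)
library(xml2)

# Name/value pairs of every named <input> in the first form of a parsed page.
form_inputs <- function(page) {
  inputs <- xml_find_all(page, "(//form)[1]//input[@name]")
  values <- xml_attr(inputs, "value", default = "")
  setNames(as.list(values), xml_attr(inputs, "name"))
}

# Absolute URL that the first form of a parsed page submits to.
form_target <- function(page, base.url) {
  action <- xml_attr(xml_find_first(page, "(//form)[1]"), "action")
  if (is.na(action) || action == "") base.url else url_absolute(action, base.url)
}

# Hypothetical wrapper: sign in, then submit the hidden form of the redirect page.
sign_in_with_httr <- function(signin.url, email, password) {
  # httr reuses one handle per host, so cookies carry over between requests.
  signin.page <- GET(signin.url)
  signin.html <- read_html(content(signin.page, as = "text", encoding = "UTF-8"))

  # Start from the form's own inputs so that hidden fields are sent too.
  fields <- form_inputs(signin.html)
  fields[["user[email]"]] <- email
  fields[["user[password]"]] <- password
  signed.in <- POST(form_target(signin.html, signin.page$url),
                    body = fields, encode = "form")

  # The response is the intermediate page: post its hidden inputs unchanged.
  redirect.html <- read_html(content(signed.in, as = "text", encoding = "UTF-8"))
  POST(form_target(redirect.html, signed.in$url),
       body = form_inputs(redirect.html), encode = "form")
}
```

It would be called as sign_in_with_httr(signin, user.email, user.password), and the returned response should correspond to the redirected page above, provided the assumptions hold.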
If someone has a shorter, more effective, or more elegant solution, please don't hesitate to share it.
I'm an absolute beginner at web scraping, and I am keen to learn more about these operations!
THX