I would like to scrape information from a webpage. There is a login screen, and once I am logged in, I can access all kinds of pages from which I would like to scrape information (such as the last name of a player, the object .lastName).
I am using R with the packages rvest and httr.
The login itself seems to work, but I don't know how to get redirected to the page I need to get the info from.
The login form can be accessed at http://kickbase.sky.de/anmelden and the relevant pages have the form http://kickbase.sky.de/spielerprofil/player-name/number, e.g. http://kickbase.sky.de/spielerprofil/nadiem-amiri/1639#.
Here is the code I used. Thank you very much for your help.
install.packages("rvest")
install.packages("httr")
library(rvest)
library(httr)
handle <- handle("http://kickbase.sky.de") # Create handle
path <- "anmelden" # Login Path
# fields found in the login form.
login <- list(
  email = "[email protected]",
  password = "tester",
  # I want to be redirected to this page and then scrape info from here
  redirect_url = "http://kickbase.sky.de/spielerprofil/nadiem-amiri/1639#"
)
response <- POST(handle = handle, path = path, body = login)
webpage <- read_html(response)
name_data <- html_text(html_nodes(webpage, ".lastName"))
name_data
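library(rvest)

url <- "https://kickbase.sky.de/"
page <- html_session(url)  # start a session so cookies persist across requests

# Log in through the site's JSON API (request_POST is an unexported rvest function)
page <- rvest:::request_POST(
  page,
  url = "https://kickbase.sky.de/api/v1/user/login",
  body = list(
    email = "[email protected]",
    password = "tester",
    redirect_url = "http://kickbase.sky.de/spielerprofil/nadiem-amiri/1639#"
  ),
  encode = "json"
)

# Request the player's data from the API within the logged-in session
player_page <- jump_to(page, "https://kickbase.sky.de/api/v1/news?skip=0&player=1639&limit=3")
# The response body is raw bytes; convert it to text, then parse the JSON
data <- jsonlite::fromJSON(rawToChar(player_page$response$content))
print(data)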
Please note that the website provides an API, and that is where the data comes from: https://kickbase.sky.de/api/v1/news?skip=0&player=1639&limit=3
The variable data holds all the information needed.
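
For comparison, here is a minimal sketch of the same idea using only exported httr functions instead of rvest internals. It assumes that the login endpoint sets a session cookie and that the API paths shown above are still valid; httr reuses cookies for requests to the same host within one R session. The helper names news_url and fetch_player_news are made up for illustration, and the credentials in the usage comment are placeholders.

```r
library(httr)
library(jsonlite)

base_url <- "https://kickbase.sky.de"

# Build the news endpoint URL for one player (hypothetical helper;
# the query parameters mirror the URL used in the answer).
news_url <- function(player_id, skip = 0L, limit = 3L) {
  sprintf("%s/api/v1/news?skip=%d&player=%d&limit=%d",
          base_url, as.integer(skip), as.integer(player_id), as.integer(limit))
}

# Log in, then fetch and parse the news JSON for one player.
# Assumption: a successful login sets a cookie that authorizes later requests.
fetch_player_news <- function(email, password, player_id) {
  login <- POST(
    paste0(base_url, "/api/v1/user/login"),
    body = list(email = email, password = password),
    encode = "json"
  )
  stop_for_status(login)  # fail early if the login was rejected

  resp <- GET(news_url(player_id))
  stop_for_status(resp)

  # Read the body as text and parse it as JSON
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))
}

# Usage (placeholder credentials, not run here):
# data <- fetch_player_news("you@example.com", "your-password", 1639)
# str(data)
```

Wrapping the two requests in one function keeps the login and the data request in the same R session, which is what lets the cookie carry over. Whether the endpoint accepts this exact request body is not something the sketch can guarantee.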