This URL requires the following login data:
Benutzername oder E-Mail -> User: testuserscrap@web.de
Passwort -> Password: testuserscrap
(The website is a kind of fantasy football for the German Bundesliga.)
There is an existing post where someone asks for help with the same website.
However, I do not want to retrieve information about individual players but about the actual team. In the browser, these steps are required:
Click on the red circled icon:
This leads to a page where I would like to retrieve all the player names in lists 1 and 2:
That is, I would like output such as:
Diego Contento
Alfred Finnbogason
...
I am not sure which approach is best. According to the referenced post there seems to be an API. However, I cannot access the information with the code I adapted from that post:
library(rvest)
library(jsonlite)
library(httr)
library(plyr)
library(dplyr)
url <- "https://kickbase.sky.de/"
page <- html_session(url)
page <- rvest:::request_POST(page, url = "https://kickbase.sky.de/api/v1/user/login",
                             body = list("email" = "testuserscrap@web.de",
                                         "password" = "testuserscrap",
                                         "redirect_url" = "https://www.kickbase.com/transfermarkt/kader"),
                             encode = 'json'
)
ck <- cookies(page)
player_page <- jump_to(ck$value, "https://api.kickbase.com/leagues/1420282/lineupex")
Unfortunately, I'm not much of an expert in coding or web scraping. I have tried many things but have not found a solution :/ I would therefore be really grateful for any advice or ideas on how to retrieve this information.
Best :)
Wow, this was a tough question, but a very good learning experience for me. To solve it I used the "curlconverter" package, which can be installed from GitHub using the devtools package. See https://github.com/hrbrmstr/curlconverter and other questions/answers posted here on Stack Overflow.
First log in to the web page using your browser and navigate to the page of interest. Using the developer tools, copy the request for the file of interest as cURL from the network tab. The cURL command can probably be stripped of its nonessential parts, but determining which parts are noncritical would take trial and error.
Then use the straighten function, edit the user ID and password (these were not saved with the cURL command), make the request, and then parse the response.
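If you want to try that trial-and-error trimming, a small helper can drop headers from the copied command for you. This is only a sketch: `trim_curl` is a hypothetical function (not part of curlconverter), `<token>` is a placeholder, and the guess that `Authorization` is the only header the API needs is an untested assumption.

```r
# Hypothetical helper: keep the non-header lines (URL, method) and only the named -H headers.
trim_curl <- function(curl_cmd, keep = c("Authorization")) {
  lines <- trimws(strsplit(curl_cmd, "\n", fixed = TRUE)[[1]])
  lines <- sub("\\s*\\\\$", "", lines)               # drop trailing backslash continuations
  is_header <- grepl("^-H ", lines)
  header_name <- sub("^-H '([^:]+):.*$", "\\1", lines)
  keep_line <- !is_header | header_name %in% keep    # assumption: only `keep` headers matter
  paste(lines[keep_line], collapse = " ")
}

# Shortened example command; <token> stands in for the real bearer token.
example <- "curl 'https://api.kickbase.com/leagues/1420282/lineupex'
-XGET
-H 'Accept: */*'
-H 'Authorization: Bearer <token>'"
trimmed <- trim_curl(example)
print(trimmed)
```

Add header names to `keep` one at a time if the trimmed request is rejected.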
#cURL copied from network tab for the requested file
xcurl<-"curl 'https://api.kickbase.com/leagues/1420282/lineupex'
-XGET
-H 'Accept: */*'
-H 'Origin: https://kickbase.com'
-H 'Referer: https://kickbase.com/transfermarkt/kader'
-H 'Accept-Language: en-us'
-H 'Host: api.kickbase.com'
-H 'Authorization: Bearer XU3DGDZBxlHB0sjqG01yLhHihT2AacPeIeWOlY+u3nxz/iokfCjn8a9vaKeKFXwxJpcH/0FXOgGg3J2EfmUUDJ9uwjT+oxHZTGc1EuOxbG0i66fRBBm1RBT0Yd4ACRDQ9BCs8yb+/w9+gOPIyhM2Vio3DZemExATq22osCGeW6VzYmos/3F8MTDbKOAk8NPKQYr5xPSght26ayZ4/X21ag==' \
-H 'Accept-Encoding: br, gzip, deflate'
-H 'Connection: keep-alive'"
#See https://github.com/hrbrmstr/curlconverter, install via devtools
library(curlconverter)
library(httr)   #provides content()
library(dplyr)
#convert the cURL command into a list of request parameters
my_ip <- straighten(xcurl)
#add password and user id
my_ip[[1]]$password <- "testuserscrap"
my_ip[[1]]$username <- "testuserscrap@web.de"
#make_req() returns a list of functions; calling one performs the request
response <- my_ip %>% make_req()
#retrieve the entire file
#jsonfile <- jsonlite::toJSON(content(response[[1]](), as="parsed"), auto_unbox = TRUE, pretty=TRUE)
#retrieve only the player info from file and convert to data frames
dfs <- lapply(content(response[[1]](), as="parsed")$players, data.frame)
#not every player has the same information thus bind_rows instead of rbind
players <- bind_rows(dfs)
players
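To get from that data frame to the plain list of names asked for in the question, something like the following might work. It is a sketch on made-up input: the field names `firstName` and `lastName` (and the dummy `extra` field) are assumptions, so check `names(players)` on the real result and adjust.

```r
library(dplyr)

# Hypothetical stand-in for the parsed player list returned by the request.
# Field names are assumptions, not confirmed API fields.
sample_players <- list(
  list(firstName = "Diego", lastName = "Contento", extra = 1),
  list(firstName = "Alfred", lastName = "Finnbogason")   # fewer fields on purpose
)

# One single-row data frame per player, then a row bind that fills gaps with NA
dfs <- lapply(sample_players, data.frame, stringsAsFactors = FALSE)
players <- bind_rows(dfs)

# Combine first and last name into one string per player
player_names <- paste(players$firstName, players$lastName)
print(player_names)
```

The second record deliberately lacks a field to show why `bind_rows` is used: it pads the missing column with `NA` instead of failing the way `rbind` would.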