I'm trying to scrape the data on all gym locations from https://www.xercise4less.co.uk/find-a-gym/.
In Developer Tools I found a pointer to the Web API URL that should return this information, https://www.xercise4less.co.uk/Umbraco/Api/FindAGymApi/GetAll, but when I open it in the browser I get:
The 'ObjectContent`1' type failed to serialize the response body for content type 'text/xml; charset=utf-8'
I get a similar result if I run the following code:
# user_agent argument is optional here and results are the same whether I include it or not
httr::GET('https://www.xercise4less.co.uk/Umbraco/Api/FindAGymApi/GetAll', httr::user_agent("httr"))
Any ideas on how to go about this?
Alternatively, I can (almost) access all the gym IDs with:
library(rvest)  # provides read_html(), html_nodes() and the %>% pipe
url <- "https://www.xercise4less.co.uk/find-a-gym/"
my_pg <- read_html(url)
my_pg %>% html_nodes('select > option')
But then I'm still not sure about how to iterate over all the IDs in order to get the complete list of coordinates/locations. Thanks for any pointers.
You are pretty much there. You just need to set the request header the server expects (Accept), and then you get the info for all the gyms in a single response:
library(httr)
# Ask the server for JSON explicitly via the Accept header
headers <- c('Accept' = 'application/json, text/javascript, */*; q=0.01')
r <- content(GET(url = 'https://www.xercise4less.co.uk/Umbraco/Api/FindAGymApi/GetAll', add_headers(.headers = headers)))
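If you want a data frame rather than a nested list, one option is to parse the JSON text with jsonlite. This is only a sketch: the actual field names returned by GetAll are not shown here, so `Name`, `Latitude` and `Longitude` in the sample payload below are placeholders I made up for illustration, and you would need to swap in whatever the real response contains.

```r
library(jsonlite)

# Hypothetical payload: the real field names returned by GetAll are not
# known here, so "Name", "Latitude" and "Longitude" are placeholders.
sample_json <- '[
  {"Name": "Gym A", "Latitude": 53.1, "Longitude": -1.5},
  {"Name": "Gym B", "Latitude": 55.9, "Longitude": -3.2}
]'

# fromJSON() simplifies a JSON array of flat objects into a data frame,
# one row per object and one column per field
gyms <- fromJSON(sample_json)
str(gyms)

# With the real response, the equivalent step would be roughly:
# resp <- httr::GET(url, httr::add_headers(Accept = "application/json"))
# gyms <- fromJSON(httr::content(resp, as = "text", encoding = "UTF-8"))
```

If the real records turn out to be nested, `fromJSON(..., flatten = TRUE)` may help spread nested fields into separate columns.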