I am trying to scrape window.PRELOADED_STATE from the following URL using requests, but I can't isolate the element I want so that I can use the json function on it.
I tried the below code first.
import requests
response = requests.get('https://www.racingpost.com/profile/horse/431262/ready-for-action-ii')
I successfully got a response from the server, and when I view the text the request returns I can see the data I want in the HTML, but I can't narrow it down to the window.PRELOADED_STATE element. Once I have that element, I want to use .json() on it to get the data into a dictionary.
Use a regular expression to extract everything on the line between window.PRELOADED_STATE = and the final ;.
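import re, requests, json
response = requests.get('https://www.racingpost.com/profile/horse/431262/ready-for-action-ii')
# Capture everything between "window.PRELOADED_STATE =" and the last ";" on that line
state_match = re.search(r'window\.PRELOADED_STATE\s*=\s*(.*);', response.text)
if state_match:
    # Parse the captured JSON text into a dictionary
    preloaded_state = json.loads(state_match.group(1))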
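If you want to check the pattern without hitting the site, here is a minimal sketch of the same idea run against a made-up HTML string. The helper name extract_preloaded_state and the sample data are hypothetical, and it assumes the whole assignment sits on a single line, as the regex above does.

```python
import json
import re

# Same pattern as above: capture everything between
# "window.PRELOADED_STATE =" and the last ";" on that line.
STATE_RE = re.compile(r'window\.PRELOADED_STATE\s*=\s*(.*);')


def extract_preloaded_state(html):
    """Return the embedded state as a dict, or None if it is not found.

    Hypothetical helper; assumes the assignment is on one line.
    """
    match = STATE_RE.search(html)
    if match is None:
        return None
    return json.loads(match.group(1))


# Made-up sample standing in for response.text; not real site data.
sample_html = '''<html><script>
window.PRELOADED_STATE = {"profile": {"horseName": "Example", "ids": [1, 2]}};
</script></html>'''

state = extract_preloaded_state(sample_html)
print(state["profile"]["horseName"])
```

Note that response.json() parses the entire response body as JSON, so it won't work on an HTML page like this one. That is why the embedded object is pulled out as text first and handed to json.loads.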