I am trying to write an automated PHP script to scrape and extract all 'Job Titles' (Primary Care Physician - Tidewater Market, Primary Care Physician - Richmond Market, etc.) from the URL https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/.
However, this does not seem to be straightforward, because the required data is not directly visible in the source code of the webpage. I also tried inspecting 'Developer Tools->Network' in different browsers, but could not locate the source of the data.
Any help would be highly appreciated.
Thanks & Regards!
Looking at the requests made by the website, one notices an XHR request that contains the data you care about.
However, visiting that URL in a browser gives the same result as navigating to https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/. Investigating further by looking at the request headers, one notices the header
Accept:application/json,application/xml
(which signifies that the client expects a JSON or XML document). Indeed, requesting https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/ with this additional header returns the desired data:
>>> import urllib.request
>>> req = urllib.request.Request('https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/')
>>> req.add_header('Accept', 'application/json,application/xml')
>>> 'Primary Care Physician ' in urllib.request.urlopen(req).read().decode('utf-8')
True
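The same check can be written as a small script. This is only a sketch: the helper names (build_request, find_strings, fetch_matching_strings) are made up for illustration, and it assumes the server answers with JSON when sent this Accept header. Since the layout of the returned document is not shown here, the search walks the whole parsed structure instead of relying on specific keys.

```python
import json
import urllib.request

# URL taken from the question; the helper names below are hypothetical.
JOBS_URL = 'https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/'


def build_request(url):
    """Return a GET request that asks for JSON/XML instead of the HTML page."""
    req = urllib.request.Request(url)
    req.add_header('Accept', 'application/json,application/xml')
    return req


def find_strings(node, needle):
    """Recursively collect every string value in parsed JSON containing `needle`."""
    if isinstance(node, str):
        return [node] if needle in node else []
    if isinstance(node, dict):
        children = node.values()  # only values are searched, not keys
    elif isinstance(node, list):
        children = node
    else:
        return []  # numbers, booleans, None
    found = []
    for child in children:
        found.extend(find_strings(child, needle))
    return found


def fetch_matching_strings(url, needle):
    """Download the document and search it. Assumption: the body is JSON."""
    with urllib.request.urlopen(build_request(url)) as response:
        data = json.loads(response.read().decode('utf-8'))
    return find_strings(data, needle)
```

Calling fetch_matching_strings(JOBS_URL, 'Primary Care Physician') should then return the matching strings, provided the response really is JSON. Once the actual structure of the response is known, the generic search can be replaced by direct key lookups.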
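Therefore in PHP you probably want to send a GET request to https://chenmed.wd1.myworkdayjobs.com/en-US/jencare/ with the header
Accept:application/json,application/xml
(see e.g. How do I send a GET request with a header from PHP?) and then extract the job titles from the response.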