Here is the URL that I'm trying to scrape: https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019319000076/a10-qq320196292019.htm
I'm trying to scrape the webpage using Python, which means I need the XHR request for this page, since it is loaded via JavaScript.
Inspecting the Network tab in Developer Tools, I can see the XHR request a10-qq320196292019.htm, whose request URL is: https://www.sec.gov/Archives/edgar/data/320193/000032019319000076/a10-qq320196292019.htm
My question is two-fold,
In this case, I don't think you need to go that route. The link you're using is an iXBRL view of the actual HTML document. The URL for the HTML document is embedded in that first link, as the value of the doc parameter. All you have to do is extract it:
url = 'https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019319000076/a10-qq320196292019.htm'
html_url = url.replace('/ix?doc=','')
html_url
Output:
'https://www.sec.gov/Archives/edgar/data/320193/000032019319000076/a10-qq320196292019.htm'
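If you'd rather not depend on the exact '/ix?doc=' substring, here is a rough sketch of the same extraction using the standard library's urllib.parse. The helper name extract_doc_url is my own (hypothetical), and it assumes the viewer link always has the path /ix with the document path in a doc query parameter, which is what your example shows but which I haven't checked against other filings:

```python
from urllib.parse import urlsplit, urlunsplit, parse_qs

def extract_doc_url(viewer_url):
    """Return the plain HTML document URL hidden in an iXBRL viewer link.

    Hypothetical helper: assumes the viewer path is '/ix' and the document
    path is carried in the 'doc' query parameter, as in the example link.
    Any other URL is returned unchanged.
    """
    parts = urlsplit(viewer_url)
    doc_paths = parse_qs(parts.query).get("doc")
    if parts.path != "/ix" or not doc_paths:
        return viewer_url
    # Rebuild the URL on the same host, with the doc path and no query string.
    return urlunsplit((parts.scheme, parts.netloc, doc_paths[0], "", ""))

url = 'https://www.sec.gov/ix?doc=/Archives/edgar/data/320193/000032019319000076/a10-qq320196292019.htm'
html_url = extract_doc_url(url)
print(html_url)
```

For the link in the question this gives the same result as the replace() call. The only difference is that a URL that is already a plain document link passes through untouched instead of relying on the substring being present.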