python web web-crawler urllib2

Best package for scraping HTML with Python from a site that needs cookies enabled


I am currently using Python 3.6 to crawl a file of URLs and search each page for a certain string. After running the script, the HTML returned indicated that cookies needed to be enabled. Mechanize and every other library I have found does not support any version of Python 3.x. Can someone point me toward libraries that can enable cookies so that the correct HTML is returned?


Solution

  • You can both retrieve and send cookies with the awesome requests package.

    Sending cookies:

    import requests

    # Cookie names and values must be strings; url is the page being crawled
    cookies = {
        'cookies_are': 'working',
    }

    requests.get(url, cookies=cookies)
    

    Retrieving cookies:

    r = requests.get(url)
    r.cookies  # a RequestsCookieJar, which behaves like a dictionary
    

    For more information, check out the requests documentation. Hope it helps!
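
For the crawl described in the question, the cookie usually has to be accepted from the site's first response and then sent back on later requests, rather than set by hand. Below is a minimal sketch of that idea using only the standard library (`urllib.request` and `http.cookiejar`, the Python 3 successors to `urllib2` and `cookielib`). The helper names `make_cookie_opener` and `find_urls_containing` are made up for this example, and it assumes the site only needs its cookie echoed back. Some sites need more than that, for example JavaScript.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar). The opener stores cookies it receives in jar
    and sends them back automatically on later requests."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def find_urls_containing(urls, needle, opener, timeout=10):
    """Fetch each URL through the shared opener and return the URLs
    whose response body contains needle."""
    matches = []
    for url in urls:
        try:
            with opener.open(url, timeout=timeout) as response:
                # Fall back to UTF-8 when the server does not declare a charset
                charset = response.headers.get_content_charset() or "utf-8"
                html = response.read().decode(charset, errors="replace")
        except OSError as exc:  # URLError and HTTPError are OSError subclasses
            print(f"skipping {url}: {exc}")
            continue
        if needle in html:
            matches.append(url)
    return matches
```

To use it, create the opener once with `opener, jar = make_cookie_opener()` and pass the same `opener` for every URL in the file, so that a cookie set by one response is available to the next request. If the first response for a site is only the "enable cookies" page, requesting that URL a second time through the same opener may return the real HTML. With requests, a `requests.Session()` object plays the same role: cookies set by one response are sent on later requests made through that session.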