How do I pass a user-agent to pandas' pd.read_html()?

Some websites automatically decline requests that lack a user-agent, and it's a hassle to use bs4 to scrape many different types of tables.

This issue was resolved before through this code:

url = ''
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
response = opener.open(url)
tables = pd.read_html(response.read())

However, urllib2 has been deprecated, and urllib3 doesn't have a build_opener() attribute. I could not find an equivalent attribute either, even though I'm sure it has one.


  • read_html() accepts either a URL or a string, so you can set the headers on the request yourself and let pandas read the response as text:

    import pandas as pd
    import requests
    url = ''
    response = requests.get(url, headers={'User-agent': 'Mozilla/5.0'})
    tables = pd.read_html(response.text)

    If you look at the signature of read_html(), none of its options accepts headers as an argument, so just set the headers on the request.
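
For anyone who would rather stay close to the original build_opener() code, here is a minimal sketch using only the standard library. In Python 3, urllib2's build_opener() lives in urllib.request (urllib3 is a separate third-party package). The helper names build_opener_with_user_agent, fetch_html and read_tables are made up for this example, and the StringIO wrapping assumes a newer pandas release, which warns when handed a literal HTML string.

```python
import io
import urllib.request


def build_opener_with_user_agent(user_agent="Mozilla/5.0"):
    """Return a urllib opener that sends the given User-Agent header."""
    opener = urllib.request.build_opener()
    # Replaces the default "Python-urllib/x.y" User-Agent header.
    opener.addheaders = [("User-agent", user_agent)]
    return opener


def fetch_html(url, user_agent="Mozilla/5.0"):
    """Download url and return the response body decoded to text."""
    opener = build_opener_with_user_agent(user_agent)
    with opener.open(url) as response:
        # Use the charset declared in the response, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def read_tables(url, user_agent="Mozilla/5.0"):
    """Fetch url with a custom User-Agent and parse its tables with pandas."""
    import pandas as pd  # imported lazily so fetch_html works without pandas

    # Wrapping the text in StringIO avoids the literal-HTML-string warning
    # in newer pandas and is also accepted by older releases.
    return pd.read_html(io.StringIO(fetch_html(url, user_agent)))
```

Calling read_tables(url) with your own URL should give the same kind of list of DataFrames as the requests version above. The only real difference is which library makes the HTTP request.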