python web-scraping beautifulsoup html-parsing python-requests-html

What does "AttributeError: 'NoneType' object has no attribute 'find_all'" mean in this code?

I am building a fairly simple BeautifulSoup/requests web scraper, but when I run it against a jobs website, the error

AttributeError: 'NoneType' object has no attribute 'find_all'

appears. Here is my code:

import requests
from bs4 import BeautifulSoup

URL = "https://uk.indeed.com/jobs?q&l=Norwich%2C%20Norfolk&vjk=139a4549fe3cc48b"
page = requests.get(URL)

soup = BeautifulSoup(page.content, "html.parser")

results = soup.find(id="ResultsContainer")

job_elements = results.find_all("div", class_="resultContent")

python_jobs = results.find_all("h2", string="Python")

for job_element in job_elements:
    title_element = job_element.find("h2", class_="jobTitle")
    company_element = job_element.find("span", class_="companyName")
    location_element = job_element.find("div", class_="companyLocation")
    print(title_element)
    print(company_element)
    print(location_element)
    print()

Does anyone know what the issue is?

Solution

Check the selector for results: the id should be resultsBody, not ResultsContainer. soup.find() returns None when nothing matches, so every later line that calls a method on results fails, because None has no find_all attribute:

results = soup.find(id="resultsBody")

Also, each job element is a td, not a div:

job_elements = results.find_all("td", class_="resultContent")
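To see why the original line fails, here is a minimal sketch that runs on a small inline HTML snippet instead of the live site. The markup and the find_jobs helper are assumptions made up for illustration; the real page is larger and may change at any time.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (not the site's actual HTML).
html = """
<table id="resultsBody">
  <tr><td class="resultContent"><h2 class="jobTitle">Python Developer</h2></td></tr>
  <tr><td class="resultContent"><h2 class="jobTitle">Data Analyst</h2></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# No element has this id, so find() returns None rather than raising.
missing = soup.find(id="ResultsContainer")

# This id exists, so find() returns a Tag.
found = soup.find(id="resultsBody")


def find_jobs(container):
    """Hypothetical helper: return the job cells, or an empty list if the container is missing."""
    if container is None:
        return []
    return container.find_all("td", class_="resultContent")


print(missing)                   # None
print(len(find_jobs(missing)))   # 0
print(len(find_jobs(found)))     # 2
```

Calling missing.find_all(...) directly would raise the same AttributeError as in the question. Checking for None first makes it obvious when a selector no longer matches the page.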

You could also combine both steps into a single CSS selector:

job_elements = soup.select('#resultsBody td.resultContent')

To get only the entries whose title contains Python:

job_elements = soup.select('#resultsBody td.resultContent:has(h2:-soup-contains("Python"))')
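The selectors above can be tried out offline with a rough sketch like the following. The HTML is again invented for illustration. The :-soup-contains pseudo-class needs a reasonably recent soupsieve (2.1 or later, as far as I know); older versions spell it :contains.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = """
<table id="resultsBody">
  <tr><td class="resultContent"><h2 class="jobTitle">Python Developer</h2></td></tr>
  <tr><td class="resultContent"><h2 class="jobTitle">Data Analyst</h2></td></tr>
  <tr><td class="resultContent"><h2 class="jobTitle">Senior Python Engineer</h2></td></tr>
</table>
"""

soup = BeautifulSoup(html, "html.parser")

# All job cells inside the results container.
all_jobs = soup.select("#resultsBody td.resultContent")

# Only the cells whose <h2> text contains "Python" (case-sensitive substring match).
python_jobs = soup.select(
    '#resultsBody td.resultContent:has(h2:-soup-contains("Python"))'
)
titles = [job.select_one("h2.jobTitle").get_text(strip=True) for job in python_jobs]

# For comparison: string="Python" only matches an <h2> whose text is exactly "Python".
exact_matches = soup.find_all("h2", string="Python")

print(len(all_jobs))
print(titles)
print(exact_matches)
```

Note that find_all("h2", string="Python"), as used in the question, matches exact text only, so it finds nothing here, while the CSS version matches on substrings.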

Example

import requests
from bs4 import BeautifulSoup

URL = "https://uk.indeed.com/jobs?q&l=Norwich%2C%20Norfolk&vjk=139a4549fe3cc48b"
page = requests.get(URL)

soup = BeautifulSoup(page.content, "html.parser")

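# The results container has id="resultsBody"; find() returns None if no element matches
results = soup.find(id="resultsBody")

# Each job entry is a <td>, not a <div>
job_elements = results.find_all("td", class_="resultContent")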

python_jobs = results.find_all("h2", string="Python")

for job_element in job_elements:
    title_element = job_element.find("h2", class_="jobTitle")
    company_element = job_element.find("span", class_="companyName")
    location_element = job_element.find("div", class_="companyLocation")
    print(title_element)
    print(company_element)
    print(location_element)
    print()