
Why do I get AttributeError: 'NoneType' object has no attribute 'attrs'?


import requests
import bs4
import csv
from itertools import zip_longest

laptop = []
laptops_price = []
links = []

url = "https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8"
page = requests.get("https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8")
bs = bs4.BeautifulSoup(page.content, 'html.parser')
laptops = bs.find_all('h3')
laptops_prices = bs.find_all("div", {"class": "prc"})
for l in range(len(laptops)):
    laptop.append(laptops[l].text)
    links.append(laptops[l].find("a", {"class" : "core"}).attrs['href'])
    laptops_price.append(laptops_prices[l].text)


laptops_list = [laptop, laptops_price, links]
exported = zip_longest(*laptops_list)
with open(r"C:\Users\Administrator\Desktop\jumiawep.csv", "w", encoding="utf-8") as jumialaptops:
    write = csv.writer(jumialaptops)
    write.writerow(["Laptop", "Price", "Links"])
    write.writerows(exported)
Traceback (most recent call last):
  File "C:\Users\Administrator\PycharmProjects\pythonProject\main.py", line 17, in <module>
    links.append(laptops[l].find("a").attrs['href'])
AttributeError: 'NoneType' object has no attribute 'attrs'

I tried to get a list of links while scraping, but I get this error.


Solution

  • There are a few different issues, in my opinion:

    • the website is protected by Cloudflare; I am not able to request it from my location

    Cloudflare is a global network designed to make everything you connect to the Internet secure, private, fast, and reliable. Secure your websites, APIs, and Internet applications. Protect corporate networks, employees, and devices. Write and deploy code that runs on the network edge.

    • the <h3> does not contain a child <a> for find() to locate; instead, the <h3> is itself a child of the <a>

    • avoid the bunch of separate lists and process your scraping in one go.

    Example

    If you are not blocked by Cloudflare and the content is not rendered dynamically by JavaScript, this should give you the expected result.

    import requests, csv
    from bs4 import BeautifulSoup
    
    url = "https://www.jumia.com.eg/ar/catalog/?q=%D9%84%D8%A7%D8%A8%D8%AA%D9%88%D8%A8"
    soup = BeautifulSoup(requests.get(url).content, 'html.parser')
    
    with open(r"jumiawep.csv", "w", newline="", encoding="utf-8") as jumialaptops:
        write = csv.writer(jumialaptops)
        write.writerow(["Laptop", "Price", "Links"])
    
        for e in soup.select('article'):
            write.writerow([
                e.h3.text,
                e.select_one('.prc').text,
                e.a.get('href')
            ])
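    To see why the original loop raises the AttributeError, here is a minimal, offline sketch. The inline markup is an assumption modeled on the product-card structure described above (an <h3> nested inside an <a class="core">); the point is that find() only searches downward, so searching from the <h3> returns None, while find_parent() searches upward and reaches the <a>.

    ```python
    from bs4 import BeautifulSoup

    # Hypothetical markup mimicking one product card: the <h3> sits
    # *inside* the <a class="core">, not the other way around.
    html = """
    <article>
      <a class="core" href="/laptop-1">
        <h3 class="name">Laptop 1</h3>
        <div class="prc">100 EGP</div>
      </a>
    </article>
    """
    soup = BeautifulSoup(html, "html.parser")
    h3 = soup.find("h3")

    # Searching *down* from <h3> finds nothing -- there is no <a>
    # inside it, so find() returns None and .attrs then raises
    # AttributeError, exactly as in the question.
    print(h3.find("a"))                 # None

    # Searching *up* works, because <h3> is a descendant of <a>.
    print(h3.find_parent("a")["href"])  # /laptop-1
    ```

    This is also why the answer's code iterates over each <article> and reads e.h3 and e.a from the card itself, instead of trying to find() an <a> below the <h3>.
    
    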