beautifulsoup scrapy web-crawler mechanize

python scraping error AttributeError: 'NoneType' object has no attribute 'text'


I am scraping a website with Python and Beautiful Soup. The page has 28 containers, each with a title, a link and some text; the text is in a <p> tag. I can crawl all the data, but some <p> tags have no text, so I get the error "AttributeError: 'NoneType' object has no attribute 'text'". My code is:

containers = page_soup.findAll("div", {"class":"item-container"})

for contain in containers:
    title = contain.div.a.h3.text
    print("title: "+title)

    link = contain.div.a["href"]
    print("source: "+link)

    des = contain.div.p.text
    print("Description: "+des)

It prints the <p> tag text 5 times and then raises the error, because not all of the <p> tags have text. How can I resolve this?


Solution

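  • contain.div.p evaluates to None when a container has no <p> tag, and reading .text on None raises the AttributeError. Catching that error and falling back to an empty string lets the loop continue: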

    for contain in page_soup.find_all("div", {"class": "item-container"}):
        title = contain.div.a.h3.text
        link = contain.div.a["href"]
        try:
            # contain.div.p is None when the container has no <p> tag
            des = contain.div.p.text
        except AttributeError:
            des = ""
        print("title: {}\nlink: {}\ndescription: {}\n".format(title, link, des))
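
If you would rather not rely on exception handling, the same result can be had by checking for None before reading .text. The following is only a minimal sketch: SAMPLE_HTML and extract_items are made-up names for illustration, and the markup is a guess at what the real page looks like (the second container deliberately has no <p> tag).

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup standing in for the real page (an assumption);
# the second container has no <p> tag at all.
SAMPLE_HTML = """
<div class="item-container">
  <div><a href="/first"><h3>First</h3></a><p>Some description</p></div>
</div>
<div class="item-container">
  <div><a href="/second"><h3>Second</h3></a></div>
</div>
"""


def extract_items(html):
    """Return a list of (title, link, description) tuples."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for contain in soup.find_all("div", {"class": "item-container"}):
        title = contain.div.a.h3.text
        link = contain.div.a["href"]
        p_tag = contain.div.p  # None when the container has no <p> tag
        des = p_tag.text if p_tag is not None else ""
        items.append((title, link, des))
    return items


for title, link, des in extract_items(SAMPLE_HTML):
    print("title: {}\nlink: {}\ndescription: {}\n".format(title, link, des))
```

Compared with try/except, the explicit check only covers the missing <p> case, so an unexpected problem elsewhere in the container (for example a missing <a>) still surfaces as an error instead of being silently swallowed.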