python web-scraping beautifulsoup nonetype

.text issue when scraping Google Flights

Here's my code:

from bs4 import BeautifulSoup
import requests
import time

html_text = requests.get('https://www.google.com/travel/flights/search?tfs=CBwQAhoeagcIARIDSkZLEgoyMDIzLTA3LTAzcgcIARIDQU1TGh5qBwgBEgNBTVMSCjIwMjMtMDctMTNyBwgBEgNKRktwAYIBCwj___________8BQAFIAZgBAQ').text
soup = BeautifulSoup(html_text, 'lxml')
flights = soup.find_all('li', class_ = 'pIav2d')
for flight in flights:
    czas = flight.find('span', class_ = 'mv1WYe').text
    stops = flight.find('div', class_ = 'EfT7Ae AdWm1c tPgKwe').text
    cheapP = flight.find('div', class_ = 'YMlIz FpEdX jLMuyc').text
    Reg_price = flight.find('div', class_ = 'YMlIz FpEdX').text

    print(f'''
    Time: {czas}
    Stops: {stops}
    Cheapest: {cheapP}
    Regular Price: {Reg_price}
    ''')

The problem is the line Reg_price = flight.find('div', class_ = 'YMlIz FpEdX').text. When I add .text at the end, I get the error: 'NoneType' object has no attribute 'text'

I know it is the correct identifier because I'm using XPath Helper, and when I run //div[@class='YMlIz FpEdX'] there, I get the correct results.

Edit: I figured out what the problem is. I need to write some sort of condition or if statement, and I need help with that.

Basically, each item in flights = soup.find_all('li', class_ = 'pIav2d') sometimes contains the cheapP and Reg_price elements and sometimes doesn't, so I need the code to print None when a value is missing and still run to the end. What should my if statement look like?

Solution

Not every flight entry contains the 'Reg_price' element (or the 'cheapP' one). When it is missing, find() returns None, and calling .text on None raises AttributeError. Wrap these two lookups in try/except blocks as follows:

from bs4 import BeautifulSoup
import requests

html_text = requests.get('https://www.google.com/travel/flights/search?tfs=CBwQAhoeagcIARIDSkZLEgoyMDIzLTA3LTAzcgcIARIDQU1TGh5qBwgBEgNBTVMSCjIwMjMtMDctMTNyBwgBEgNKRktwAYIBCwj___________8BQAFIAZgBAQ').text
soup = BeautifulSoup(html_text, 'lxml')
flights = soup.find_all('li', class_ = 'pIav2d')
for flight in flights:
    czas = flight.find('span', class_ = 'mv1WYe').text
    stops = flight.find('div', class_ = 'EfT7Ae AdWm1c tPgKwe').text
    try:
        cheapP = flight.find('div', class_ = 'YMlIz FpEdX jLMuyc').text
    except AttributeError:
        cheapP = None
    try:
        Reg_price = flight.find('div', class_ = 'YMlIz FpEdX').text
    except AttributeError:
        Reg_price = None

    print(f'''
    Time: {czas}
    Stops: {stops}
    Cheapest: {cheapP}
    Regular Price: {Reg_price}
    ''')
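
Since the question asked for an if statement, here is a minimal sketch of that alternative: check whether find() returned None before reading the text. The helper name text_or_none and the sample markup are made up for illustration; only the class names come from the question. It uses the built-in 'html.parser' so it runs without lxml, and it does not fetch the live page.

```python
from bs4 import BeautifulSoup

# Made-up markup for illustration; only the class names come from the question.
SAMPLE_HTML = """
<ul>
  <li class="pIav2d">
    <span class="mv1WYe">10:00 AM</span>
    <div class="YMlIz FpEdX">$500</div>
  </li>
  <li class="pIav2d">
    <span class="mv1WYe">1:30 PM</span>
  </li>
</ul>
"""


def text_or_none(parent, name, class_):
    """Return the stripped text of the first matching tag, or None if absent.

    Hypothetical helper, not part of BeautifulSoup.
    """
    tag = parent.find(name, class_=class_)
    if tag is None:  # find() returns None when nothing matches
        return None
    return tag.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

rows = []
for flight in soup.find_all("li", class_="pIav2d"):
    czas = text_or_none(flight, "span", "mv1WYe")
    reg_price = text_or_none(flight, "div", "YMlIz FpEdX")
    rows.append((czas, reg_price))
    print(f"Time: {czas}, Regular Price: {reg_price}")
```

The second entry has no price element, so its price comes back as None and the loop still runs to the end. Compared with try/except, the explicit check only handles the "element missing" case and will not hide an unrelated AttributeError elsewhere in the line.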