python selenium-webdriver web-scraping beautifulsoup css-selectors

How to find HTML elements by multiple tags with Selenium

I need to scrape data from a webpage with Selenium. I need to find these elements:

<div class="content-left">
    <ul></ul>
    <ul></ul>
    <p></p>
    <ul></ul>
    <p></p>
    <ul></ul>
    <p></p>
    <ul>
        <li></li>
        <li></li>
    </ul>
    <p></p>
</div>

As you can see, the <p> and <ul> tags have no classes, and I don't know how to get them in document order.

I used Beautiful Soup before:

allP = bs.find('div', attrs={"class":"content-left"})
txt = ""
for p in allP.find_all(['p', 'li']):
    ...  # loop body not shown in the post

But it no longer works (requests now gets a 403 error), so I need to find these elements with Selenium.

Solution

To extract the text from only the <p> and <li> tags, you can use Beautiful Soup as follows:

from bs4 import BeautifulSoup

html_text = '''
<div class="content-left">
    <ul>1</ul>
    <ul>2</ul>
    <p>3</p>
    <ul>4</ul>
    <p>5</p>
    <ul>6</ul>
    <p>7</p>
    <ul>
        <li>8</li>
        <li>9</li>
    </ul>
    <p>10</p>
</div>
'''
soup = BeautifulSoup(html_text, 'html.parser')
parent_element = soup.find("div", {"class": "content-left"})
for element in parent_element.find_all(['p', 'li']):
    print(element.text)

Console output:

3
5
7
8
9
10

Using Selenium

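With Selenium you can use a list comprehension over find_elements(). A comma-separated CSS selector matches both tags, and the elements come back in document order: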

Using CSS_SELECTOR:

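from selenium.webdriver.common.by import By

# driver is assumed to be an already-initialized WebDriver with the page loaded
print([my_elem.text for my_elem in driver.find_elements(By.CSS_SELECTOR, "div.content-left p, div.content-left li")])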
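
Since the Beautiful Soup code in the question only stopped working because requests received a 403, another option is to let Selenium load the page and hand driver.page_source to Beautiful Soup, so the existing parsing logic can stay as it is. The following is a minimal sketch of that idea, not a definitive implementation. It assumes Chrome and a matching driver are available, and texts_in_order and scrape are hypothetical helper names, not part of either library:

```python
from bs4 import BeautifulSoup


def texts_in_order(html, parent_class="content-left", tags=("p", "li")):
    """Return the text of each matching tag inside the parent div, in document order."""
    soup = BeautifulSoup(html, "html.parser")
    parent = soup.find("div", class_=parent_class)
    if parent is None:
        # The div is missing, e.g. the page has not finished rendering yet.
        return []
    return [el.get_text(strip=True) for el in parent.find_all(list(tags))]


def scrape(url):
    """Load the page in a real browser, then parse the rendered HTML."""
    # Imported here so texts_in_order() can be used without Selenium installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Chrome is installed locally
    try:
        driver.get(url)
        return texts_in_order(driver.page_source)
    finally:
        driver.quit()  # always close the browser, even if parsing fails
```

Calling scrape() with the page URL would return the <p> and <li> texts as a list. If the content is rendered by JavaScript after the initial load, an explicit wait before reading page_source may be needed.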