Tags: python, selenium, iteration, screen-scraping

How to loop through the body of a website using Python and Selenium


First off, my Python knowledge is extremely rudimentary, so apologies if what I'm asking is really stupid, but here goes.

I'm trying to use Selenium to read through boards (particularly /biz/'s catalog on 4chan), track keywords for projects I'm invested in, and get notified when there's a thread discussing one of them.

So far I've managed to open the page and locate the element I want to search through, using:

from selenium import webdriver

# Raw string so the backslashes in the Windows path are not treated as escapes
PATH = r"C:\Program Files (x86)\chromedriver.exe"
driver  = webdriver.Chrome(PATH)

driver.get('https://boards.4channel.org/biz/catalog')

threads = driver.find_element_by_id('threads').text

print(threads)
driver.quit()

This successfully prints all of the threads as text, but now I want to iterate through them and return only the lines that contain the keywords "NFY" or "CORX". I've been testing with the keyword "DOGE", because mine are rarely mentioned. What's the best way to iterate through this text and return only the lines containing my keywords?


Solution

  • If you want to print each thread that mentions one of the keywords, this should work.

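    # Selenium 3 API, as in the question; on Selenium 4.3+ use
    # driver.find_elements(By.XPATH, ...) with
    # "from selenium.webdriver.common.by import By".
    threads = driver.find_elements_by_xpath("Path to individual threads")

    searchText = ["DOGE", "NFY", "CORX"]

    for thread in threads:
        # Lowercase both sides so the match is case-insensitive.
        text = thread.text
        if any(keyword.lower() in text.lower() for keyword in searchText):
            print(f"Thread: {text}")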
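
If you would rather keep the single block of text you already have and filter it line by line, as the question asks, something like the following might do. It is only a sketch: the function name `lines_with_keywords` and the sample text are made up for illustration, and it assumes each thread's text you care about sits on its own line of the `.text` output.

```python
def lines_with_keywords(text, keywords):
    """Return the lines of `text` that mention any keyword, ignoring case."""
    lowered = [k.lower() for k in keywords]
    matches = []
    for line in text.splitlines():
        # Plain substring check on the lowercased line.
        if any(k in line.lower() for k in lowered):
            matches.append(line)
    return matches


# Made-up sample standing in for the text scraped from the 'threads' element.
sample = "DOGE to the moon\nWhat is a bond?\nnfy staking thread\nCORX general"
for line in lines_with_keywords(sample, ["DOGE", "NFY", "CORX"]):
    print(line)
```

Note that this is a substring match, so "DOGE" would also match a line containing "dogecoin". If you only want whole words, a regular expression with word boundaries may be a better fit.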