
Failing to get attribute value in Python


I'm trying to code a scraper for a website. So far I've been able to scrape the general information I need, but the specific attribute value I'm trying to pull out of that information comes back as None, even though there are clearly values there. Everything works fine until I use getattr on each container in containers to look up data-id. Maybe there's a better way to do this, but I'm having a hard time understanding why it can't find the value.

This is what my code looks like.

from selenium import webdriver
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup as soup
from selenium.webdriver.common.action_chains import ActionChains

url = "http://csgo.exchange/id/76561197999004010#x"

driver = webdriver.Firefox()

driver.get(url)
import time
time.sleep(10)
html = driver.page_source
soup = soup(html, "html.parser")


containers = soup.findAll("div",{"class":"vItem"})
print(len(containers))

for container in containers:
    test = getattr(container, "data-id")

    print(str(test))


with open('scraped.txt', 'w', encoding="utf-8") as file:
    file.write(str(containers))

Here's a sample of what each container looks like.

    <div class="vItem Normal Container cItem" data-bestquality="0" data-category="Normal" data-collection="The Spectrum Collection" data-currency="0" data-custom="" data-exterior="" data-hashname="Spectrum%20Case" data-id="15631554103">
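The behavior can be reproduced without Selenium at all. On a BeautifulSoup Tag, plain attribute access (which is what getattr() falls back to) is treated as a search for a *child tag* with that name, not a lookup of the HTML attribute, so it returns None here. A minimal sketch, using a stand-in for one container from the page:

```python
from bs4 import BeautifulSoup

# Minimal stand-in for one container from the page above
html = '<div class="vItem" data-id="15631554103"></div>'
tag = BeautifulSoup(html, "html.parser").find("div")

# getattr() on a Tag is a child-tag search: there is no <data-id> element,
# so this returns None rather than the HTML attribute value.
print(getattr(tag, "data-id"))   # None

# HTML attributes live in the .attrs dict / item access instead:
print(tag["data-id"])            # 15631554103
print(tag.attrs["data-id"])      # 15631554103
```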


Solution

  • Just change the line with getattr() to container.attrs["data-id"]. That works for me. Note, though, that the 10-second time.sleep() wasn't long enough in most of my attempts.

    from selenium import webdriver  # missing in the original snippet; needed for webdriver.Firefox()
    from bs4 import BeautifulSoup as soup
    from selenium.webdriver.common.action_chains import ActionChains
    
    url = "http://csgo.exchange/id/76561197999004010#x"
    
    driver = webdriver.Firefox()
    
    driver.get(url)
    import time
    time.sleep(10)
    html = driver.page_source
    soup = soup(html, "html.parser")
    
    
    containers = soup.findAll("div",{"class":"vItem"})
    print(len(containers))
    data_ids = [] # Make a list to hold the data-id's
    
    for container in containers:
        test = container.attrs["data-id"]
        data_ids.append(test) # add data-id to the list
    
        print(str(test))
    
    
    with open('scraped.txt', 'w', encoding="utf-8") as file:
        for data_id in data_ids:  # renamed from `id`, which shadows the built-in
            file.write(str(data_id) + '\n') # write each data-id on its own line