
Put data from a webpage into a list (splinter)


I am writing a small bot that should collect information from a website (eBay) and put it into a list, using splinter and Python. My first lines of code:

from splinter import Browser

with Browser() as browser:
    url = "http://www.ebay.com"
    browser.visit(url)
    browser.fill('_nkw', 'levis')
    button = browser.find_by_id('gh-btn')
    button.click()

How can I put the information shown in the red frame on the eBay results page into a list?

For example: [["Levi Strauss & Co. 513 Slim Straight Jean Ivory Men's SZ", 12.99, 0], ["Levi 501 Jeans Mens Original Levi's Strauss Denim Straight", 71.44, "Now"], ["Levis 501 Button Fly Jeans Shrink To Fit Many Sizes", [29.99, 39.99]]]


Solution

  • This is not a perfect answer, but it should work. First, install these two modules, requests and BeautifulSoup4 (bs4):

    pip install requests

    pip install beautifulsoup4

    import requests
    from bs4 import BeautifulSoup
    
    #setting up the headers
    headers={
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8',
    'Referer': 'https://www.ebay.com/',
    'Accept-Encoding': 'gzip, deflate',  # 'br' left out: requests only decodes it when a brotli package is installed
    'Accept-Language': 'en-US,en;q=0.8',
    'Host': 'www.ebay.com',
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    }
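    # Optional: route traffic through a local debugging proxy listening on port 8888.
    # Set proxy=None (and drop verify=False below) if you are not running one.
    proxy={
    'https':'127.0.0.1:8888'
    }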
    
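    # search term
    search_term='armani'
    
    # start a session so cookies persist between requests
    ses=requests.Session()
    
    # fetch the home page first so the session picks up eBay's cookies
    # (verify=False disables TLS certificate checks, which an intercepting proxy typically requires)
    resp=ses.get('https://www.ebay.com/',headers=headers,proxies=proxy,verify=False)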
    
    # next, fetch the search results page for the search term
    resp=ses.get('https://www.ebay.com/sch/i.html?_from=R40&_trksid=p2374313.m570.l1313.TR12.TRC2.A0.H0.X'+search_term+'.TRS0&_nkw='+search_term+'&_sacat=0',
    headers=headers,proxies=proxy,verify=False)
    
    
    soup = BeautifulSoup(resp.text, 'html.parser')
    items=soup.find_all('a', { "class" : "vip" })
    price_items=soup.find_all('span', { "class" : "amt" })
    
    final_list=list()
    
    for item,price in zip(items,price_items):
        try:
            title=item.getText()
            price_val=price.find('span',{"class":"bold"}).getText()
            final_list.append((title,price_val))
        except AttributeError:
            # no bold price span inside this element, so skip the listing
            continue
        
    print(final_list)
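
The search URL in the script is built by string concatenation, which breaks for search terms that contain spaces or characters such as `&`. A minimal sketch of building the query string with the standard library instead. `build_search_url` is a hypothetical helper name, and the sketch keeps only the `_nkw` and `_sacat` parameters from the original URL, on the assumption that the tracking parameters (`_from`, `_trksid`) are not required:

```python
from urllib.parse import urlencode

EBAY_SEARCH = "https://www.ebay.com/sch/i.html"

def build_search_url(search_term, category=0):
    """Return an eBay search URL with the term percent-encoded."""
    # _nkw is the keyword field (the same name the question fills in with splinter);
    # _sacat=0 is copied from the URL used in the script.
    query = urlencode({"_nkw": search_term, "_sacat": category})
    return f"{EBAY_SEARCH}?{query}"

print(build_search_url("levis 501"))
# prints https://www.ebay.com/sch/i.html?_nkw=levis+501&_sacat=0
```

With requests, the same encoding is applied if the dictionary is passed as `params=` to `ses.get`, so the helper is only needed when the URL itself has to be assembled by hand.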
    

    Running the script prints a list of (title, price) tuples, one for each search result whose price could be read.
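
The scraped prices are strings, while the question asks for numbers, with a two-element list for a price range. A rough sketch of that conversion follows. `parse_price` is a hypothetical helper, and the input formats (`"$12.99"` and `"$29.99 to $39.99"`) are assumptions about how the price text appears on the page. The third element from the question (bid count or "Now") is not handled here:

```python
import re

# Matches a number such as 12.99 or 1,299.00
_NUMBER = re.compile(r"\d[\d,]*(?:\.\d+)?")

def parse_price(text):
    """Convert a price string to a float, or to [low, high] for a range.

    Returns None when the text contains no number.
    """
    values = [float(m.replace(",", "")) for m in _NUMBER.findall(text)]
    if not values:
        return None
    if len(values) == 1:
        return values[0]
    return [values[0], values[-1]]

# Sample data in the shape produced by the script: (title, price string)
scraped = [
    ("Levi Strauss & Co. 513 Slim Straight Jean Ivory Men's SZ", "$12.99"),
    ("Levis 501 Button Fly Jeans Shrink To Fit Many Sizes", "$29.99 to $39.99"),
]
result = [[title, parse_price(price)] for title, price in scraped]
print(result)
```

Applied to `final_list`, the same list comprehension yields entries in the nested-list shape the question describes.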