Tags: python, beautifulsoup, python-requests, web-mining

Scraping data from a dynamic ecommerce webpage


I'm trying to scrape the titles of all the products listed on a page of an e-commerce site (in this case, Flipkart). The products to scrape depend on the keyword entered by the user. A typical URL generated for the keyword 'XYZXYZ' would be:

http://www.flipkart.com/search?q=XYZXYZ&as=off&as-show=on&otracker=start
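
A keyword containing spaces or characters such as & would break a URL assembled by plain concatenation. A minimal sketch of filling in the template with the standard library, assuming Python 3; build_search_url is a hypothetical helper and is not part of the original script:

```python
from urllib.parse import urlencode

BASE_URL = "http://www.flipkart.com/search"

def build_search_url(keyword):
    """Fill in the search template, percent-encoding the keyword."""
    params = [
        ("q", keyword),
        ("as", "off"),
        ("as-show", "on"),
        ("otracker", "start"),
    ]
    return BASE_URL + "?" + urlencode(params)

print(build_search_url("XYZXYZ"))
# http://www.flipkart.com/search?q=XYZXYZ&as=off&as-show=on&otracker=start
print(build_search_url("think python"))
# http://www.flipkart.com/search?q=think+python&as=off&as-show=on&otracker=start
```

requests can do the same encoding itself if the parameters are passed as requests.get(BASE_URL, params=params).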

Using this link as a template, I wrote the following script to scrape the titles of all the products listed for a given keyword:

import requests
from bs4 import BeautifulSoup

def flipp(k):
    url = "http://www.flipkart.com/search?q=" + str(k) + "&as=off&as-show=on&otracker=start"
    ss = requests.get(url)
    src = ss.text
    obj = BeautifulSoup(src)
    for e in obj.findAll("a", {'class' : 'lu-title'}):
        title = e.string
        print unicode(title)

h = raw_input("Enter a keyword:")
print flipp(h)

However, the script prints only None. When I debugged it step by step, I found that the requests module does not seem to retrieve the source of the page. What is happening here?
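
The None itself can be reproduced without any network access. A function with no return statement returns None, so printing the result of a call to flipp prints None after whatever the loop printed; if the loop matched nothing, None is the only output. A minimal sketch in Python 3 syntax, where print_titles and collect_titles are hypothetical stand-ins for flipp:

```python
def print_titles(titles):
    """Print each title; like flipp(), this function has no return statement."""
    for title in titles:
        print(title)

def collect_titles(titles):
    """Return the cleaned titles instead of printing them."""
    return [title.strip() for title in titles]

result = print_titles([])  # nothing matched, so the loop prints nothing
print(result)              # None
print(collect_titles(["  Think Python  "]))  # ['Think Python']
```

Calling the function without print, or returning a list as collect_titles does, removes the stray None. It does not explain why no titles were matched in the first place.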


Solution

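  • This does the trick: match the class with a regular expression instead of the exact name lu-title, and read e.text instead of e.string.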

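    import requests
    from bs4 import BeautifulSoup
    import re

    def flipp(k):
        url = "http://www.flipkart.com/search?q=" + str(k) + "&as=off&as-show=on&otracker=start"
        ss = requests.get(url)
        src = ss.text
        obj = BeautifulSoup(src)
        # Match any <a> whose class contains "-title", not only "lu-title".
        for e in obj.find_all("a", class_=re.compile("-title")):
            # .text joins all nested strings; .string is None when a tag has several children.
            title = e.text
            print title.strip()

    h = raw_input("Enter a keyword:") # I used 'Python' here
    # flipp() has no return statement, so this also prints a trailing None.
    print flipp(h)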
    
    Out[1]:
    Think Python (English) (Paperback)
    Learning Python (English) 5th  Edition (Hardcover)
    Python in Easy Steps : Makes Programming Fun ! (English) 1st Edition (Paperback)
    Python : The Complete Reference (English) (Paperback)
    Natural Language Processing with Python (English) 1st Edition (Paperback)
    Head First Programming: A learner's guide to programming using the Python language (English) 1st  Edition (Paperback)
    Beginning Python (English) (Paperback)
    Programming Python (English) 4Th Edition (Hardcover)
    Computer Science with Python Language Made Simple - (Class XI) (English) (Paperback)
    HEAD FIRST PYTHON (English) (Paperback)
    Raspberry Pi User Guide (English) (Paperback)
    Core Python Applications Programming (English) 3rd  Edition (Paperback)
    Write Your First Program (English) (Paperback)
    Programming Computer Vision with Python (English) 1st Edition (Paperback)
    An Introduction to Python (English) (Paperback)
    Fundamentals of Python: Data Structures (English) (Paperback)
    Think Complexity (English) (Paperback)
    Foundations of Python Network Programming: The comprehensive guide to building network applications with Python (English) 2nd Edition (Soft Cover)
    Python Programming for the Absolute Beginner (English) (Paperback)
    EXPERT PYTHON PROGRAMMING BEST PRACTICES FOR DESIGNING,CODING & DISTRIBUTING YOUR PYTHON 1st Edition (Paperback)
    None
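
The filter class_=re.compile("-title") works because Beautiful Soup tests class names with the pattern's search() method, so any class that contains -title is accepted. The effect can be checked with re alone. In this sketch only lu-title comes from the question; the other class names are made up for illustration, and the stricter pattern is an assumption rather than part of the answer:

```python
import re

# Pattern used in the answer: matches "-title" anywhere in the class name.
loose = re.compile("-title")
# Stricter variant (an assumption, not from the answer): the name must end in "-title".
strict = re.compile(r"-title$")

# Only "lu-title" appears in the question; the rest are hypothetical.
class_names = ["lu-title", "xx-title", "xx-title-wrap", "price"]

loose_hits = [name for name in class_names if loose.search(name)]
strict_hits = [name for name in class_names if strict.search(name)]

print(loose_hits)   # ['lu-title', 'xx-title', 'xx-title-wrap']
print(strict_hits)  # ['lu-title', 'xx-title']
```

If the loose pattern picks up links that are not product titles, anchoring it as in strict is one way to narrow the match.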