python web python-requests screen-scraping analysis

Python: Scrape Game Names


I am having trouble scraping the names of games off a web page: the scrape returns an empty list. Once the names are scraped, I want them written to a newly created text file. My code is below. It is nowhere near complete, and I suspect I will need a while loop.

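import requests
from lxml import html

def ScrapeK10():
    siteToScrape = 'http://www.kiz10.com/new-games'
    print('\n[!] Requesting Kiz10..')
    kizReq = requests.get(siteToScrape)
    print('\n[!] Scraping Newest Games...')
    # Parse the response body into an lxml element tree
    kizTree = html.fromstring(kizReq.content)
    # Text of every <strong class="bx-caption"> element; this is what comes back empty
    kizElement = kizTree.xpath('//strong[@class="bx-caption"]/text()')
    print('Latest Games : ', kizElement, '\n')
    return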

The problem I'm running into is that I get an empty list, so I'm not sure whether I'm scraping the site correctly or even using the correct XPath.

I'm still a little new to this. I don't want to use Beautiful Soup, nor do I want to use Scrapy.

My goal is to scrape all the game names on the page I gave and write them to a new file.


Solution

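  • Can you use regex? Notice that all the game names are contained in a JavaScript variable named 'itemsGame' in the page source, which is likely why the XPath finds nothing in the HTML itself.

    Use one regex to pull that array out of the page source, then a second regex to split it into its individual entries.

    This should do it: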

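    def main():
        import re
        import requests
        url = "http://kiz10.com/index.php?page=newgames"
        # .text gives a str, so the str patterns below also work on Python 3
        raw = requests.get(url).text
        # Capture everything between the brackets of `var itemsGame = [...];`
        match = re.search(r"var itemsGame = \[(.*?)\];$", raw, re.M)
        # Each inner [...] is one game; the name is its fourth comma-separated field
        for line in re.findall(r'\[(.*?)\]', match.group(1)):
            print(line.replace("'", "").split(",")[3].strip())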
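
The snippet above only prints the names, while the question also asks for them to be written to a new text file. A minimal sketch of that step follows, reusing the same two regular expressions. It assumes the page still embeds `var itemsGame = [[...], ...];` with the name as the fourth field of each entry, as this answer describes. The names `extract_names`, `write_names`, `kiz10_games.txt` and the sample string are hypothetical, made up for illustration.

```python
import re


def extract_names(page_text):
    """Return the fourth field of every entry in the itemsGame array."""
    match = re.search(r"var itemsGame = \[(.*?)\];$", page_text, re.M)
    if match is None:
        return []  # array not found; the page layout is not what we assumed
    names = []
    for entry in re.findall(r"\[(.*?)\]", match.group(1)):
        fields = entry.replace("'", "").split(",")
        if len(fields) > 3:
            names.append(fields[3].strip())
    return names


def write_names(names, path):
    """Write one game name per line to a new text file."""
    with open(path, "w", encoding="utf-8") as handle:
        for name in names:
            handle.write(name + "\n")


# Made-up sample standing in for the downloaded page source (an assumption).
sample = "var itemsGame = [['1', 'a.jpg', '/game-a', 'Game A'], ['2', 'b.jpg', '/game-b', 'Game B']];\n"
names = extract_names(sample)
write_names(names, "kiz10_games.txt")  # creates the file in the current directory
```

With a real page, `sample` would be replaced by `requests.get(url).text`. Returning an empty list when the array is missing avoids an AttributeError on `match.group(1)` if the site changes its markup.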
    

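    Alternatively, you could call eval() on the text that runs from 'var itemsGame = ' to the next \n character (with the trailing semicolon stripped).

    That said, eval() executes whatever code the string contains, so running it on text fetched from a web page is risky and not recommended.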
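
If evaluating the array literal is still appealing, `ast.literal_eval` from the standard library is a safer stand-in for eval(): it accepts only Python literals and raises an error on anything else. The sketch below assumes the JavaScript array also happens to be valid Python literal syntax (quoted strings and numbers only), which may not hold for the real page. The name `parse_items_game` and the sample line are hypothetical.

```python
import ast


def parse_items_game(page_text):
    """Parse the itemsGame array literal without executing any code."""
    marker = "var itemsGame = "
    start = page_text.find(marker)
    if start == -1:
        return []  # marker not present in this page
    start += len(marker)
    end = page_text.find("\n", start)
    if end == -1:
        end = len(page_text)
    # Drop surrounding whitespace and the trailing JavaScript semicolon.
    literal = page_text[start:end].strip().rstrip(";")
    # literal_eval raises ValueError or SyntaxError on anything but literals.
    return ast.literal_eval(literal)


# Made-up sample standing in for one line of the page source (an assumption).
sample = "var itemsGame = [['1', 'a.jpg', '/game-a', 'Game A'], ['2', 'b.jpg', '/game-b', 'Game B']];\n"
items = parse_items_game(sample)
names = [item[3] for item in items]
```

If the real array contains JavaScript-only syntax such as `true`, `null` or unquoted keys, this approach fails with an exception and the regex route is the more forgiving option.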