Tags: python, image, web-scraping

Python: the right URL to download pictures from Google Image Search


I'm trying to obtain images from Google Image Search for a specific query, but the page I download contains no pictures and redirects me to Google's original page. Here's my code:

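import subprocess
import urllib.parse

class ImageScraper:  # placeholder name; the enclosing class is not shown in the question
    AGENT_ID   = "Mozilla/5.0 (X11; Linux x86_64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"
    GOOGLE_URL = "https://www.google.com/images?source=hp&q={0}"
    _myGooglePage = ""

    def scrape(self, theQuery):
        url = self.GOOGLE_URL.format(urllib.parse.quote(theQuery))
        # -L follows redirects, -A sets the User-Agent; curl's stderr is merged into the captured output
        self._myGooglePage = subprocess.check_output(["curl", "-L", "-A", self.AGENT_ID, url], stderr=subprocess.STDOUT)
        print(url)
        print(self._myGooglePage)
        # check_output returns bytes, so write in binary mode; "with" closes the file
        with open('./../../googleimages.html', 'wb') as f:
            f.write(self._myGooglePage)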

What am I doing wrong?

Thanks


Solution

  • Here's a hint. Start with this URL:

    https://ajax.googleapis.com/ajax/services/search/images?v=1.0&q=JULIE%20NEWMAR

    Where JULIE and NEWMAR are your search terms.

    That returns the JSON data you need. Parse it with json.load or simplejson.load to get back a dict, then look up responseData and, inside it, the results list. Each item in that list has a url, which is what you will want to download.

    That said, I don't recommend automated scraping of Google in any way, since their (deprecated) API for this specifically says not to.
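
A minimal Python 3 sketch of the steps described above: build the query URL, parse the JSON, then walk responseData and results to collect each item's url. The helper names (build_search_url, extract_image_urls, download) and the sample payload are made up for illustration, and the payload shape is assumed from the description in this answer rather than taken from a live response. Since the API is described as deprecated, the endpoint may not respond, so the example parses a hard-coded sample instead of calling it.

```python
import json
import urllib.request
from urllib.parse import quote, urlencode

# Endpoint taken from the answer above; it is deprecated and may not respond.
API_URL = "https://ajax.googleapis.com/ajax/services/search/images"


def build_search_url(query):
    """Return the search URL for the query, encoding spaces as %20."""
    return API_URL + "?" + urlencode({"v": "1.0", "q": query}, quote_via=quote)


def extract_image_urls(payload):
    """Return the url of every item in responseData -> results."""
    data = json.loads(payload)
    # Assumption: responseData may be missing or null when a request fails.
    response_data = data.get("responseData") or {}
    results = response_data.get("results") or []
    return [item["url"] for item in results if "url" in item]


def download(url, path):
    """Save one image to disk (hypothetical helper, not called below)."""
    with urllib.request.urlopen(url) as response, open(path, "wb") as out:
        out.write(response.read())


# Hypothetical payload with the shape described in the answer.
SAMPLE = json.dumps({
    "responseData": {
        "results": [
            {"url": "http://example.com/a.jpg"},
            {"url": "http://example.com/b.png"},
        ]
    }
})

if __name__ == "__main__":
    print(build_search_url("JULIE NEWMAR"))
    print(extract_image_urls(SAMPLE))
```

To fetch real data you would pass the body of an HTTP response for build_search_url(...) to extract_image_urls, then call download on each returned URL.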