python parsing web-crawler yandex

Reverse search an image in Yandex Images using Python


I'm interested in automating reverse image search. Yandex in particular is great for busting catfishes, even better than Google Images. So, consider this Python code:

import traceback
import webbrowser

import requests

try:
    filePath = "C:\\path\\whateverThisIs.png"
    searchUrl = 'https://yandex.ru/images/'
    # Open the image in a context manager so the file handle is always closed.
    with open(filePath, 'rb') as image:
        multipart = {'encoded_image': (filePath, image), 'image_content': ''}
        response = requests.post(searchUrl, files=multipart, allow_redirects=False)
    # Enabling the next line raises KeyError with the Yandex URL:
    #fetchUrl = response.headers['Location']
    print(response)
    print(dir(response))
    print(response.content)
    input()
except Exception as e:
    print(e)
    traceback.print_exc()  # print the full traceback, not the bound method
    input()

With the commented-out fetchUrl line enabled, the script fails with a KeyError: 'location' is not found. I know the code works because if you replace searchUrl with http://www.google.hr/searchbyimage/upload, the script returns the correct URL. In short, the expected outcome is a URL for the image search; instead we get a KeyError where that URL was supposed to be stored. Evidently Yandex doesn't work in exactly the same way: maybe the URL is off (although I tried a ton of variations), or the reason may be completely different.
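The KeyError is what you would see whenever the server answers without a redirect, since only a redirect response carries a Location header. A minimal sketch of a safer lookup follows. The helper name `redirect_target` and the stand-in response objects are made up for illustration; a real `requests` response exposes the same `status_code` and `headers` attributes.

```python
from types import SimpleNamespace


def redirect_target(response):
    """Return the redirect URL from a response, or None if there is none."""
    # 3xx status codes signal a redirect; the target is in the Location header.
    if 300 <= response.status_code < 400:
        # .get() returns None instead of raising KeyError when the header is absent.
        return response.headers.get("Location")
    return None


# Stand-ins for real responses (hypothetical values, no network access needed).
redirecting = SimpleNamespace(
    status_code=302,
    headers={"Location": "https://example.com/search?id=1"},
)
plain = SimpleNamespace(status_code=200, headers={"Content-Type": "text/html"})

print(redirect_target(redirecting))
print(redirect_target(plain))
```

Printing `response.status_code` next to the result shows whether the endpoint redirected at all, which narrows down whether the URL or the upload format is the problem.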

Regardless of that, help in solving this problem is much appreciated!


Solution

  • You can get the image search URL with the code below. Tested on Ubuntu 18.04 with Python 3.7 and requests 2.23.0.

    import requests
    
    file_path = "C:\\path\\whateverThisIs.png"
    search_url = 'https://yandex.ru/images/search'
    # Ask for the search-by-image link block, returned as JSON.
    params = {'rpt': 'imageview', 'format': 'json', 'request': '{"blocks":[{"block":"b-page_type_search-by-image__link"}]}'}
    # Upload the image as multipart form data; the context manager closes the file.
    with open(file_path, 'rb') as image:
        files = {'upfile': ('blob', image, 'image/jpeg')}
        response = requests.post(search_url, params=params, files=files)
    # The JSON reply carries the query string of the results page.
    query_string = response.json()['blocks'][0]['params']['url']
    img_search_url = search_url + '?' + query_string
    print(img_search_url)
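The last step of the answer indexes straight into the JSON reply, so it raises an exception if Yandex returns something else (an error page, a captcha, a changed layout). The sketch below isolates that step and guards it. The helper name `build_search_url` and the sample reply are made up for illustration; only the `blocks[0].params.url` path is taken from the answer above.

```python
import json


def build_search_url(search_url, response_text):
    """Combine the base URL with the query string found in the JSON reply.

    Returns None when the reply does not have the expected shape.
    """
    try:
        payload = json.loads(response_text)
        query_string = payload["blocks"][0]["params"]["url"]
    except (ValueError, KeyError, IndexError, TypeError):
        # ValueError covers invalid JSON; the others cover a missing or
        # differently shaped "blocks" structure.
        return None
    return search_url + "?" + query_string


# Made-up reply with the same shape the answer expects (not real Yandex output).
sample_reply = json.dumps(
    {"blocks": [{"params": {"url": "rpt=imageview&example_id=123"}}]}
)

print(build_search_url("https://yandex.ru/images/search", sample_reply))
print(build_search_url("https://yandex.ru/images/search", "<html>not json</html>"))
```

With a real response you would pass `response.text` as the second argument, and a non-None result can be handed to `webbrowser.open()` to view the search results in a browser.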