
How to automatically pick pictures from Wikimedia to represent an item described by arbitrary words?


I want my nutrition site to automatically load pictures from Wikimedia (or any other site with freely licensed images) to represent food items.
I haven't done anything like this before and was wondering if anyone has any experience with this.

For example, the string "lentils, pink, cooked, with salt" would generate a picture of cooked pink lentils. If that doesn't exist, just some pink lentils. And if that doesn't exist either, just any lentils.

Do I just try every combination of "pink", "lentils", and "cooked", or any synonyms of them? If so, how? Do I use an API? Is there a better site than Wikimedia? Is there already some sort of algorithm for this? Or any tips at all?

Of course, I am going to let users know the picture is a guess and even let them confirm whether it is the right one. It doesn't have to be perfect and can grow over time. I am doing this with JS and storing results with PHP/MySQL, but I think the question is not language-dependent.


Solution

  • Use the Wikimedia Commons API to search for and retrieve images based on the input string. Follow these steps:

    • Parse input string: Extract the relevant keywords from the input string. In your example, the keywords would be "lentils", "pink", "cooked", and "with salt".

    • Construct API query: Use the extracted keywords to construct a query for the Wikimedia Commons API. Here's an example using Python:

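    import requests
    from urllib.parse import quote_plus
    
    def build_api_query(keywords):
        # URL-encode the query so spaces, commas, and "&" cannot break the URL.
        query = quote_plus(" ".join(keywords))
        # srnamespace=6 restricts the search to the File: namespace.
        return f'https://commons.wikimedia.org/w/api.php?action=query&list=search&format=json&srsearch={query}&srnamespace=6&srlimit=10&origin=*'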
    
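    • Fetch and display image: Use the API query to fetch the results and display the image on your website. Example with Python's requests:
    
    from urllib.parse import quote
    
    def fetch_image(keywords):
        response = requests.get(build_api_query(keywords), timeout=10)
        response.raise_for_status()
        results = response.json().get('query', {}).get('search', [])
        if not results:
            return None  # No image found; the caller can retry with fewer keywords.
        # Titles already carry the "File:" prefix, so strip it instead of adding another.
        filename = results[0]['title'].split(':', 1)[1].replace(' ', '_')
        # Special:FilePath redirects to the image file itself, so the URL can be used in an <img> tag.
        return f'https://commons.wikimedia.org/wiki/Special:FilePath/{quote(filename)}'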
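The question also asks for a fallback: cooked pink lentils first, then pink lentils, then any lentils. A minimal sketch of the parsing and fallback steps follows. It assumes the first comma-separated term names the food itself and the rest are modifiers. The names `STOP_WORDS`, `extract_keywords`, and `fallback_queries` are made up for illustration and are not part of any library.

```python
from itertools import combinations

# Hypothetical stop-word list; extend it to suit your own data.
STOP_WORDS = {"with", "without", "and", "or", "in", "of"}

def extract_keywords(description):
    """Split 'lentils, pink, cooked, with salt' into a list of keywords."""
    keywords = []
    for part in description.lower().split(","):
        words = [w for w in part.split() if w not in STOP_WORDS]
        if words:
            keywords.append(" ".join(words))
    return keywords

def fallback_queries(keywords):
    """Yield keyword lists from most to least specific.

    Assumption: the first keyword is the main item, so it is always kept,
    and modifiers are dropped one at a time.
    """
    if not keywords:
        return
    main, modifiers = keywords[0], keywords[1:]
    for size in range(len(modifiers), -1, -1):
        for combo in combinations(modifiers, size):
            yield [main, *combo]

if __name__ == "__main__":
    for query in fallback_queries(extract_keywords("lentils, pink, cooked, with salt")):
        print(" ".join(query))
```

Each keyword list can be passed to the search step in turn, stopping at the first one that returns a result. The number of combinations doubles with every extra modifier, so for long descriptions it is probably worth capping the number of attempts and caching results in your database.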