How to check the number of articles with a keyword in The Guardian newspaper using API key?


I am quite new to Python and I am trying to run a piece of code that counts the articles mentioning a keyword in The Guardian.

It is something that I have already done with the New York Times, but when I try to apply the same code to The Guardian, it says that line 8 (the one below) fails:

hits = json_res["response"]["meta"]["hits"]

This is the entire piece of code I have problems with (I have previously defined the guardian_api_key):

#Checking number of articles on a topic in The Guardian
import requests
def number_of_articles(api_key,keyword,begin_date="19800101",end_date="20240101"):
    url_query = f"http://content.guardianapis.com/#/search?q={keyword}&begin_date={begin_date}&end_date={end_date}&api-key={api_key}"
    print (url_query)
    res = requests.get(url_query)
    json_res = res.json()
    hits = json_res["response"]["meta"]["hits"]
    return hits

print(number_of_articles(api_key=guardian_api_key,keyword="vote"))

I would love some help so that I can understand what is going on! Please tell me if my question is unclear or what I could try.


Solution

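  • You have a typo in the Guardian URL. The "#/" turns the rest of the URL into a fragment, which is never sent to the server, so the request does not reach the search endpoint. Change it from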

    f"http://content.guardianapis.com/#/search?

    to

    f"https://content.guardianapis.com/search?

    In contrast to the New York Times, The Guardian stores the number of articles found by the search in

    json_res["response"]["total"]
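The two fixes above can be sketched as a pair of small helpers. This is a minimal sketch, not the asker's original code: the names guardian_search_url and total_from_response are made up for illustration. It also assumes the Guardian's date filters are called from-date and to-date and take YYYY-MM-DD values, whereas begin_date and end_date (YYYYMMDD) are the New York Times names.

```python
from urllib.parse import urlencode

GUARDIAN_SEARCH_URL = "https://content.guardianapis.com/search"


def guardian_search_url(api_key, keyword, from_date="1980-01-01", to_date="2024-01-01"):
    """Build a Guardian search URL (hypothetical helper, not part of any library)."""
    params = {
        "q": keyword,            # search term
        "from-date": from_date,  # assumed Guardian name for the start date
        "to-date": to_date,      # assumed Guardian name for the end date
        "api-key": api_key,
    }
    # urlencode escapes spaces and special characters in the keyword.
    return f"{GUARDIAN_SEARCH_URL}?{urlencode(params)}"


def total_from_response(json_res):
    """Read the article count from a decoded Guardian search response."""
    return json_res["response"]["total"]
```

With requests, the asker's function body would then reduce to calling requests.get(guardian_search_url(guardian_api_key, "vote")).json() and passing the result to total_from_response.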

    If you are open to suggestions, I would structure the code more modularly so that you can add further APIs later.

    import requests
    class BaseParser:
        def __init__(self, key):
            self.key = key
            self.base_url= ""
    
        def query(self, params: dict) -> dict:
            # Let requests append and URL-encode the search parameters.
            return requests.get(self.base_url, params=params).json()
    
    class GuardianParser(BaseParser):
    
        def __init__(self, key):
            super().__init__(key)
            self.base_url= f"https://content.guardianapis.com/search?api-key={self.key}"
    
        def number_of_articles(self, params: dict) -> int:
            res = self.query(params)
            return res["response"]["total"]
    
    class TimesParser(BaseParser):
    
        def __init__(self, key):
            super().__init__(key)
            self.base_url= f"the api url from the times"
    
        def number_of_articles(self, params: dict):
            res = self.query(params)
            return res["response"]["meta"]["hits"]
    
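    # Both APIs expect the search term in "q". The date parameters differ:
    # the New York Times uses begin_date/end_date (YYYYMMDD), while
    # The Guardian uses from-date/to-date (YYYY-MM-DD).
    search_dict = {
        "q":"vote",
        "begin_date":19800101,
        "end_date":20240101
    }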
    
    gparser = GuardianParser(gkey)
    gart = gparser.number_of_articles(search_dict)
    tparser = TimesParser(tkey)
    tart = tparser.number_of_articles(search_dict)
    print(f"Guardian number of articles: {gart}")
    print(f"New York Times number of articles: {tart}")
    

    In addition, I would add some error handling (e.g. for an expired API key, a bad request, or an unreachable API). That goes beyond the scope of the question, though.
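As a rough illustration of that error handling, here is a minimal sketch under stated assumptions. ArticleCountError and fetch_total are hypothetical names, and the sketch uses the standard library's urllib instead of requests so that it runs without extra dependencies. With requests, the rough equivalents would be res.raise_for_status() and catching requests.exceptions.RequestException.

```python
import json
import urllib.error
import urllib.request


class ArticleCountError(Exception):
    """Raised when the article count cannot be retrieved (hypothetical name)."""


def fetch_total(url, opener=urllib.request.urlopen, timeout=10):
    """Fetch a Guardian search URL and return response.total, or raise ArticleCountError."""
    try:
        with opener(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except urllib.error.HTTPError as exc:
        # Covers rejected requests, e.g. an invalid key or a malformed query.
        raise ArticleCountError(f"API rejected the request (HTTP {exc.code})") from exc
    except urllib.error.URLError as exc:
        # Covers DNS failures, refused connections and similar network problems.
        raise ArticleCountError(f"API unreachable: {exc.reason}") from exc
    except json.JSONDecodeError as exc:
        raise ArticleCountError("response was not valid JSON") from exc

    try:
        return payload["response"]["total"]
    except (KeyError, TypeError) as exc:
        # This is the kind of failure the original question ran into.
        raise ArticleCountError("unexpected response layout") from exc
```

The opener argument exists only so that the function can be exercised without network access. In normal use it can be left at its default.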