I am quite new to Python, and I am trying to run a piece of code that checks how many articles in The Guardian mention a given keyword.
I have already done this with the New York Times, but when I apply the same code to The Guardian, line 8 (the one below) fails:
hits = json_res["response"]["meta"]["hits"]
This is the entire piece of code I have problems with (I have previously defined the guardian_api_key):
# Checking number of articles on a topic in The Guardian
import requests

def number_of_articles(api_key, keyword, begin_date="19800101", end_date="20240101"):
    url_query = f"http://content.guardianapis.com/#/search?q={keyword}&begin_date={begin_date}&end_date={end_date}&api-key={api_key}"
    print(url_query)
    res = requests.get(url_query)
    json_res = res.json()
    hits = json_res["response"]["meta"]["hits"]
    return hits

print(number_of_articles(api_key=guardian_api_key, keyword="vote"))
I would love some help understanding what is going on. Please tell me if my question is unclear, or what I could try.
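The Guardian URL is wrong: everything after the # is treated as a fragment and is not sent to the server, so the request never reaches the search endpoint. Change it from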
f"http://content.guardianapis.com/#/search?
to
f"https://content.guardianapis.com/search?
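To see why the `#` matters, here is a small sketch that uses only the standard library's `urllib.parse`; the query value `vote` is just an example. Everything after `#` is parsed as a fragment, which HTTP clients keep to themselves rather than sending to the server.

```python
from urllib.parse import urlsplit

broken = urlsplit("http://content.guardianapis.com/#/search?q=vote")
fixed = urlsplit("https://content.guardianapis.com/search?q=vote")

# With the '#', the path is just "/" and the query string is empty;
# "/search?q=vote" ends up in the fragment, which stays on the client.
print(broken.path, repr(broken.query), broken.fragment)

# Without it, the path and query are what the search endpoint expects.
print(fixed.path, fixed.query)
```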
In contrast to the New York Times API, the Guardian API stores the number of articles found by the search in
json_res["response"]["total"]
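As a quick illustration of the two response shapes, the sketch below pulls the count out of hand-written sample dictionaries. The sample values are made up and the helper name `extract_hits` is hypothetical; only the key paths come from the code in this thread.

```python
def extract_hits(json_res: dict, source: str) -> int:
    """Return the article count from a parsed API response.

    `source` is a hypothetical label: "guardian" or "nyt".
    """
    response = json_res["response"]
    if source == "guardian":
        return response["total"]
    if source == "nyt":
        return response["meta"]["hits"]
    raise ValueError(f"unknown source: {source!r}")


# Made-up sample payloads containing only the keys used above.
guardian_sample = {"response": {"total": 123}}
nyt_sample = {"response": {"meta": {"hits": 456}}}

print(extract_hits(guardian_sample, "guardian"))  # 123
print(extract_hits(nyt_sample, "nyt"))  # 456
```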
If you are open to suggestions, I would structure your code more modularly, which makes it easier to add further APIs to your system later.
import requests

class BaseParser:
    """Shared logic: store the API key and send a query."""

    def __init__(self, key):
        self.key = key
        self.base_url = ""

    def query(self, params: dict) -> dict:
        # Subclasses set base_url so that it already contains "?api-key=...",
        # which is why every further parameter is appended with "&".
        query = self.base_url
        for key, val in params.items():
            query += f"&{key}={val}"
        return requests.get(query).json()

class GuardianParser(BaseParser):
    def __init__(self, key):
        super().__init__(key)
        self.base_url = f"https://content.guardianapis.com/search?api-key={self.key}"

    def number_of_articles(self, params: dict) -> int:
        res = self.query(params)
        return res["response"]["total"]

class TimesParser(BaseParser):
    def __init__(self, key):
        super().__init__(key)
        # Placeholder: replace with the New York Times API URL (including the key).
        self.base_url = "the api url from the times"

    def number_of_articles(self, params: dict) -> int:
        res = self.query(params)
        return res["response"]["meta"]["hits"]

# The parameter names must match what each API expects; adjust them as needed.
search_dict = {
    "keyword": "vote",
    "begin_date": 19800101,
    "end_date": 20240101,
}

# gkey and tkey are assumed to hold your Guardian and New York Times API keys.
gparser = GuardianParser(gkey)
gart = gparser.number_of_articles(search_dict)
tparser = TimesParser(tkey)
tart = tparser.number_of_articles(search_dict)
print(f"Guardian number of articles: {gart}")
print(f"New York Times number of articles: {tart}")
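One caveat with a shared search_dict: the two APIs do not necessarily use the same parameter names or date formats. The mapping below is an assumption on my part (as far as I know, the Guardian uses `q`, `from-date` and `to-date` with `YYYY-MM-DD` dates, while the New York Times uses `q`, `begin_date` and `end_date` with `YYYYMMDD` dates), so please check it against the current documentation of both APIs. The names `PARAM_NAMES` and `build_query` are hypothetical. The sketch also uses `urllib.parse.urlencode`, so keywords containing spaces or `&` are escaped properly.

```python
from urllib.parse import urlencode

# Assumed per-API parameter names; verify against each API's documentation.
PARAM_NAMES = {
    "guardian": {"keyword": "q", "begin_date": "from-date", "end_date": "to-date"},
    "nyt": {"keyword": "q", "begin_date": "begin_date", "end_date": "end_date"},
}


def build_query(source: str, search: dict) -> str:
    """Translate generic search keys into one API's names and URL-encode them."""
    names = PARAM_NAMES[source]
    return urlencode({names[key]: value for key, value in search.items()})


search = {
    "keyword": "general election",
    "begin_date": "1980-01-01",
    "end_date": "2024-01-01",
}
print(build_query("guardian", search))
```

The resulting string can be appended to the base URL, or the translated dictionary can be passed to `requests.get` through its `params` argument instead.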
In addition, I would add some error handling (e.g. for an expired API key, a bad request, or an unreachable API). That goes beyond the scope of the question, though.
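For completeness, here is a minimal sketch of what such error handling could look like with `requests`. The helper name `fetch_json`, the 10-second timeout, and the choice to re-raise everything as `RuntimeError` are my own assumptions, not something either API prescribes.

```python
import requests


def fetch_json(url: str, params: dict, timeout: float = 10.0) -> dict:
    """GET `url` with `params` and return the parsed JSON body.

    Raises RuntimeError if the request fails or the body is not JSON.
    """
    try:
        res = requests.get(url, params=params, timeout=timeout)
        # Turn 4xx/5xx answers (e.g. a rejected API key) into an exception.
        res.raise_for_status()
        return res.json()
    except (requests.exceptions.RequestException, ValueError) as exc:
        # RequestException covers connection errors, timeouts and HTTP error
        # statuses; ValueError covers a body that cannot be decoded as JSON.
        raise RuntimeError(f"request to {url} failed: {exc}") from exc
```

With this helper, the Guardian count would be read as `fetch_json("https://content.guardianapis.com/search", {"q": "vote", "api-key": guardian_api_key})["response"]["total"]`, wrapped in a `try`/`except RuntimeError` wherever you want to report the problem.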