Hi so I loaded a JSON file into a list using the following code:
import json
postal_mtl = ['H9W','H4W','H9P','H3B','H3A','H2Z','H3H','H3G','H3X','H9J','H1B','H1G','H1H','H4X','H2Y','H9R','H3Z','H3Y']
data = []
with open('business.json',encoding="utf8") as f:
    for line in f:
        data.append(json.loads(line))
Now I am trying to find the number of restaurants in Montreal in this dataset (coming from Yelp). I tried the following code:
compteur3 = 0
for i in range(len(data)):
    if data[i]['postal_code'][0:3] in postal_mtl and 'Restaurants' in data[i]['categories']:
        compteur3 += 1
print(compteur3)
But I am getting an error saying "argument of type 'NoneType' is not iterable". I guess Python considers data[i]['categories'] to be a NoneType? Why is that? If I enter the following, I can see that it's clearly a string:
data[5]['categories']
'Shipping Centers, Couriers & Delivery Services, Local Services, Printing Services'
Now I just want to iterate over all the elements in my data list and find each line where we have the word 'Restaurants' (I got the Montreal stuff fixed)... Any idea? Thanks!
Based on the code provided, the error is most likely coming from the if condition, specifically from the expression 'Restaurants' in data[i]['categories']. Under the hood, Python tries to iterate through data[i]['categories'] to see whether 'Restaurants' is in it. If data[i]['categories'] is None, that would cause this error. Seeing a string at data[5] only tells you about that one record; other records in the list may still have None there.
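A minimal reproduction, using a made-up record rather than the actual Yelp data, would look something like this. The record and its values are assumptions for illustration only:

```python
# Hypothetical record standing in for a business whose categories are null in the JSON.
record = {"postal_code": "H3B 1A1", "categories": None}

try:
    # The membership test fails because None does not support "in".
    "Restaurants" in record["categories"]
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

The exact wording of the message can differ a little between Python versions, but it should mention 'NoneType' as in the error from the question.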
This may be caused by the JSON not being formatted the way you expected. Perhaps, when no categories were listed for a business, the 'categories' field contains null instead of an empty string or list, and json.loads turns a JSON null into None. To check for this in your code, you can try the following:
compteur3 = 0
for i in range(len(data)):
    is_inmontreal = data[i]['postal_code'][0:3] in postal_mtl
    is_restaurant = data[i]['categories'] and 'Restaurants' in data[i]['categories']
    if is_inmontreal and is_restaurant:
        compteur3 += 1
print(compteur3)
Above, I simply split the condition into two parts. Functionally, this is the same as having the conditions on one line; it just makes it slightly clearer. However, I also added a check in is_restaurant to see whether data[i]['categories'] has a positive truth value. In effect, this checks that the value is not None and is not empty (an empty string or an empty list). If you really want to be explicit, you can also do
is_restaurant = data[i]['categories'] is not None and 'Restaurants' in data[i]['categories']
Depending on how dirty the data is, you may need to go a little further than this and use exception handling. However, the above is just speculation as I do not know what the data looks like.
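As a rough sketch of what going a little further could look like: the version below assumes each record is a dict and that 'categories' is either a comma-separated string (like the sample shown in the question) or null. The helper names is_restaurant and count_montreal_restaurants are made up for illustration, and the sample records are invented, not real Yelp data. It uses dict.get so that missing or null fields are tolerated, and it splits the string so that only an exact category name counts as a match.

```python
def is_restaurant(record):
    """Return True if 'Restaurants' is one of the record's categories."""
    categories = record.get("categories")  # None if the key is missing or null
    if not categories:
        return False
    # Assumes a comma-separated string, e.g. 'Restaurants, Pizza, Italian'.
    return "Restaurants" in [c.strip() for c in categories.split(",")]


def count_montreal_restaurants(records, prefixes):
    """Count restaurant records whose postal code starts with one of the prefixes."""
    count = 0
    for record in records:
        # Fall back to an empty string when the postal code is missing or null.
        postal_code = record.get("postal_code") or ""
        if postal_code[:3] in prefixes and is_restaurant(record):
            count += 1
    return count


# Small made-up sample covering null categories and a null postal code.
sample = [
    {"postal_code": "H3B 1A1", "categories": "Restaurants, Pizza"},
    {"postal_code": "H3B 2B2", "categories": None},
    {"postal_code": "M5V 3C3", "categories": "Restaurants"},
    {"postal_code": None, "categories": "Restaurants"},
    {"postal_code": "H2Z 4D4", "categories": "Shipping Centers, Local Services"},
]
sample_count = count_montreal_restaurants(sample, {"H3B", "H2Z"})
print(sample_count)
```

With the data from the question, the call would be count_montreal_restaurants(data, set(postal_mtl)). Note that this is slightly stricter than the plain 'Restaurants' in ... check on the string, which would also match any longer category name that happens to contain that word. If 'categories' turns out to be a list in some records, the split would need to be adapted.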