Since I've just started with Python, I don't think I've grasped the concept of parsing JSON responses very well yet, and I keep running into the same issue when I try to print only certain parts of a JSON response. In the code below, I'm using the Foursquare checkins API endpoint to return my check-in history (auth process omitted for brevity):
from rauth import OAuth2Service
import json
import pprint
fs_checkins = session.get(endpoint, params = query_params)
fs_checkin_data = json.loads(fs_checkins.content)
pprint.pprint(fs_checkin_data)
This results in a json response that looks like:
{u'response': {u'checkins': {u'count': 74,
u'items': [{u'photos': {u'count': 0,
u'items': []},
u'posts': {u'count': 0,
u'textCount': 0},
u'source': {u'name': u'foursquare for iPhone',
u'url': u'https://foursquare.com/download/#/iphone'},
u'timeZoneOffset': -240,
u'type': u'checkin',
u'venue': {u'beenHere': {u'count': 1,
u'marked': False},
u'canonicalUrl': u'https://foursquare.com/v/nitehawk-cinema/4da491f6593f8eec9a257e35',
u'categories': [{u'icon': {u'prefix': u'https://foursquare.com/img/categories_v2/arts_entertainment/movietheater_',
u'suffix': u'.png'},
u'id': u'4bf58dd8d48988d17f941735',
u'name': u'Movie Theater',
u'pluralName': u'Movie Theaters',
u'primary': True,
u'shortName': u'Movie Theater'}],
u'contact': {u'formattedPhone': u'(718) 384-3980',
u'phone': u'7183843980'},
u'id': u'4da491f6593f8eec9a257e35',
u'like': False,
u'likes': {u'count': 114,
u'groups': [{u'count': 114,
u'items': [],
u'type': u'others'}],
u'summary': u'114 likes'},
u'location': {u'address': u'136 Metropolitan Ave.',
u'cc': u'US',
u'city': u'Brooklyn',
u'country': u'United States',
u'crossStreet': u'btwn Berry St. & Wythe Ave.',
u'lat': 40.716219932353624,
u'lng': -73.96228637176877,
u'postalCode': u'11211',
u'state': u'NY'},
u'name': u'Nitehawk Cinema',
u'stats': {u'checkinsCount': 11566,
u'tipCount': 99,
u'usersCount': 6003},
u'url': u'http://www.nitehawkcinema.com',
u'venuePage': {u'id': u'49722288'},
u'verified': True}}]}}}
I only want to parse out the 'canonicalUrl' and 'name' nested under 'venue', and understand the structure to be like so:
response
    checkins
        items
            venue
                canonicalUrl
                name
I've tried looping through fs_checkin_data['response']['checkins'] with a for loop to append the 'items' blob to an empty list:
items = []
for item in fs_checkin_data['response']['checkins']:
    info = {}
    info['items'] = item['items']
    items.append(info)
My thinking was that I could then loop through that list to append the 'venue' blob to another list, and finally print out only the 'canonicalUrl' and 'name' (apologies for the ugly, hacked-together logic; I was improvising since I don't know another way to achieve the same result).
However, the above code resulted in this error:
info['items'] = item['items']
TypeError: string indices must be integers
Which I do not understand since when I do
for item in fs_checkin_data['response']['checkins']:
    pprint.pprint(item)
there is no issue going through that part of the json file.
I know there has to be a better way to do this, but I cannot seem to find a simple, working solution, so any help would be greatly appreciated. Thank you.
You'll need to loop over the items list:
for item in fs_checkin_data['response']['checkins']['items']:
    venue = item['venue']
    print venue['canonicalUrl'], venue['name']
checkins is itself a dictionary with only two keys, items and count. Looping over the checkins dictionary iterates over its keys, so in your code item is set to the string 'count', then 'items' (or vice versa). Indexing one of those strings with item['items'] is what raises the TypeError. items, on the other hand, is a list of dictionaries, so looping over that list lets you access each individual check-in.
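To make the difference concrete, here is a minimal sketch that can be run on its own. The sample dictionary is a hypothetical, heavily trimmed stand-in for the parsed response (it is not real API output), and the variable names keys_seen, error_name and venues are made up for illustration. It is written so that it should behave the same on Python 2 and Python 3.

```python
# Hypothetical, trimmed-down stand-in for the parsed response (assumption:
# only the keys needed for this illustration are included).
sample = {
    'response': {
        'checkins': {
            'count': 1,
            'items': [
                {'venue': {
                    'canonicalUrl': 'https://foursquare.com/v/nitehawk-cinema/4da491f6593f8eec9a257e35',
                    'name': 'Nitehawk Cinema',
                }},
            ],
        },
    },
}

checkins = sample['response']['checkins']

# Iterating over a dict yields its keys, which are plain strings here.
keys_seen = sorted(key for key in checkins)

# Indexing one of those key strings with another string raises TypeError,
# which is the error from the question.
error_name = None
for key in checkins:
    try:
        key['items']
    except TypeError as exc:
        error_name = type(exc).__name__

# Iterating over the 'items' list yields one dictionary per check-in.
venues = [(item['venue']['canonicalUrl'], item['venue']['name'])
          for item in checkins['items']]

for url, name in venues:
    print(url + ' ' + name)
```

The list comprehension is only one way to collect the pairs; a plain for loop that prints inside the body, as in the answer above, works just as well.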