
Parsing nested levels of a JSON response using Python (Foursquare API)?


I've just started with Python and haven't fully grasped how to parse JSON responses yet. I keep running into the same issue whenever I try to print only certain parts of a JSON response. In the code below, I'm using the Foursquare checkins API endpoint to return my check-in history (the auth process is omitted for brevity):

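    from rauth import OAuth2Service
    import json
    import pprint

    # session, endpoint and query_params come from the omitted OAuth setup
    fs_checkins = session.get(endpoint, params=query_params)
    # Parse the raw response body into Python dicts and lists
    fs_checkin_data = json.loads(fs_checkins.content)

    pprint.pprint(fs_checkin_data)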

This prints the parsed JSON response, which looks like:

    {u'response': {u'checkins': {u'count': 74,
                                 u'items': [{u'photos': {u'count': 0,
                                                         u'items': []},
                                             u'posts': {u'count': 0,
                                                        u'textCount': 0},
                                             u'source': {u'name': u'foursquare for iPhone',
                                                         u'url': u'https://foursquare.com/download/#/iphone'},
                                             u'timeZoneOffset': -240,
                                             u'type': u'checkin',
                                             u'venue': {u'beenHere': {u'count': 1,
                                                                      u'marked': False},
                                                        u'canonicalUrl': u'https://foursquare.com/v/nitehawk-cinema/4da491f6593f8eec9a257e35',
                                                        u'categories': [{u'icon': {u'prefix': u'https://foursquare.com/img/categories_v2/arts_entertainment/movietheater_',
                                                                                   u'suffix': u'.png'},
                                                                         u'id': u'4bf58dd8d48988d17f941735',
                                                                         u'name': u'Movie Theater',
                                                                         u'pluralName': u'Movie Theaters',
                                                                         u'primary': True,
                                                                         u'shortName': u'Movie Theater'}],
                                                        u'contact': {u'formattedPhone': u'(718) 384-3980',
                                                                     u'phone': u'7183843980'},
                                                        u'id': u'4da491f6593f8eec9a257e35',
                                                        u'like': False,
                                                        u'likes': {u'count': 114,
                                                                   u'groups': [{u'count': 114,
                                                                                u'items': [],
                                                                                u'type': u'others'}],
                                                                   u'summary': u'114 likes'},
                                                        u'location': {u'address': u'136 Metropolitan Ave.',
                                                                      u'cc': u'US',
                                                                      u'city': u'Brooklyn',
                                                                      u'country': u'United States',
                                                                      u'crossStreet': u'btwn Berry St. & Wythe Ave.',
                                                                      u'lat': 40.716219932353624,
                                                                      u'lng': -73.96228637176877,
                                                                      u'postalCode': u'11211',
                                                                      u'state': u'NY'},
                                                        u'name': u'Nitehawk Cinema',
                                                        u'stats': {u'checkinsCount': 11566,
                                                                   u'tipCount': 99,
                                                                   u'usersCount': 6003},
                                                        u'url': u'http://www.nitehawkcinema.com',
                                                        u'venuePage': {u'id': u'49722288'},
                                                        u'verified': True}}]}}}

I only want to extract the 'canonicalUrl' and 'name' values nested under 'venue'. I understand the structure to be:

    response
       checkins
          items
             venue
                canonicalUrl                 
                name
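
One way to check an outline like this is to look at the container type at each level, since dictionaries are indexed by key and lists by position. The following is a minimal sketch, assuming a trimmed-down stand-in for the response above; the `sample` dict and the `describe` helper are hypothetical names used only for illustration and are not part of the Foursquare API.

```python
# Trimmed stand-in for the parsed response; only the keys of interest are kept.
sample = {
    "response": {
        "checkins": {
            "count": 74,
            "items": [
                {
                    "type": "checkin",
                    "venue": {
                        "canonicalUrl": "https://foursquare.com/v/nitehawk-cinema/4da491f6593f8eec9a257e35",
                        "name": "Nitehawk Cinema",
                    },
                }
            ],
        }
    }
}


def describe(value):
    """Return the container type, plus the keys of a dict or the length of a list."""
    if isinstance(value, dict):
        return "dict with keys %s" % sorted(value)
    if isinstance(value, list):
        return "list of %d element(s)" % len(value)
    return type(value).__name__


checkins = sample["response"]["checkins"]
items = checkins["items"]
levels = [
    ("checkins", describe(checkins)),
    ("items", describe(items)),
    ("items[0]", describe(items[0])),
    ("venue", describe(items[0]["venue"])),
]
for label, summary in levels:
    print("%s: %s" % (label, summary))
```

The point to notice is that 'items' is a list, so there is one extra level (a position in the list) between 'items' and 'venue' that the outline does not show.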

I've tried using a for loop over fs_checkin_data['response']['checkins'] to append the 'items' blob to an empty list:

    items = []

    for item in fs_checkin_data['response']['checkins']:
        info = {}
        info['items'] = item['items']
        items.append(info)

My thinking was that I could then loop through that list to append the 'venue' blob to another list, and finally print out only the 'canonicalUrl' and 'name'. (Apologies for the ugly, hacked-together logic; I was improvising because I don't know another way to achieve the same result.)

However, the above code resulted in this error:

          info['items'] = item['items']
    TypeError: string indices must be integers

I don't understand this, because when I run

    for item in fs_checkin_data['response']['checkins']:
        pprint.pprint(item)

the loop goes through that part of the JSON data without any error.

I know there has to be a better way to do this, but I can't seem to find a simple, working solution, so any help would be greatly appreciated. Thank you.


Solution

You'll need to loop over items:

    for item in fs_checkin_data['response']['checkins']['items']:
        venue = item['venue']
        print venue['canonicalUrl'], venue['name']
    

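checkins is itself a dictionary with only two keys, items and count. Looping over a dictionary iterates over its keys, so in your code item is set to the string 'count', then 'items' (or vice versa). That is why pprint works (it just prints each key) while item['items'] fails: item is a string, and indexing a string with another string raises TypeError: string indices must be integers. items, on the other hand, is a list of dictionaries, so looping over that list lets you access each individual check-in.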
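
The difference can be reproduced with a small stand-in for the parsed response. This is only a sketch under stated assumptions: `sample` is a hypothetical, trimmed-down copy of the data above, the second check-in without a venue is invented purely to exercise the missing-key case, and `venue_summaries` is an illustrative helper name. It uses `.get()` on the assumption that some check-ins might lack a 'venue' key, which the output in the question does not confirm. The `print` calls are written so they behave the same on Python 2 and Python 3.

```python
# Hypothetical, trimmed-down copy of the parsed response.
sample = {
    "response": {
        "checkins": {
            "count": 2,
            "items": [
                {
                    "type": "checkin",
                    "venue": {
                        "canonicalUrl": "https://foursquare.com/v/nitehawk-cinema/4da491f6593f8eec9a257e35",
                        "name": "Nitehawk Cinema",
                    },
                },
                # Invented entry with no venue, to show the defensive path.
                {"type": "checkin"},
            ],
        }
    }
}

# Iterating over a dict yields its keys, which here are strings.
keys = [item for item in sample["response"]["checkins"]]

# Indexing one of those strings with 'items' reproduces the TypeError.
error_message = None
try:
    keys[0]["items"]
except TypeError as exc:
    error_message = str(exc)


def venue_summaries(data):
    """Collect (canonicalUrl, name) pairs from each check-in that has a venue."""
    pairs = []
    for item in data["response"]["checkins"]["items"]:
        venue = item.get("venue")
        if venue is None:
            continue  # skip check-ins without venue data
        pairs.append((venue.get("canonicalUrl"), venue.get("name")))
    return pairs


for url, name in venue_summaries(sample):
    print("%s %s" % (url, name))
```

Returning a list of pairs rather than printing inside the loop keeps the extraction reusable, for example if the URLs and names need to be written to a file later.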