Search code examples
jsonpandasfoursquarefolium

Convert json file to dataframe in python and FourSqaure API


I'm completing this IBM Data Science certification on Coursera and one of the assignments require us to replicate this link- https://rawnote.dinhanhthi.com/files/ibm/neighborhoods_in_toronto.

I'm fairly new to this so I was going through the link to understand it and I couldn't understand some parts of the code.

So the objective of this assignment is to:

  1. Extract a table from wikipedia and store it in a dataframe
  2. Create a map of toronto city and explore the boroughs that contain "Toronto"
  3. Explore any random neighborhood in Toronto using the FourSqaure API ("The Beaches" have been chosen here)
  4. Get the top 100 venues that are in "The Beaches" within a radius of 500 meters.

They've done the 4th point using the FourSqaure API as shown below:

LIMIT = 100 # limit of number of venues returned by Foursquare API
radius = 500 # define radius
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&limit={}'.format(
    CLIENT_ID, 
    CLIENT_SECRET, 
    VERSION, 
    neighborhood_latitude, 
    neighborhood_longitude, 
    radius, 
    LIMIT)

# get the result to a json file
results = requests.get(url).json()

The "results" variable looks like this:

{'meta': {'code': 200, 'requestId': '5eda4fb9aba297001b2f6207'},
 'response': {'headerLocation': 'The Beaches',
  'headerFullLocation': 'The Beaches, Toronto',
  'headerLocationGranularity': 'neighborhood',
  'totalResults': 4,
  'suggestedBounds': {'ne': {'lat': 43.680857404499996,
    'lng': -79.28682091449052},
   'sw': {'lat': 43.67185739549999, 'lng': -79.29924148550948}},
  'groups': [{'type': 'Recommended Places',
    'name': 'recommended',
    'items': [{'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4bd461bc77b29c74a07d9282',
       'name': 'Glen Manor Ravine',
       'location': {'address': 'Glen Manor',
        'crossStreet': 'Queen St.',
        'lat': 43.67682094413784,
        'lng': -79.29394208780985,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.67682094413784,
          'lng': -79.29394208780985}],
        'distance': 89,
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['Glen Manor (Queen St.)',
         'Toronto ON',
         'Canada']},
       'categories': [{'id': '4bf58dd8d48988d159941735',
         'name': 'Trail',
         'pluralName': 'Trails',
         'shortName': 'Trail',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/hikingtrail_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4bd461bc77b29c74a07d9282-0'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4ad4c062f964a52011f820e3',
       'name': 'The Big Carrot Natural Food Market',
       'location': {'address': '125 Southwood Dr',
        'lat': 43.678879,
        'lng': -79.297734,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.678879,
          'lng': -79.297734}],
        'distance': 471,
        'postalCode': 'M4E 0B8',
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['125 Southwood Dr',
         'Toronto ON M4E 0B8',
         'Canada']},
       'categories': [{'id': '50aa9e744b90af0d42d5de0e',
         'name': 'Health Food Store',
         'pluralName': 'Health Food Stores',
         'shortName': 'Health Food Store',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/shops/food_grocery_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []},
       'venuePage': {'id': '75150878'}},
      'referralId': 'e-0-4ad4c062f964a52011f820e3-1'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4b8daea1f964a520480833e3',
       'name': 'Grover Pub and Grub',
       'location': {'address': '676 Kingston Rd.',
        'crossStreet': 'at Main St.',
        'lat': 43.679181434941015,
        'lng': -79.29721535878515,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.679181434941015,
          'lng': -79.29721535878515}],
        'distance': 460,
        'postalCode': 'M4E 1R4',
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['676 Kingston Rd. (at Main St.)',
         'Toronto ON M4E 1R4',
         'Canada']},
       'categories': [{'id': '4bf58dd8d48988d11b941735',
         'name': 'Pub',
         'pluralName': 'Pubs',
         'shortName': 'Pub',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/nightlife/pub_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4b8daea1f964a520480833e3-2'},
     {'reasons': {'count': 0,
       'items': [{'summary': 'This spot is popular',
         'type': 'general',
         'reasonName': 'globalInteractionReason'}]},
      'venue': {'id': '4df91c4bae60f95f82229ad5',
       'name': 'Upper Beaches',
       'location': {'lat': 43.68056321147582,
        'lng': -79.2928688743688,
        'labeledLatLngs': [{'label': 'display',
          'lat': 43.68056321147582,
          'lng': -79.2928688743688}],
        'distance': 468,
        'cc': 'CA',
        'city': 'Toronto',
        'state': 'ON',
        'country': 'Canada',
        'formattedAddress': ['Toronto ON', 'Canada']},
       'categories': [{'id': '4f2a25ac4b909258e854f55f',
         'name': 'Neighborhood',
         'pluralName': 'Neighborhoods',
         'shortName': 'Neighborhood',
         'icon': {'prefix': 'https://ss3.4sqi.net/img/categories_v2/parks_outdoors/neighborhood_',
          'suffix': '.png'},
         'primary': True}],
       'photos': {'count': 0, 'groups': []}},
      'referralId': 'e-0-4df91c4bae60f95f82229ad5-3'}]}]}}

I'm not sure how to proceed. The below image is what is mentioned in the link but:

  1. I don't understand why they've created a function get_category_row?
  2. Why are we writing venues = results['response']['groups'][0]['items']? Isn't json_normalize() supposed to convert a json file to a datframe? So why cant we directly do json_normalize(results)?

enter image description here

I'm pretty much lost from section 4.6 onwards in the link.

if anyone could help me out or guide me that would be amazing! :)


Solution

  • json_normalize will only flatten the records in one path, for example in your json, you can flatten each path separately:

    meta
    response -> suggestedBounds
    response -> groups -> items
    

    And then you'd have to merge them together

    df1 = pd.json_normalize(d['response'], record_path=['groups', 'items'], meta=[])
    print(df1)
    df2 = pd.json_normalize(d['response'])
    print(df2)
    df3 = pd.json_normalize(d['meta'])
    print(df3)
    
    
                           referralId  reasons.count  ... venue.location.postalCode venue.venuePage.id
    0  e-0-4bd461bc77b29c74a07d9282-0              0  ...                       NaN                NaN
    1  e-0-4ad4c062f964a52011f820e3-1              0  ...                   M4E 0B8           75150878
    2  e-0-4b8daea1f964a520480833e3-2              0  ...                   M4E 1R4                NaN
    3  e-0-4df91c4bae60f95f82229ad5-3              0  ...                       NaN                NaN
    
    [4 rows x 21 columns]
      headerLocation    headerFullLocation headerLocationGranularity  ...  suggestedBounds.ne.lng suggestedBounds.sw.lat  suggestedBounds.sw.lng
    0    The Beaches  The Beaches, Toronto              neighborhood  ...              -79.286821              43.671857              -79.299241
    
    [1 rows x 9 columns]
       code                 requestId
    0   200  5eda4fb9aba297001b2f6207
    

    If you want to flatten the full json, you can try flatten_json. Documentation: Flatten JSON