
Parse nested JSON from the Foursquare API


I'm trying to parse a JSON response from Foursquare, but I can't figure out how to handle the way it's nested. Here's a copy of the entire JSON.

Here's a snippet of the JSON:

{
"meta": {
    "code": 200,
    "requestId": "58cab8bc4434b959e2f68a69"
},
"response": {
    "categories": [
        {
            "categories": [
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "56aa371be4b08b9a8d5734db",
                    "name": "Amphitheater",
                    "pluralName": "Amphitheaters",
                    "shortName": "Amphitheater"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/aquarium_",
                        "suffix": ".png"
                    },
                    "id": "4fceea171983d5d06c3e9823",
                    "name": "Aquarium",
                    "pluralName": "Aquariums",
                    "shortName": "Aquarium"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/arcade_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d1e1931735",
                    "name": "Arcade",
                    "pluralName": "Arcades",
                    "shortName": "Arcade"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/artgallery_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d1e2931735",
                    "name": "Art Gallery",
                    "pluralName": "Art Galleries",
                    "shortName": "Art Gallery"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/bowling_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d1e4931735",
                    "name": "Bowling Alley",
                    "pluralName": "Bowling Alleys",
                    "shortName": "Bowling Alley"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/casino_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d17c941735",
                    "name": "Casino",
                    "pluralName": "Casinos",
                    "shortName": "Casino"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79e7",
                    "name": "Circus",
                    "pluralName": "Circuses",
                    "shortName": "Circus"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/comedyclub_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d18e941735",
                    "name": "Comedy Club",
                    "pluralName": "Comedy Clubs",
                    "shortName": "Comedy Club"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/musicvenue_",
                        "suffix": ".png"
                    },
                    "id": "5032792091d4c4b30a586d5c",
                    "name": "Concert Hall",
                    "pluralName": "Concert Halls",
                    "shortName": "Concert Hall"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/performingarts_dancestudio_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79ef",
                    "name": "Country Dance Club",
                    "pluralName": "Country Dance Clubs",
                    "shortName": "Country Dance Club"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79e8",
                    "name": "Disc Golf",
                    "pluralName": "Disc Golf Courses",
                    "shortName": "Disc Golf"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "56aa371be4b08b9a8d573532",
                    "name": "Exhibit",
                    "pluralName": "Exhibits",
                    "shortName": "Exhibit"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d1f1931735",
                    "name": "General Entertainment",
                    "pluralName": "General Entertainment",
                    "shortName": "Entertainment"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/racetrack_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79ea",
                    "name": "Go Kart Track",
                    "pluralName": "Go Kart Tracks",
                    "shortName": "Go Kart"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/historicsite_",
                        "suffix": ".png"
                    },
                    "id": "4deefb944765f83613cdba6e",
                    "name": "Historic Site",
                    "pluralName": "Historic Sites",
                    "shortName": "Historic Site"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/nightlife/karaoke_",
                        "suffix": ".png"
                    },
                    "id": "5744ccdfe4b0c0459246b4bb",
                    "name": "Karaoke Box",
                    "pluralName": "Karaoke Boxes",
                    "shortName": "Karaoke"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79e6",
                    "name": "Laser Tag",
                    "pluralName": "Laser Tag Places",
                    "shortName": "Laser Tag"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/historicsite_",
                        "suffix": ".png"
                    },
                    "id": "5642206c498e4bfca532186c",
                    "name": "Memorial Site",
                    "pluralName": "Memorial Sites",
                    "shortName": "Memorial Site"
                },
                {
                    "categories": [],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/parks_outdoors/golfcourse_",
                        "suffix": ".png"
                    },
                    "id": "52e81612bcbc57f1066b79eb",
                    "name": "Mini Golf",
                    "pluralName": "Mini Golf Courses",
                    "shortName": "Mini Golf"
                },
                {
                    "categories": [
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
                                "suffix": ".png"
                            },
                            "id": "56aa371be4b08b9a8d5734de",
                            "name": "Drive-in Theater",
                            "pluralName": "Drive-in Theaters",
                            "shortName": "Drive-in Theater"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d17e941735",
                            "name": "Indie Movie Theater",
                            "pluralName": "Indie Movie Theaters",
                            "shortName": "Indie Movies"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d180941735",
                            "name": "Multiplex",
                            "pluralName": "Multiplexes",
                            "shortName": "Cineplex"
                        }
                    ],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/movietheater_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d17f941735",
                    "name": "Movie Theater",
                    "pluralName": "Movie Theaters",
                    "shortName": "Movie Theater"
                },
                {
                    "categories": [
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_art_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d18f941735",
                            "name": "Art Museum",
                            "pluralName": "Art Museums",
                            "shortName": "Art Museum"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/nightlife/stripclub_",
                                "suffix": ".png"
                            },
                            "id": "559acbe0498e472f1a53fa23",
                            "name": "Erotic Museum",
                            "pluralName": "Erotic Museums",
                            "shortName": "Erotic Museum"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_history_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d190941735",
                            "name": "History Museum",
                            "pluralName": "History Museums",
                            "shortName": "History Museum"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_planetarium_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d192941735",
                            "name": "Planetarium",
                            "pluralName": "Planetariums",
                            "shortName": "Planetarium"
                        },
                        {
                            "categories": [],
                            "icon": {
                                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_science_",
                                "suffix": ".png"
                            },
                            "id": "4bf58dd8d48988d191941735",
                            "name": "Science Museum",
                            "pluralName": "Science Museums",
                            "shortName": "Science Museum"
                        }
                    ],
                    "icon": {
                        "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/museum_",
                        "suffix": ".png"
                    },
                    "id": "4bf58dd8d48988d181941735",
                    "name": "Museum",
                    "pluralName": "Museums",
                    "shortName": "Museum"
                },
            "icon": {
                "prefix": "https://ss3.4sqi.net/img/categories_v2/arts_entertainment/default_",
                "suffix": ".png"
            },
            "id": "4d4b7104d754a06370d81259",
            "name": "Arts & Entertainment",
            "pluralName": "Arts & Entertainment",
            "shortName": "Arts & Entertainment"
        },

My code pulls only the first level of the hierarchy. In the JSON, each top-level category's own fields are always listed below its sub-categories.

import urllib.request
import json 
import sqlite3
from key import ID, SECRET

CLIENT_ID = ID
CLIENT_SECRET = SECRET
v = '20170315'

url = 'https://api.foursquare.com/v2/venues/categories?client_id=' + CLIENT_ID + '&client_secret=' + CLIENT_SECRET + '&v=' + v

contents = urllib.request.urlopen(url).read()

parsed = json.loads(contents)


clean = parsed['response']['categories']
my_list = [i['name'] for i in clean]
print(my_list)

Output:

['Arts & Entertainment', 'College & University', 'Event', 'Food', 'Nightlife Spot', 'Outdoors & Recreation', 'Professional & Other Places', 'Residence', 'Shop & Service', 'Travel & Transport']

I'm having trouble parsing to get the sub-categories. I'm trying to pull id and name for all categories, sub or not.


Solution

  • If a data structure is recursively nested, a recursive function is often the easiest way to parse it:

    def get_categories(data):
        result = {}
        for cat in data:
            result[cat['id']] = cat['name']
            if cat['categories']:
                result.update(get_categories(cat['categories']))
        return result
    

    This returns a dictionary of id: name key/value pairs, recursively calling itself and updating result with any subcategories it finds along the way.

    The if check is not strictly necessary, since calling the function with an empty list would simply return an empty dictionary. It does, however, skip one pointless recursive call for every category that has no subcategories, and most categories in this response have none.

    Here's how you'd use it:

    categories = get_categories(parsed['response']['categories'])
    

    … and here's the result:

    >>> from pprint import pprint
    >>> pprint(categories)
    {'4bf58dd8d48988d100941735': 'Meeting Room',
     '4bf58dd8d48988d100951735': 'Pet Store',
     '4bf58dd8d48988d101941735': 'Martial Arts Dojo',
       # ...
     '57558b36e4b065ecebd306da': 'Savoyard Restaurant',
     '57558b36e4b065ecebd306dd': 'Truck Stop',
     '589ddde98ae3635c072819ee': 'Duty-free Shop'}
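
To try the function without calling the API, here is a minimal self-contained sketch. The `sample` list is a hand-trimmed stand-in for `parsed['response']['categories']`, built from a few entries in the question's snippet; it is an assumption for illustration, not the full response.

```python
# Hand-trimmed stand-in for parsed['response']['categories'] (illustrative only).
sample = [
    {
        "id": "4d4b7104d754a06370d81259",
        "name": "Arts & Entertainment",
        "categories": [
            {"id": "56aa371be4b08b9a8d5734db", "name": "Amphitheater", "categories": []},
            {
                "id": "4bf58dd8d48988d17f941735",
                "name": "Movie Theater",
                "categories": [
                    {"id": "4bf58dd8d48988d180941735", "name": "Multiplex", "categories": []},
                ],
            },
        ],
    },
]


def get_categories(data):
    """Return a flat {id: name} dict for every category at every depth."""
    result = {}
    for cat in data:
        result[cat['id']] = cat['name']
        # Only recurse when there are subcategories to visit.
        if cat['categories']:
            result.update(get_categories(cat['categories']))
    return result


categories = get_categories(sample)
for cat_id, name in categories.items():
    print(cat_id, name)
```

Each parent is added before its children, so on Python 3.7+ (where dicts keep insertion order) the loop prints a category first and its subcategories right after it.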