python json flickr

Nested Data into List


import json

# Open the file and parse its JSON contents into a Python object
with open("sample_diction.json", "r") as f:
    sample_photo_rep = json.load(f)
print(sample_photo_rep)

The code above opens the file sample_diction.json and loads its contents as a Python object into the variable sample_photo_rep. Next, I want to write code that accesses the nested data inside sample_photo_rep to build a list of all of the tags of that photo (the data in "sample_diction.json"), and then save that list in a variable called sample_tags_list. Below is the data contained in sample_diction.json. I am just not sure where to begin. Any help would be appreciated.

{
  "photo": {
    "people": {
      "haspeople": 0
    }, 
    "dateuploaded": "1467709435", 
    "owner": {
      "username": "Ansel Adams",
      "realname": "", 
      "nsid": "48093195@N03", 
      "iconserver": "7332", 
      "location": "", 
      "path_alias": null, 
      "iconfarm": 8
    }, 
    "publiceditability": {
      "canaddmeta": 1, 
      "cancomment": 1
    }, 
    "id": "27820301400", 
    "title": {
      "_content": "Photo1"
    }, 
    "media": "photo", 
    "tags": {
      "tag": [
        {
          "machine_tag": false, 
          "_content": "nature",
          "author": "48093195@N03", 
          "raw": "Nature",
          "authorname": "ac | photo albums", 
          "id": "48070141-27820301400-5470"
        }, 
        {
          "machine_tag": false, 
          "_content": "mist",
          "author": "48093195@N03", 
          "raw": "Mist",
          "authorname": "ac | photo albums", 
          "id": "48070141-27820301400-852"
        }, 
        {
          "machine_tag": false, 
          "_content": "mountain",
          "author": "48093195@N03", 
          "raw": "Mountain",
          "authorname": "ac | photo albums", 
          "id": "48070141-27820301400-1695"
        }
      ]
    }, 
    "comments": {
      "_content": "0"
    }, 
    "secret": "c86034becf", 
    "usage": {
      "canblog": 0, 
      "canshare": 1, 
      "candownload": 0, 
      "canprint": 0
    }, 
    "description": {
      "_content": ""
    }, 
    "isfavorite": 0, 
    "views": "4", 
    "farm": 8, 
    "visibility": {
      "isfriend": 0, 
      "isfamily": 0, 
      "ispublic": 1
    }, 
    "rotation": 0, 
    "dates": {
      "taken": "2016-07-05 11:03:52", 
      "takenunknown": "1", 
      "takengranularity": 0, 
      "lastupdate": "1467709679", 
      "posted": "1467709435"
    }, 
    "license": "0", 
    "notes": {
      "note": []
    }, 
    "server": "7499", 
    "safety_level": "0", 
    "urls": {
      "url": [
        {
          "type": "photopage", 
          "_content": "https://www.flickr.com/photos/48093195@N03/27820301400/"
        }
      ]
    }, 
    "editability": {
      "canaddmeta": 0, 
      "cancomment": 0
    }
  }, 
  "stat": "ok"
}

Solution

  • The nested data might seem daunting at first, but it's pretty easy if you go step by step.

    You begin with sample_photo_rep. It's a dict with two keys, "photo" and "stat", and everything about the photo sits under "photo".

    So:

    sample_photo_rep["photo"]
    

    It's another dict. The information you want is under the "tags" key:

    sample_photo_rep["photo"]["tags"]
    

    Yet another dict, with a single key, "tag". So:

    sample_photo_rep["photo"]["tags"]["tag"]
    

    This time, it's a list, so you can iterate over it:

    for tag in sample_photo_rep["photo"]["tags"]["tag"]:
        print(tag)
    

    It outputs:

    {'machine_tag': False, '_content': 'nature', 'author': '48093195@N03', 'raw': 'Nature', 'authorname': 'ac | photo albums', 'id': '48070141-27820301400-5470'}
    {'machine_tag': False, '_content': 'mist', 'author': '48093195@N03', 'raw': 'Mist', 'authorname': 'ac | photo albums', 'id': '48070141-27820301400-852'}
    {'machine_tag': False, '_content': 'mountain', 'author': '48093195@N03', 'raw': 'Mountain', 'authorname': 'ac | photo albums', 'id': '48070141-27820301400-1695'}
    

    Those are all dicts. You might only be interested in the "raw" key, so:

    for tag in sample_photo_rep["photo"]["tags"]["tag"]:
        print(tag['raw'])
    # Nature
    # Mist
    # Mountain
    

    Done!

    If you ever get an error (e.g. KeyError: 'row'), go back one step and look at the data, possibly with type(obj), to see whether it's a list or a dict.
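
To make that "go back one step" check concrete, here is a minimal sketch. The describe function is a hypothetical helper (not part of json or any library), and the inline JSON string is a cut-down stand-in for the real file, so treat it as an illustration rather than the one right way to debug this.

```python
import json

# Cut-down stand-in for the parsed contents of sample_diction.json (assumption)
sample_photo_rep = json.loads(
    '{"photo": {"tags": {"tag": [{"raw": "Nature"}]}}, "stat": "ok"}'
)

def describe(obj):
    """Hypothetical helper: summarize one level of nested JSON data."""
    if isinstance(obj, dict):
        # A dict is indexed by key, so list the keys that are available
        return "dict with keys %s" % sorted(obj.keys())
    if isinstance(obj, list):
        # A list is indexed by position or iterated over
        return "list of %d item(s)" % len(obj)
    return "%s value %r" % (type(obj).__name__, obj)

# Inspect each level before going one step deeper
print(describe(sample_photo_rep))
print(describe(sample_photo_rep["photo"]))
print(describe(sample_photo_rep["photo"]["tags"]))
print(describe(sample_photo_rep["photo"]["tags"]["tag"]))
```

Each call tells you whether the next step should be a key lookup (dict) or a loop (list), which is usually enough to spot a misspelled key.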

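    UPDATE: To get [u'nature', u'mist', u'mountain'], take the "_content" value of each tag dict and collect the results in sample_tags_list:

    sample_tags_list = [tag["_content"] for tag in sample_photo_rep["photo"]["tags"]["tag"]]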
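
Putting the steps together, the following is a self-contained sketch that builds sample_tags_list from the "_content" field of each tag. It assumes a trimmed-down copy of the JSON embedded as a string (the sample_json name is made up for this example) so it runs without the file; with the real file you would load sample_photo_rep from sample_diction.json instead.

```python
import json

# Trimmed-down stand-in for sample_diction.json (assumption: only the
# fields needed here are kept; the real file has many more).
sample_json = """
{
  "photo": {
    "tags": {
      "tag": [
        {"_content": "nature", "raw": "Nature"},
        {"_content": "mist", "raw": "Mist"},
        {"_content": "mountain", "raw": "Mountain"}
      ]
    }
  },
  "stat": "ok"
}
"""

sample_photo_rep = json.loads(sample_json)

# Walk photo -> tags -> tag (a list), then pull one field out of each tag dict
sample_tags_list = [
    tag["_content"] for tag in sample_photo_rep["photo"]["tags"]["tag"]
]
print(sample_tags_list)
```

On Python 3 this prints the three lowercase tag names as plain strings; on Python 2 the same values are unicode strings and show a u prefix. Swapping "_content" for "raw" gives the capitalized forms instead.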