Tags: python, for-loop, nested-loops

How to collect specific values in a deeply nested structure with Python


I'm trying to get a list of instance IDs from the response of the describe_instances call, using the boto3 API in my Python script. For those of you who aren't familiar with AWS, I can post more detailed code, with the specifics removed, if you need it. I'm trying to access an item from a structure like this:

   u'Reservations':[  
      {  
         u'Instances':[  
            {
              u'InstanceId':'i-0000ffffdd'
            },
            {  },   ### each of these dicts contains an id like the one above
            {  },
            {  },
            {  }
         ]
      },
      {  
         u'Instances':[  
            {  },
            {  },
            {  },
            {  },
            {  }
         ]
      },
      {  
         u'Instances':[  
            {  }
         ]         
      }
]

I'm currently accessing it like this:

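instanceLdict = []
instanceList = []
instances = []
# Collect each reservation's list of instance dicts
for r in reservations:
    instanceList.append(r['Instances'])
# Flatten the list of lists into one list of instance dicts
for ilist in instanceList:
    for i in ilist:
        instanceLdict.append(i)
# Pull the ID out of every instance dict
for i in instanceLdict:
    instances.append(i['InstanceId'])  # I need them in a list
print(instances)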

FYI: my reservations variable contains the whole list stored under u'Reservations'.

I feel this is inefficient, and since I'm a Python newbie I suspect there must be a better way to do this than chaining multiple for loops. Is there one? Kindly point me to a structure, method, etc. that might be useful in my scenario.


Solution

  • Your solution is not actually that inefficient, except that you don't need to create all those top-level lists just to collect the instance IDs at the end. You could use a nested loop instead and keep only what you need:

    instances = []
    for r in reservations:
      # r['Instances'] is already a list of instance dicts
      for i in r['Instances']:
        instances.append(i['InstanceId'])  # the only value you are looping for
    

    Yes, there are ways to do this with shorter code, but explicit is better than implicit, so stick to what you can read best. Python handles iteration well, and remember: maintainability first, performance second. Besides, this part is hardly the bottleneck compared with all the API calls, DB lookups, etc. around it.

    But if you really insist on a fancy one-liner, have a look at the itertools helpers; chain.from_iterable() is what you need:

    from itertools import chain
    instances = [i['InstanceId'] for i in chain.from_iterable(r['Instances'] for r in reservations)]
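
To make the comparison concrete, here is a minimal, self-contained sketch that runs the explicit loop, a nested list comprehension, and the chain.from_iterable() version against the same data. The reservations list and its instance IDs are made up for illustration and only mimic the shape shown in the question; the names ids_loop, ids_comp and ids_chain are likewise hypothetical.

```python
from itertools import chain

# Hypothetical sample data shaped like the 'Reservations' list in the question.
reservations = [
    {'Instances': [{'InstanceId': 'i-0000ffffdd'}, {'InstanceId': 'i-0000ffffee'}]},
    {'Instances': [{'InstanceId': 'i-0000ffffaa'}]},
    {'Instances': []},  # a reservation without instances contributes nothing
]

# Explicit nested loop: one loop per level of nesting.
ids_loop = []
for r in reservations:
    for i in r['Instances']:
        ids_loop.append(i['InstanceId'])

# Nested list comprehension: the for clauses appear in the same order as the loops.
ids_comp = [i['InstanceId'] for r in reservations for i in r['Instances']]

# chain.from_iterable() flattens the per-reservation lists into one iterable.
ids_chain = [i['InstanceId']
             for i in chain.from_iterable(r['Instances'] for r in reservations)]

print(ids_loop)  # ['i-0000ffffdd', 'i-0000ffffee', 'i-0000ffffaa']
```

All three produce the same list in the same order, so the choice between them is mostly about which one you find easiest to read.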