Most Pythonic method of conditionally extracting data from multiple lists of dictionaries


I'm trying to use two lists of dictionaries to build an object. The dictionaries are built from TAP connections to two different databases. Because of the data sources, I cannot guarantee that any given dictionary will contain the information I need, so I've chosen one list as the primary source, and if the information is not in there I extract it from the second one.

Because I'm pulling the data from two different sources, the TAP field names differ between them, so I can't just do an intersection of the dictionaries.

At the moment I can get it to work, but I'm not happy with the solution:

for result in eeuresults:
    name=result['target_name']
    axes=result['semi_major_axis']
    period=result['period']
    radius=result['radius']

    if math.isnan(period):
        for r in eparesults:
            if r['pl_name'] == name: period=r['pl_orbper']
    if math.isnan(axes):
        for r in eparesults:
            if r['pl_name'] == name: axes=r['pl_orbsmax']        
    if math.isnan(radius):
        for r in eparesults:
            if r['pl_name'] == name: radius=r['pl_radj']

I've tried using dict.get() to make it simpler, but it fails if there is no matching entry in the second list of dictionaries:

axes=result.get('semi_major_axis',[r['pl_orbsmax'] for r in eparesults if r['pl_name']==name][0])
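For what it's worth, there are two reasons a `.get()` one-liner like this falls down. The default argument is evaluated before `.get()` is called, so the `[...][0]` runs for every row and raises `IndexError` whenever the comprehension finds no match. And `.get()` only returns the default when the key is missing, not when the key is present with a NaN value. A minimal sketch of an alternative, assuming the rows behave like plain dictionaries: `next()` with a default replaces the `[0]`. The helper name `first_match` and the sample data are made up for illustration.

```python
import math


def first_match(rows, name, key, default=float("nan")):
    """Return row[key] for the first row whose 'pl_name' equals name, else default."""
    return next((r[key] for r in rows if r["pl_name"] == name), default)


# Hypothetical sample data standing in for the two TAP result sets.
eparesults = [{"pl_name": "Example-1 b", "pl_orbsmax": 0.05, "pl_orbper": 3.5}]
result = {"target_name": "Example-1 b", "semi_major_axis": float("nan")}

axes = result["semi_major_axis"]
if math.isnan(axes):
    # Only consult the second source when the primary value is NaN.
    axes = first_match(eparesults, result["target_name"], "pl_orbsmax")
```

If no row matches, `first_match` returns NaN instead of raising, so the caller can decide what to do with a value that is missing from both sources.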

Edit: Solution Chosen

I ended up using rioV8's solution to reduce code duplication. It's not the most efficient solution, but it is elegant, readable and expandable, which is what I really wanted. The full function is below:

import math

import pyvo

# Planet is my own class (not shown here).

def getsystemdata(name, epaname=None):
    def search_epa(name, value, key):
        # Fall back to the eparesults value only when the primary value is NaN.
        if math.isnan(value):
            for r in eparesults:
                if r['pl_name'] == name:
                    return r[key]
        return value

    if epaname is None:
        epaname = name
    service = pyvo.dal.TAPService("https://exoplanetarchive.ipac.caltech.edu/TAP")
    eparesults = service.search(f"select pl_name, pl_radj, pl_orbper, pl_orbsmax from pscomppars where hostname = '{epaname}' ")

    service = pyvo.dal.TAPService("http://voparis-tap-planeto.obspm.fr/tap")
    eeuresults = service.search(f"select target_name, radius, period, semi_major_axis from exoplanet.epn_core where star_name = '{name}'")

    planets = []
    for result in eeuresults:
        name = result['target_name']
        axes = search_epa(name, result['semi_major_axis'], 'pl_orbsmax')
        period = search_epa(name, result['period'], 'pl_orbper')
        radius = search_epa(name, result['radius'], 'pl_radj')

        planets.append(Planet(name.split(' ')[-1], axes, period, radius))

    return planets

Solution

  • Remove the code duplication by moving the fallback lookup into a helper function:

    def find_in_eparesults(name, value, epa_key):
        # Only consult eparesults when the primary value is NaN.
        if math.isnan(value):
            for r in eparesults:
                if r['pl_name'] == name:
                    value = r[epa_key]
                    break
        return value

    for result in eeuresults:
        name = result['target_name']
        axes = find_in_eparesults(name, result['semi_major_axis'], 'pl_orbsmax')
        period = find_in_eparesults(name, result['period'], 'pl_orbper')
        radius = find_in_eparesults(name, result['radius'], 'pl_radj')
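If efficiency ever matters, one possible variation is to index the second result set by planet name once, so each fallback is a dictionary lookup instead of a scan of the whole list. The sketch below assumes the rows can be treated as plain dictionaries. The names `FIELD_MAP` and `merge_with_fallback` and the sample data are hypothetical; the field pairs come from the question.

```python
import math

# Primary field name -> fallback field name (pairs taken from the question).
FIELD_MAP = {
    "semi_major_axis": "pl_orbsmax",
    "period": "pl_orbper",
    "radius": "pl_radj",
}


def merge_with_fallback(primary_rows, fallback_rows):
    """Fill NaN values in primary rows from fallback rows with the same planet name."""
    # Build the index once; setdefault keeps the first row for each name,
    # matching the first-match behaviour of the loop-based helper.
    by_name = {}
    for row in fallback_rows:
        by_name.setdefault(row["pl_name"], row)

    merged = []
    for row in primary_rows:
        name = row["target_name"]
        fallback = by_name.get(name)
        values = {"name": name}
        for field, fallback_field in FIELD_MAP.items():
            value = row[field]
            if math.isnan(value) and fallback is not None:
                value = fallback[fallback_field]
            values[field] = value
        merged.append(values)
    return merged


# Hypothetical sample data standing in for the two TAP result sets.
nan = float("nan")
eeuresults = [
    {"target_name": "Example-1 b", "semi_major_axis": nan, "period": 3.5, "radius": nan},
    {"target_name": "Example-1 c", "semi_major_axis": 0.2, "period": nan, "radius": 1.1},
]
eparesults = [
    {"pl_name": "Example-1 b", "pl_orbsmax": 0.05, "pl_orbper": 3.4, "pl_radj": 0.9},
]
merged = merge_with_fallback(eeuresults, eparesults)
```

Keeping the field pairs in one mapping also means a new field is a one-line change, and a value that is NaN in both sources simply stays NaN.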