Search code examples
pythonfor-loopnested-loops

Python - Better way to iterate complex list than nested for loops


I am iterating a complex list in python with this format:

[
  {
    <parent id>: [
      <child id>,
      <child id>
    ],
    <parent id>: [
      <child id>
    ]
  },
  {
    <parent id>: [
      <child id>,
      <child id>,
      <child id>
    ],
    <parent id>: [
      <child id>
    ]
  }
]

The list will have dict as elements. These dict have keys of <parent id> and values of a list of <child id>

There can be the same <parent id> in different dict, but <child id> can only belong to one <parent id>. An example is this:

[
  {
    2: [1, 5],
    3: [3, 7],
    4: [6]
  },
  {
    1: [2, 4, 8],
    4: [6]
  }
]

Parent id 4 is in both dict elements but all child ids are unique to a parent id.

Now I am iterating this data structure as input because I want to make sure that the condition of all childs being unique to a parent id is met. This is my code:

def check_format(self, mapping):
    # mapping is the data structure
    unique_parent_list = []
    child_list = []

    for x in range(0, 2):
        for parent in mapping[x].keys():
            if parent not in unique_parent_list:
                unique_parent_list.append(parent)
                for child in mapping[x][parent]:
                    child_list.append(child)
    if len(child_list) > len(set(child_list)):
        return 'Condition not met'
    else:
        return 'Condition met'

This works, but I do not like that it is O^4 complexity or something like that. Is there a way to simplify or code this for better performance?


Solution

  • You clearly have a mapping relationship from child to parent. The easiest thing I can think of is just to make a dict with children as keys. If you encounter a child that's already in, check the parent value.

    The lookup and insertion happen in constant time (dict keys are effectively a hash set). You can also use short circuiting more effectively since you can stop the moment you find a child with multiple parents:

    def check_format(map_list):
        check = {}
        for parent, children in (i  for d in map_list for i in d.items()):
            for child in children:
                if check.setdefault(child, parent) != parent:
                    return False
        return True
    

    This will iterate exactly once per child and perform a constant-time (ideally) operation on each using dict.setdefault.