Imagine we have a finite number of districts with a finite number of houses in each of them. Each house has a number and the houses in each district are numbered from 1. One man and one woman live in each house.
We have the following class for representing people:
class Person:
    def __init__(self, name, age, district, house_number):
        self.name = name
        self.age = age
        self.district = district
        self.house_number = house_number
And we have two lists of objects of this class, called men and women. To show the structure of the lists, here is an example of adding an object to one of them:
men.append(Person("Alex", 22, "District 7", 71))
Assume the lists are already filled with objects, so all men are in the men list and all women are in the women list. Since there is a finite number of districts, a finite number of houses in each of them, and each house has one man and one woman, the two lists have equal length. The objects in both lists are in random order.
It is assumed that the amount of data is very large.
The goal is to find all men over a certain age (variable min_age) in the men list and to match each of them with the woman from the women list who lives in the same house.
All men found must go into the men_new list and the matching women into the women_new list. The lists must be aligned, so a man and a woman living in the same house must have the same index in men_new and women_new.
I now have the following solution:
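# Assume the lists "men" and "women" and the variable "min_age" are already defined.
men_new = []
women_new = []
for man in men:
    if man.age > min_age:
        men_new.append(man)
for man in men_new:
    # Linear scan of the whole women list for every selected man.
    # next() takes the single matching woman; appending filter(...) itself
    # would store a filter object (or a list in Python 2), not a Person.
    women_new.append(next(
        w for w in women
        if w.district == man.district and w.house_number == man.house_number
    ))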
This solution works great, but it is very slow with large amounts of data. Are there any ways to solve this problem faster? Thanks in advance!
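Transform your women list into a dict mapping each house to the woman who lives there. Houses are numbered from 1 in every district, so the house number alone is not unique; use the pair (district, house_number) as the key:
house_to_woman = {}
for w in women:
    house_to_woman[(w.district, w.house_number)] = w
Then you can make the last loop of your code efficient using this mapping. Each lookup takes constant time on average, instead of a scan over the whole women list:
for m in men_new:
    women_new.append(house_to_woman[(m.district, m.house_number)])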
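Putting the pieces together, here is a minimal self-contained sketch of the dict-based approach. It assumes the lookup key is the pair (district, house_number), since house numbers restart at 1 in every district. The function name match_couples and the sample people are hypothetical and only there for illustration.

```python
class Person:
    def __init__(self, name, age, district, house_number):
        self.name = name
        self.age = age
        self.district = district
        self.house_number = house_number


def match_couples(men, women, min_age):
    """Return (men_new, women_new), aligned by index (hypothetical helper)."""
    # One pass over women builds the lookup table keyed by house.
    # Assumption: (district, house_number) identifies a house uniquely.
    house_to_woman = {(w.district, w.house_number): w for w in women}

    # Same filter as in the question: strictly older than min_age.
    men_new = [m for m in men if m.age > min_age]

    # One average constant-time dict lookup per selected man.
    women_new = [house_to_woman[(m.district, m.house_number)] for m in men_new]
    return men_new, women_new


# Hypothetical sample data; note the same house number in two districts.
men = [
    Person("Alex", 22, "District 7", 71),
    Person("Bob", 35, "District 8", 71),
    Person("Carl", 40, "District 7", 1),
]
women = [
    Person("Dana", 38, "District 7", 1),
    Person("Eve", 21, "District 7", 71),
    Person("Fay", 33, "District 8", 71),
]

men_new, women_new = match_couples(men, women, min_age=30)
print([(m.name, w.name) for m, w in zip(men_new, women_new)])
# → [('Bob', 'Fay'), ('Carl', 'Dana')]
```

Building the dict costs one pass over women and each lookup is constant time on average, so the total work grows roughly linearly with the size of the lists, rather than with len(men_new) * len(women) as in the nested scan.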