python python-3.x list multidimensional-array binary-search

Binary search and return list from a 2D list in Python

I am new to Python programming and I've been trying this question for a long time but I wasn't able to come up with a solution. The question is that I'm given a list of employee IDs and the birth year of the employees and that I'm supposed to return a list of employee IDs that was born in the input year. The following is an example of the list.

lst = [[2, 1987], [4, 1990], [6, 1992]...]

Linear searching requires too much time as there are 1 million employees on the list. Therefore, I think the question requires us to use a binary search. I am supposed to come up with a function that takes the input year and lst to come up with a list that returns the employee IDs. If there are no employees born in the year, it should return an empty list. The following is an example of the function.

input > get_IDs(2012)

output > [102, 204, 199632, 199649]

I understand that there's inbuilt bisect function which I was able to do for 1D arrays but I'm not sure how to do a binary search for 2D arrays. Please advice on what I should do and any suggestions are largely appreciated!

Solution

For binary search to work, you must have a sorted list. Sorting comes at a cost; it represents a O(nlogn) time complexity.

However, this can be done without binary search, with linear time complexity, if you use a dictionary, keyed by year:

from collections import defaultdict 

# preprocessing
dic = defaultdict(list) # silently create an empty list when addressing a year the first time
for id, year in lst:
    dic[year].append(id)

# implementation of the required function 
def get_IDs(year):
    return dic[year]