Search code examples
pythonsortingdouble

Sort by one element in list of lists of fixed size and remove doubles


I would like to find the best way to achieve this. (I already can do it manually but i am looking for a librairy that do this directly).

I want to sort a list of lists based on the first index and remove doubles of first index.

sections = [
    [1, 4, 1],
    [5, 3, 2],  # Sort on the first element of sublist
    [2, 2, 3],
    [2, 1, 4],  # Double to remove
]

assertion = [
    [1, 4, 1],
    [2, 2, 3],
    [5, 3, 2],
]

def sort_first_and_remove_double(sections):
    
    return result


assert sort_first_and_remove_double(sections) == assertion

The goal is to not coding any loop directly but do it in one or two lines using librairies, for optimisation reasons, my datas are huges.

My attempt code is here:

def sort_first_and_remove_double(sections):
    sections = remove_double(sections)
    sections = sort_sections(sections)
    return sections


def sort_sections(sections):
    ids = get_ids(sections)
    return [x for _, x in sorted(zip(ids, sections))]


def remove_double(sections):
    ids = get_ids(sections)
    keeped_sections = []
    for i, id_check in enumerate(ids):
        ids_keeped = get_ids(keeped_sections)
        if id_check not in ids_keeped:
            keeped_sections.append(sections[i])
    return keeped_sections


def get_ids(sections):
    return [section[0] for section in sections]

But it is very slow!

Edit: Here an update of times:

t0 = time.perf_counter()
sort_first_and_remove_double(lst)
print(time.perf_counter() - t0)  # 7.719699351582676e-05 s
t0 = time.perf_counter()
list(dict(map(lambda x: [x[1][0], x[1]], sorted(enumerate(lst), key=lambda pair: (pair[1][0], -pair[0])))).values())
print(time.perf_counter() - t0)  # 7.587004802189767e-06 s
t0 = time.perf_counter()
sorted(dict(map(lambda x: (x[0], x), lst[::-1])).values())
print(time.perf_counter() - t0)  # 3.0700030038133264e-06 s

Solution

  • Are you wont solution like this?

    sorted({x[0]: x for x in sections[::-1]}.values())
    

    or without for

    sorted(dict(map(lambda x: (x[0], x), sections[::-1])).values())