Search code examples
pythonlistpython-2.7sortedlist

What's a fast and pythonic/clean way of removing a sorted list from another sorted list in python?


I am creating a fast method of generating a list of primes in the range(0, limit+1). In the function I end up removing all integers in the list named removable from the list named primes. I am looking for a fast and pythonic way of removing the integers, knowing that both lists are always sorted.

I might be wrong, but I believe list.remove(n) iterates over the list comparing each element with n. meaning that the following code runs in O(n^2) time.

# removable and primes are both sorted lists of integers
for composite in removable:
    primes.remove(composite)

Based off my assumption (which could be wrong and please confirm whether or not this is correct) and the fact that both lists are always sorted, I would think that the following code runs faster, since it only loops over the list once for a O(n) time. However, it is not at all pythonic or clean.

i = 0
j = 0
while i < len(primes) and j < len(removable):
    if primes[i] == removable[j]:
        primes = primes[:i] + primes[i+1:]
        j += 1
    else:
        i += 1

Is there perhaps a built in function or simpler way of doing this? And what is the fastest way?

Side notes: I have not actually timed the functions or code above. Also, it doesn't matter if the list removable is changed/destroyed in the process.

For anyone interested the full functions is below:

import math

# returns a list of primes in range(0, limit+1)
def fastPrimeList(limit):
    if limit < 2:
        return list()
    sqrtLimit = int(math.ceil(math.sqrt(limit)))
    primes = [2] + range(3, limit+1, 2)
    index = 1
    while primes[index] <= sqrtLimit:
        removable = list()
        index2 = index
        while primes[index] * primes[index2] <= limit:
            composite = primes[index] * primes[index2]
            removable.append(composite)
            index2 += 1
        for composite in removable:
            primes.remove(composite)
        index += 1
    return primes

Solution

  • This is quite fast and clean, it does O(n) set membership checks, and in amortized time it runs in O(n) (first line is O(n) amortized, second line is O(n * 1) amortized, because a membership check is O(1) amortized):

    removable_set = set(removable)
    primes = [p for p in primes if p not in removable_set]
    

    Here is the modification of your 2nd solution. It does O(n) basic operations (worst case):

    tmp = []
    i = j = 0
    while i < len(primes) and j < len(removable):
        if primes[i] < removable[j]:
            tmp.append(primes[i])
            i += 1
        elif primes[i] == removable[j]:
            i += 1
        else:
            j += 1
    primes[:i] = tmp
    del tmp
    

    Please note that constants also matter. The Python interpreter is quite slow (i.e. with a large constant) to execute Python code. The 2nd solution has lots of Python code, and it can indeed be slower for small practical values of n than the solution with sets, because the set operations are implemented in C, thus they are fast (i.e. with a small constant).

    If you have multiple working solutions, run them on typical input sizes, and measure the time. You may get surprised about their relative speed, often it is not what you would predict.