Search code examples
pythonarraysalgorithmsub-array

find minimum-length subarray that has all numbers


File input.txt consists of two lines: first has integer number N space then integer number K (1 ≤ N,K ≤ 250000). Second has N space-delimeted integers, where each integer is less than or equal to K. It is guaranteed that each integer from 1 to K is in the array. The task is to find subarray of minimum length, that contains all integers. And print its start and end. Note, that indexing starts from 1.

Examples:

Input         Output
5 3           2 4
1 2 1 3 2

6 4           2 6
2 4 2 3 3 1  

I had this task in a recent programming competition. It is over, and I am not cheating. I've implemented it using python 3:

with open('input.txt') as file:
    N, K = [int(x) for x in file.readline().split()]
    alley = [int(x) for x in file.readline().split()]

trees = {}
min_idx = (1, N)
min_length = N
for i in range(N):
    trees[alley[i]] = i
    if len(trees) == K:
        idx = (min(trees.values())+1, max(trees.values())+1)
        length = idx[1] - idx[0] + 1
        if length < min_length:
            min_idx = idx
            min_length = length
        if min_length == K:
            break


print (str(min_idx[0]) + " " + str(min_idx[1]))

The idea is to save last position of i-th tree into a dictionary and if dictionary contains all items, check if this subarray is minimum.

16th test showed that my algorithm exceeded time limit, which was 1 second. I think, that my algorithm is O(N), because it finishes in one run across array, and map access costs O(1).

How can one speed up this algorithm? Can be complexity reduced or is it my misunderstanding of some Python which takes much time?


Solution

  • Your algorithm is good but ignoring the time that len(trees) < K, it's O(NK) because every call to min and max is O(K). There's no need to call max because max(trees.values()) == i. Dealing with min is trickier, but if you keep track of which key corresponds to the minimum index then you can recalculate it only when that key is updated.

    A minor point is that your last if doesn't always need to be checked.

    Overall:

    trees = {}
    min_idx = (1, N)
    min_length = N
    first_index = -1
    for i in range(N):
        trees[alley[i]] = i
        if len(trees) == K:
            if first_index == -1 or alley[first_index] == alley[i]:
                first_index = min(trees.values())
            idx = (first_index+1, i+1)
            length = idx[1] - idx[0] + 1
            if length < min_length:
                min_idx = idx
                min_length = length
                if min_length == K:
                    break