I had this test earlier today, and I tried to be too clever and hit a road block. Unfortunately I got stuck in this mental rut and wasted too much time, failing this portion of the test. I solved it afterward, but maybe y'all can help me get out of the initial rut I was in.
Problem definition:
An unordered and non-unique sequence A consisting of N integers (all positive) is given. A subsequence of A is any sequence obtained by removing none, some or all elements from A. The amplitude of a sequence is the difference between the largest and the smallest element in this sequence. The amplitude of the empty subsequence is assumed to be 0.
For example, consider the sequence A consisting of six elements such that:
A[0] = 1
A[1] = 7
A[2] = 6
A[3] = 2
A[4] = 6
A[5] = 4
A subsequence of array A is called quasi-constant if its amplitude does not exceed 1. In the example above, the subsequences [1,2], [6,6], and [6,6,7] are quasi-constant. Subsequence [6, 6, 7] is the longest possible quasi-constant subsequence of A.
Now, find a solution that, given a non-empty zero-indexed array A consisting of N integers, returns the length of the longest quasi-constant subsequence of array A. For example, given sequence A outlined above, the function should return 3, as explained.
Now, I solved this in python 3.6 after the fact using a sort-based method with no classes (my code is below), but I didn't initially want to do that as sorting on large lists can be very slow. It seemed this should have a relatively simple formulation as a breadth-first tree-based class, but I couldn't get it right. Any thoughts on this?
My class-less sort-based solution:
def amp(sub_list):
if len(sub_list) <2:
return 0
else:
return max(sub_list) - min(sub_list)
def solution(A):
A.sort()
longest = 0
idxStart = 0
idxEnd = idxStart + 1
while idxEnd <= len(A):
tmp = A[idxStart:idxEnd]
if amp(tmp) < 2:
idxEnd += 1
if len(tmp) > longest:
longest = len(tmp)
else:
idxStart = idxEnd
idxEnd = idxStart + 1
return longest
As Andrey Tyukin pointed out, you can solve this problem in O(n)
time, which is better than the O(n log n)
time you'd likely get from either sorting or any kind of tree based solution. The trick is to use dictionaries to count the number of occurrences of each number in the input, and use the count to figure out the longest subsequence.
I had a similar idea to him, but I had though of a slightly different implementation. After a little testing, it looks like my approach is a quite a bit faster, so I'm posting it as my own answer. It's quite short!
from collections import Counter
def solution(seq):
if not seq: # special case for empty input sequence
return 0
counts = Counter(seq)
return max(counts[x] + counts[x+1] for x in counts)
I suspect this is faster than Andrey's solution because the running time for both of our solutions really take O(n) + O(k)
time where k
is the number of distinct values in the input (and n
is the total number of values in the input). My code handles the O(n)
part very efficiently by handing off the sequence to the Counter
constructor, which is implemented in C. It is likely to be a bit slower (on a per-item basis) to deal with the O(k)
part, since it needs a generator expression. Andrey's code does the reverse (it runs slower Python code for the O(n)
part, and uses faster builtin C functions for the O(k)
part). Since k
is always less than or equal to n
(perhaps a lot less if the sequence has a lot of repeated values), my code is faster overall. Both solutions are still O(n)
though, and both should be much better than sorting for large inputs.