Second or last but one

Walking through the code interviews of IT international companies I run into interesting problem.

How many comparisons do we have to make to figure out what element out of six is the second smallest or the second largest.

None of these six elements have the same value.

we have main function with six arguments (v1, ..., v6)

 def solve(v1, v2, v3, v4, v5, v6):
     # the worst case -> 9 comparisons
     if isLarger(v1, v2):
         v1, v2 = v2, v1
     if isLarger(v1, v3):
         v1, v3 = v3, v1
     if isLarger(v1, v4):
         v1, v4 = v4, v1
     if isLarger(v1, v5):
         v1, v5 = v5, v1
     if isLarger(v1, v6):
         v1, v6 = v6, v1
     if isLarger(v2, v3):
         v2, v3 = v3, v2
     if isLarger(v2, v4):
         v2, v4 = v4, v2
     if isLarger(v2, v5):
         v2, v5 = v5, v2
     if isLarger(v2, v6):
         v2, v6 = v6, v2
     print(f"#comparisons = {CntComparisons}")
     return v2

which returns the second smallest or the second largest value.

Determine this value by comparison (i.e. it cannot use indexing by that value).

For pairwise comparison we can use only the below function

CntComparisons = 0
def isLarger(v1, v2):
    global CntComparisons
    CntComparisons += 1
    return v1 > v2

Values are compared only by calling the comparison function isLarger(v1, v2).

The goal is to find an algorithm that requires (even in the worst case) as few comparisons as possible!

Any ideas or hint how to solve this?

Solution

A trivial, but not optimal way is using the first two passes of bubble sort: this will swap pairs of variables and so bubble the 2 greatest values to the "right" and assign them to v5 and v6, and so v5 will be returned as correct answer. This requires 9 comparisons: 5 in the first pass, 4 in the second. So we have an upper bound of 9 comparisons.

A lower bound is 5, since that is the number of comparisons needed to find either the minimum or the maximum, and that will be needed to be sure a value is the second least or second greatest.

Here is an idea:

Perform 2 to 3 comparisons to sort v1, v2, v3 through swaps from least to greatest. We then know that v2 is not the least nor the greatest.
Perform 3 comparisons between v2 and the last 3 values (v3, v4 and v5).
If these comparisons all give the same boolean result, it means that v2 is indeed the answer.
If among those boolean results there is only one True (i.e. v2 is only greater than one of them), then let vx be the one among v3, v4 or v5 for which v2 is greater than vx. Return the greatest of v1 and vx: this needs another, 7^th comparison.
If among those boolean results there is only one False (i.e. v2 is greater than two of them), then let vx be the one among v3, v4 or v5 for which v2 is not greater than vx. Return the least of v3 and vx: this needs another, 7^th comparison.

So in the worst case we get the result with 7 comparisons. The best case needs 5 comparisons (2 for the sorting step, and c4, c5 and c6).

Here is an implementation of this idea:

def solve(v1, v2, v3, v4, v5, v6):
    # Sort (v1, v2, v3)
    if isLarger(v1, v2):
        v1, v2 = v2, v1
    if isLarger(v2, v3):
        v2, v3 = v3, v2
        if isLarger(v1, v2):
            v1, v2 = v2, v1
    # v2 is now certainly not the greatest nor the least
    # Check how it compares with v4, v5 and v6
    c4 = isLarger(v2, v4)
    c5 = isLarger(v2, v5)
    c6 = isLarger(v2, v6)
    if c4 == c5 == c6:  # All true or all false
        return v2
    if c4 ^ c5 ^ c6:  # Only one of them is true
        vx = v4 if c4 else (v5 if c5 else v6)
        return v1 if isLarger(v1, vx) else vx
    else:  # Only one of them is false
        vx = v4 if not c4 else (v5 if not c5 else v6)
        return vx if isLarger(v3, vx) else v3

I ran this on all permutations of [1,2,3,4,5,6] and these are the results:

5 comparisons needed for 96 permuations
6 comparisons needed for 336 permutations
7 comparisons needed for 288 permutations

On average: 6.266667 comparisons