Search code examples
algorithmdata-structuresbinary-search

FInd the Single Element in a sorted array where every element occurs twice except one | Data Structures & Algorithms


I am trying to understand the solution of this problem given in this link. It solves the problem in O(logN) time.

// A Binary Search based function to find the element 
// that appears only once 
void search(int *arr, int low, int high) 
{ 
     // Base cases 
    if (low > high) 
       return; 

    if (low==high) 
    { 
        printf("The required element is %d ", arr[low]); 
        return; 
    } 

    // Find the middle point 
    int mid = (low + high) / 2; 

    // If mid is even and element next to mid is 
    // same as mid, then output element lies on 
    // right side, else on left side 
    if (mid%2 == 0) 
    { 
        if (arr[mid] == arr[mid+1]) 
            search(arr, mid+2, high); 
        else
            search(arr, low, mid); 
    } 
    else  // If mid is odd 
    { 
        if (arr[mid] == arr[mid-1]) 
            search(arr, mid+1, high); 
        else
            search(arr, low, mid-1); 
    } 
}

I am able to understand the approach. But I am not able to understand why when mid is even and arr[mid] != arr[mid + 1]. why are we doing high = mid instead of high = mid - 1. I have found the test case which takes it to the infinite loop. But, I am not able to understand a good reason why we are including mid again, even though we checked if above with mid + 1

It would be very helpful if someone can explain the clear reason why we are using high = mid instead of high = mid - 1.

Here is the test case which takes it to infinite loop.

[1,1,2,3,3,4,4,8,8]

Solution

  • when mid is even and arr[mid] != arr[mid + 1], then the unique value can be the mid itself. So you have to run search again from [low, mid]. For example, consider this array

    [1,1,2,3,3]
    

    In the first pass, mid is even and arr[mid] != arr[mid + 1]. But if you only run search on [low, mid-1], you will never find the unique number, because the number is mid itself.