Search code examples
scalabinary-searchminimum

How can I find the minimum with binary search on an interval in Scala?


Okay so im trying to find the minimum of an given interval with binary search in Scala, but the minimum has to match a also given function. Heres my code:

def binarySearch: (Int => Boolean) => (Int, Int) => Int = f => (l, h) => {
  def bs: ((Int => Boolean) => (Int, Int, Int) => Int) = f => (l, h, minimum) => {

    val mid = l + ((h-l) / 2)

    mid match{
      case mid if(f(mid) == false) => bs(f)(mid+1, h, mid)
      case mid if(f(mid) == true && mid > minimum) => bs(f)(l, mid-1, minimum)
      case mid if(f(mid) == true && mid < minimum) => bs(f)(mid+1, h, mid)
      case mid => mid
    }
  }
  bs(f)(l, h, 0)
}

I think my problem is, that i´m not correctly saving the minimum.

A testcase could look like this:

val v = Vector(0, 1, 2, 3, 7, 9)
binarySearch(v(_) < 5)(0, v.length) == 4

Any ideas?


Solution

  • That style of declaring methods is very unusual and makes your code difficult to read and understand. So, my first step is to reproduce your code in a more idiomatic, conventional form.

    The steps I took are as follows:

    1. It's preferable, IMHO, for functions to accept arguments, rather than to return anonymous functions that take arguments. It's far simpler and cleaner and makes your intentions clearer.
    2. It's conventional to name predicate functions—those that typically take a single argument and return a Booleanp.
    3. In bs, the redeclaration of its predicate function argument is redundant; it can re-use the predicate supplied to binarySearch, and so does not need to redeclare it.
    4. When testing Boolean values, it's conventional to let the value stand. That is, if boolExpr is a Boolean expression having the value true or false, you should prefer if(boolExpr) to if(boolExpr == true) and if(!boolExpr) to if(boolExpr == false).
    5. A default case, case _ can be used if none of the preceding cases matched. This is a little clearer than case mid in your code.

    Your original code then becomes:

    def binarySearch(p: Int => Boolean)(l: Int, h: Int): Int = {
    
      def bs(l: Int, h: Int, minimum: Int): Int = {
    
        val mid = l + ((h - l) / 2)
    
        mid match {
          case mid if(!p(mid)) => bs(mid+1, h, mid)
          case mid if(p(mid) && mid > minimum) => bs(l, mid-1, minimum)
          case mid if(p(mid) && mid < minimum) => bs(mid+1, h, mid)
          case _ => mid
        }
      }
      bs(l, h, 0)
    }
    

    OK, so now we come to debugging your function. You claim that it should calculate the minimum of a given interval and that the minimum has to match a also given function. Or, put another way, it should find the minimum value in a range that satisfies a predicate.

    However, in your test case, your predicate is that the value must be less than 5, and you expect the value 4 as the answer.

    Some problems are clear:

    1. Your code assumes, but doesn't verify, that the data is ordered from low to high. I'm just mentioning this because, clearly, the code will fail if the data isn't so ordered. If you can guarantee that the assumption holds, then that's fine.
    2. l, h, mid and minimum are all indices of values belonging to a Vector that is inaccessible from binarySearch—but they are not themselves values. This contradicts your statement that you're looking for the minimum value; rather, it looks like you're looking for the index of the minimum value.
    3. In your test condition, you expect the value 4, which is the index of the value 7. However, 7 fails the predicate as it is not less than 5. This leads me to suspect that your predicate function in your test should be binarySearch(v(_) >= 5)(0, v.length).
    4. You do not verify that l is less than h in binarySearch, indicating that you have no range to search. If that happens in bs, then you should treat it as a termination condition (the range has been fully searched) and return the minimum value index found. (You don't appear to have such a termination condition in your existing code, so it can loop infinitely under certain circumstances, ultimately resulting in a StackOverflowError.)
    5. You should note that binary search with a predicate function is fundamentally flawed, unless the predicate splits the range such that the indices of all values that fail the predicate are consecutive and appear at the beginning of the range, and that the indices of all values that pass the predicate are consecutive at the end of the range. To illustrate this, consider what would happen if the predicate only accepts even values: binarySearch(v(_) % 2 == 0)(0, v.length)? (Hint: Your function would need to visit every element in the range to guarantee finding the mininum, and that is not what a binary search does.)
    6. If your search can't find a minimum value that satisfies the predicate, then it is unable to signal the fact. For this reason, I think your function ought to return Option[Int] with the value None being returned if no value can be found, and Some(minimum) otherwise.
    7. Passing v.length for h will cause an IndexOutOfBoundsException to be thrown if the value of the element with that index is examined. You should pass v.length - 1 instead, or treat h as one position beyond the end of the range.

    UPDATE

    In order to address your implementation, I've re-structured the problem slightly. Now, the binarySearch function uses a binary search to find the lowest value that is greater than a specified minimum. To do this, instead of a predicate function with a low and high index, it accepts an IndexedSeq, named seq and a minimum value that can be accepted.

    // Return None if no value found, or Some(min index) otherwise.
    def binarySearch(seq: IndexedSeq[Int], minimum: Int): Option[Int] = {
    
      // Helper function. Check middle value in range. Note: h is one position beyond end of
      // current range.
      def bs(l: Int, h: Int, min: Option[Int]): Option[Int] = {
    
        // If the l index isn't less than the h index, then we have nothing more to search for.
        // Return whatever our current minimum is.
        if(l >= h) min
    
        // Otherwise, find the middle index value of this range.
        else {
          val mid = l + ((h - l) / 2)
          assert(mid >= l && mid < h) // Sanity check.
    
          // Determine if the middle value is large enough.
          if(seq(mid) >= minimum) {
    
            // OK. So this value qualifies. Update our minimum. Any potentially lower values are
            // going to be in the range below mid, so look there next.
            val nextMin = min.filter(_ < mid).orElse(Some(mid))
            bs(l, mid, nextMin)
          }
    
          // No luck. Search the range above mid with the same minimum.
          else bs(mid + 1, h, min)
        }
      }
    
      // Start the search. Our initial minimum value is None.
      bs(0, seq.length, None)
    }
    

    Here's some examples in the Scala REPL:

    scala> val v = Vector(0, 1, 2, 3, 7, 9)
    v: scala.collection.immutable.Vector[Int] = Vector(0, 1, 2, 3, 7, 9)
    
    scala> binarySearch(v, 5)
    res0: Option[Int] = Some(4)
    
    scala> binarySearch(v, 9)
    res1: Option[Int] = Some(5)
    
    scala> binarySearch(v, 50)
    res2: Option[Int] = None
    
    scala> binarySearch(v, -1)
    res3: Option[Int] = Some(0)