Tags: php, math, mathematical-optimization

Optimal way of cycling through 1000's of values


I need to find the value of x for which the variance of two results (which take x into account) is closest to 0. The problem is that the only way I know to do this is to cycle through all possible values of x. The equation uses currency, so I have to check in increments of 1 cent.

The code might make this clearer:

$previous_var = null;
$high_amount = 50;

for ($i = 0.01; $i <= $high_amount; $i += 0.01) {
    $val1 = find_out_1($i);
    $val2 = find_out_2();

    $var = variance($val1, $val2);

    // If this variance is larger, it means the previous one was the closest to
    // 0 as the variance has now started increasing
    if ($previous_var !== null && $var > $previous_var) {
        $i -= 0.01;
        break;
    }

    // Remember this variance for the next iteration's comparison
    $previous_var = $var;
}

$optimal_monetary_value = $i;

I feel like there should be a mathematical approach that makes "cycling through every cent" unnecessary. The loop works fine for small values, but with a $high_amount in the thousands it takes quite a few seconds to calculate.


Solution

  • Based on the comment in your code, it sounds like you want something similar to bisection search, but a little bit different:

    function calculate_variance($i) {
      $val1 = find_out_1($i);
      $val2 = find_out_2();
    
      return variance($val1, $val2);
    }
    
    function search($lo, $loVar, $hi, $hiVar) {
      // find the midpoint between the hi and lo values
      $mid = round($lo + ($hi - $lo) / 2, 2);
      if ($mid == $hi || $mid == $lo) {
        // we have converged, so pick the better value and be done
        return ($hiVar > $loVar) ? $lo : $hi;
      }
      $midVar = calculate_variance($mid);
      if ($midVar >= $loVar) {
        // the optimal point must be in the lower interval
        return search($lo, $loVar, $mid, $midVar);
      } elseif ($midVar >= $hiVar) {
        // the optimal point must be in the higher interval
        return search($mid, $midVar, $hi, $hiVar);
      } else {
        // we don't know where the optimal point is for sure, so check
        // the lower interval first
        $loBest = search($lo, $loVar, $mid, $midVar);
        if ($loBest == $mid) {
          // we can't be sure this is the best answer, so check the hi
          // interval to be sure
          return search($mid, $midVar, $hi, $hiVar);
        } else {
          // we know this is the best answer
          return $loBest;
        }
      }
    }
    
    $optimal_monetary_value = search(0.01, calculate_variance(0.01), 50.0, calculate_variance(50.0));
    

    This assumes that the variance never decreases as you move away from the optimal point. In other words, if the optimal value is O, then for all X < Y < O, calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O), and symmetrically for all O < Y < X. The comment in your code and the way you have written it suggest that this is true. If it isn't, then you can't really do much better than what you have.

    Be aware that this is not as fast as bisection search. Some pathological inputs make it take linear time instead of logarithmic time (e.g., if the variance is the same for all values). If you can strengthen the requirement from calculate_variance(X) >= calculate_variance(Y) >= calculate_variance(O) to calculate_variance(X) > calculate_variance(Y) > calculate_variance(O), you can make it logarithmic in all cases by comparing the variance for $mid with the variance for $mid + 0.01 and using the result to decide which interval to search.
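
    As a rough sketch of that stricter variant, assuming the variance strictly falls toward the optimum and strictly rises after it, the search can compare each midpoint with its neighbour one cent higher. Amounts are kept in integer cents here, so "$mid + 0.01" becomes "$mid + 1". The names calculate_variance_cents and search_strict are hypothetical, and the variance function is only a placeholder for variance(find_out_1(...), find_out_2()):

    ```php
    <?php
    // Hypothetical placeholder for the real variance calculation.
    // Takes an amount in cents; strictly unimodal with its minimum at 1234 cents.
    function calculate_variance_cents(int $cents): float {
        return abs($cents - 1234) / 100;
    }

    // Binary search for the minimum of a strictly unimodal function
    // over the integer range [$lo, $hi] (amounts in cents).
    function search_strict(int $lo, int $hi, callable $f): int {
        while ($lo < $hi) {
            $mid = intdiv($lo + $hi, 2);
            if ($f($mid) < $f($mid + 1)) {
                // The variance is already rising at $mid,
                // so the minimum is at $mid or to its left.
                $hi = $mid;
            } else {
                // The variance has not started rising yet,
                // so the minimum is to the right of $mid.
                $lo = $mid + 1;
            }
        }
        return $lo;
    }

    $best_cents = search_strict(1, 5000, 'calculate_variance_cents');
    printf("%.2f\n", $best_cents / 100);
    ```

    Each pass halves the interval at the cost of two variance evaluations, so the range from 0.01 to 50.00 needs a few dozen evaluations rather than 5000. If calculate_variance is expensive, caching its results would avoid evaluating the same amount twice.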

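    Also, be careful about doing math with currency in floating point: 0.01 has no exact binary representation, so repeatedly adding it accumulates rounding error. You probably want either to use integers (i.e., do all math in cents instead of dollars) or to use exact-precision decimal numbers (for example, PHP's BCMath functions).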
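
    A minimal sketch of the integer-cents approach, assuming non-negative amounts with at most two decimal places. The helpers to_cents and format_cents are hypothetical names introduced for illustration:

    ```php
    <?php
    // Convert a decimal amount such as "50.00" to integer cents.
    // round() absorbs the small error from the float multiplication.
    function to_cents(string $amount): int {
        return (int) round(((float) $amount) * 100);
    }

    // Format integer cents back into a decimal string for display.
    function format_cents(int $cents): string {
        return sprintf('%d.%02d', intdiv($cents, 100), $cents % 100);
    }

    $high_amount_cents = to_cents('50.00');

    // The loop counter is an integer, so it never drifts:
    // this visits every cent from 0.01 to 50.00 exactly once.
    $steps = 0;
    for ($cents = 1; $cents <= $high_amount_cents; $cents++) {
        $steps++;
    }

    echo $steps, "\n";              // 5000
    echo format_cents(1234), "\n";  // 12.34
    ```

    Converting to a decimal only at the boundaries (input and display) keeps every comparison and increment exact.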