The original question has a list of unsorted list of 2 integers. To simplify this problem let's just consider the input is 2 sorted arrays of integers and an integer target. Value pair can repeat if there are more than 1 solution pair.
For example: [7,8,14],[5,10,14] target: 20 The solution is [14, 5] as 14 from first array and 5 from second array sums 19 which is closest to 20.
My solution was to loop through both array from beginning to end and compare against a tracked minimum difference and update if new difference is smaller.
But this is brute force. Is there any better solution?
Most solutions I found online was to find the target from the same array, is there any similarities between 2 arrays target problem and 1 array?
One key insight: Given a pair (x, y) whose sum is higher than the target, that sum is closer than the sum of any pair (x, y'), where y' > y. Conversely, if the sum of (x, y) is lower than the target, that sum is closer than the sum of any pair (x', y) where x' < x.
This yields an algorithm in linear time:
In Java:
private static Pair<Integer, Integer> findClosestSum(List<Integer> X, List<Integer> Y, int target) {
double bestDifference = Integer.MAX_VALUE;
Pair<Integer, Integer> bestPair = null;
int xIndex = 0;
int yIndex = Y.size() - 1;
while (true) {
double sum = X.get(xIndex) + Y.get(yIndex);
double difference = Math.abs(sum - target);
if (difference < bestDifference) {
bestPair = new Pair<>(X.get(xIndex), Y.get(yIndex));
bestDifference = difference;
}
if (sum > target) {
yIndex -= 1;
if (yIndex < 0) {
return bestPair;
}
} else if (sum < target) {
xIndex += 1;
if (xIndex == X.size()) {
return bestPair;
}
} else {
// Perfect match :)
return bestPair;
}
}
}
You can prove this algorithm is exhaustive through the logic in the starting paragraph. For any pair that wasn't visited, there must be a pair including one of its two elements that was visited, and which has a sum strictly closer to the target.
EDIT: If you only want sums which are less than the target (not those which overshoot), the same logic still applies. In the overshoot case, (x, y') is just as invalid as (x, y), and therefore it cannot be a better candidate sum. In this case, only step 2 needs to be modified, to store the sum only if it's the closest non-exceeding sum so far.