Search code examples
javaarraysalgorithmarray-merge

Check if an array is a merge of 2 arrays


Implement a function checking if a given array can be constructed as a merge of the 2 other arrays in any way.

public static boolean isMerge(int[] arr1, int[] arr2, int[] merge){
   //...
}

Examples:

isMerge([3, 1, 2, 2], [2, 1, 1], [3, 1, 2, 2, 2, 1, 1]) -- true

isMerge([1, 2, 3], [4, 5, 6], [1, 2, 3, 4, 5, 6]) -- true

isMerge([1, 2, 3], [4, 5, 6], [1, 4, 5, 2, 3, 6]) -- true

isMerge([1, 2], [2, 3], [1, 2, 3, 2]) -- true

isMerge([1, 2], [3, 4], [1, 2, 3, 4, 5]) -- false

isMerge([1, 2], [3, 4], [1, 2, 5, 3, 4]) -- false

isMerge([1, 2], [3, 4], [5, 1, 2, 3, 4]) -- false


My first thought was to implement a solution with 3 iterators. Iterating through each of the arrays check if the current element in either arr1 or arr2 matches the element of merge.

If so then move to the next element keeping iterating until the mismatch is found or it's proven that the result can be constructed by merging arr1 and arr2.

But the solution fails on isMerge([1, 2], [2, 3], [1, 2, 3, 2]) and reports false instead.

I'm looking for the most efficient solution in terms of time and memory, but would be interested to know of any working approach.


Solution

  • Here is how I would do it. This is a relatively simple approach using two Maps that simply count the occurrences of values in both the two arrays (combined) and the merge array. This can be done in one simple loop over the longest of the three arrays. That is only of course if we don't have more values in the merge array then in the two other arrays combined. If that's the case we can immediately return false as there is no way the array was merged from the two. I also return false if merge is empty and any of the other two arrays is not.

    When we have counted the values we just need to compare keys and values of both maps. If all keys and their respective values match, the arrays can be created by merging both.

    The runtime is O(n) with two loops over n so roughly 2n + k with n being the number of elements in the biggest provided array and k being a small constant (depending on how you count one operation) for all the operations other than loops that are happening in the function.

    In terms of memory we have a, b, n for the length of arr1, arr2, merge (any one of them could have any of the lengths, but for this calculation we assume a = arr1.length, b = arr2.length and n = merge.length). Then we have as memory requirement for the two Maps:

    • for merge: (n * 0.75 + 1)*2
    • for arr1 and arr2: ((a + b) * 0.75 + 1)*2

    The next power of 2 will be used by Java internally for the capacity of the array so in worst case we need double the amount of space then actually required to store the values, hence the *2.

    See code from Java HashMap that determines size of backing array:

    /**
    * Returns a power of two size for the given target capacity.
    */
    static final int tableSizeFor(int cap) {
        int n = -1 >>> Integer.numberOfLeadingZeros(cap - 1);
        return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
    }
    

    Assuming you have n elements in merge and n/2 elements in arr1 and arr2 a the memory required for the maps in worst case would be (n * 0.75 + 1)*2 + ((n/2 + n/2) * 0.75 + 1)*2 which equals 4 * (n * 0.75 + 1) = 3n + 4. You could additionally add the space required for local variables to this, but they are quite insignificant really.

    All in all this approach has O(n) runtime and is therefore asympotically optimal as you will have to "look" at each value once, though there might be implementations with (significantly) smaller constants.

    In terms of memory there certainly are implementations that take much less than this, but most likely memory isn't a big concern on modern hardware for Integer arrays.

    import java.util.*;
    
    public class Application {
        public static void main(String[] args) {
            System.out.println(isMerge(new int[]{3, 1, 2, 2}, new int[]{2, 1, 1}, new int[]{3, 1, 2, 2, 2, 1, 1}));
            System.out.println(isMerge(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 2, 3, 4, 5, 6}));
            System.out.println(isMerge(new int[]{1, 2, 3}, new int[]{4, 5, 6}, new int[]{1, 4, 5, 2, 3, 6}));
            System.out.println(isMerge(new int[]{1, 2}, new int[]{2, 3}, new int[]{1, 2, 3, 2}));
            System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{1, 2, 3, 4, 5}));
            System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{1, 2, 5, 3, 4}));
            System.out.println(isMerge(new int[]{1, 2}, new int[]{3, 4}, new int[]{5, 1, 2, 3, 4}));
        }
    
        public static boolean isMerge(int[] arr1, int[] arr2, int[] merge) {
            // early out if we have less values in arr1 + arr2 then in merge or if merge is empty and any of the other is not
            // this could be changed to arr1.length + arr.length !0 merge.length when you don't want to allow this: arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3] to return true. It does also make calculating the space easier and will reduce the average case runtime drastically for random inputs
            if (arr1.length + arr2.length < merge.length || (merge.length == 0 && (arr1.length != 0 || arr2.length != 0))) return false;
            // prevent possible rehashing by assigning maximum amount of possible values in the map divided by load factor but also use little memory as possible
            // one could change the load factor: increase -> more performance, more space or decrease -> less performance, less space and measure the performance
            // the calculation for the expected Map size is done like this in Guava and JDK8
            var twoArrValCount = new HashMap<Integer, Integer>((int)((float)(arr1.length + arr2.length) / 0.75f + 1.0f));
            var mergeValCount = new HashMap<Integer, Integer>((int)((float)merge.length / 0.75f + 1.0f));
    
            // determine longest array
            var longestOverall = Math.max(arr1.length, arr2.length);
            longestOverall = Math.max(longestOverall, merge.length);
    
            // count values in merge array and in two arrays combined
            for (int i = 0; i < longestOverall; i++) {
                // add 1 as count if its key is not present yet, add one to current value otherwise
                if (i < arr1.length) twoArrValCount.compute(arr1[i], (k, v) -> (v == null) ? 1 : v + 1);
                if (i < arr2.length) twoArrValCount.compute(arr2[i], (k, v) -> (v == null) ? 1 : v + 1);
                if (i < merge.length) mergeValCount.compute(merge[i], (k, v) -> (v == null) ? 1 : v + 1);
            }
    
            // compare both maps: if all values match return true, return false otherwise
            return mergeValCount
                    .entrySet()
                    .stream()
                    .allMatch(entry -> {
                        // if map2 does not contain a key that is present in map1 -> return false
                        if (!twoArrValCount.containsKey(entry.getKey())) return false;
                        // return result of comparison: if match -> return true, if no match -> return false
                        // if you want to return true for e.g. arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3]
                        return twoArrValCount.get(entry.getKey()) <= entry.getValue();
                        // if you want to return false for e.g. arr1 = [1, 2], arr2=[3,4,5] and merge=[1,2,3]
                        // return Objects.equals(twoArrValCount.get(entry.getKey()), entry.getValue())
                    });
        }
    }
    

    Expected output:

    true
    true
    true
    true
    false
    false
    false