Search code examples
c++arraysmergesort

Arrays and mergeSort in c++?


void Merge(int *array, int lo, int mid, int hi) {
    int tArray[20];
    int loBeg = lo;
    int count = lo;
    int hiBeg = mid + 1;
    while (loBeg <= mid && hiBeg <= hi) {
        if (array[loBeg] < array[hiBeg]) {
            tArray[count] = array[loBeg];
            loBeg++;
            count++;
        } else {
            tArray[count] = array[hiBeg];
            hiBeg++;
            count++;
        }
    }
    while (loBeg <= mid) {
        tArray[count++] = array[loBeg++];
    }
    while (hiBeg <= hi) {
        tArray[count++] = array[hiBeg++];
    }
    for (int i = 0; i < count; i++) {
        array[i] = tArray[i];
    }
}

void mergeSort(int *array, int lo, int hi) {
    if (lo < hi) {
        int mid = (lo + hi) / 2;
        mergeSort(array, lo, mid);
        mergeSort(array, mid + 1, hi);
        Merge(array, lo, mid, hi);
    }
}

int main(int argc, const char * argv[]) {
    int array[] = {90, 99, 63, 82, 93, 76, 81, 76};
    //int temp[8];
    mergeSort(array, 0, 7);
    for (int i = 0; i < 8; i++) {
        std::cout << array[i] << std::endl;
    }
    return 0;
}

My Question is about the nature of arrays during this merge sort code. This code only works if in Merge, O set tArray[20]; to have that initial value of 20. Why can I not set the initial value to be hi + 1 which in this case is 8 (same as array)? However, if i uncomment out the temp[8] array, and pass that through both mergeSort and Merge, and use that as tArray in Merge (with an initial size of 8), then it works.

I think my lack of understanding is also the reason why my original merge() (see below) function does not work either:

    //this was my original merge code which does not work.
    void merge(int *array, int lo, int mid, int hi) {
        int tempArray[hi + 1];
        int loBeg = lo;
        int hiBeg = mid + 1;
        for (int i = 0; i <= hi; i++) {
            if (hiBeg > hi || (array[loBeg] < array[hiBeg] && loBeg <= mid)) {
                tempArray[i] = array[loBeg++];
            } else {
                tempArray[i] = array[hiBeg++];
            }
        }
        for (int i = 0; i <= hi; i++) {
            array[i] = tempArray[i];
        }
    }

So Basically I am wondering why in the first Merge() function, I have to set the tArray initial size to be 20 instead of the same size as array, unless I pass along a temp array from main (which is initialized to be the same size as array) and then furthermore, why my original merge() function does not work, but I think it has to do with my lack of understanding of the first Merge() function.


Solution

  • There are multiple issues in your code, here are a few of them:

    void Merge(int *array, int lo, int mid, int hi) {
        // The size should be hi - lo + 1, not hi + 1
        int tArray[hi + 1];
        int loBeg = lo;
        // count should start at 0, not lo
        int count = lo;
        int hiBeg = mid + 1;
        while (loBeg <= mid && hiBeg <= hi) {
            if (array[loBeg] < array[hiBeg]) {
                tArray[count] = array[loBeg];
                loBeg++;
                count++;
            } else {
                tArray[count] = array[hiBeg];
                hiBeg++;
                count++;
            }
        }
        while (loBeg <= mid) {
            tArray[count++] = array[loBeg++];
        }
        while (hiBeg <= hi) {
            tArray[count++] = array[hiBeg++];
        }
        for (int i = 0; i < count; i++) {
            // it should be array[lo + i], not array[i]
            array[i] = tArray[i];
        }
    }
    

    Here are some explanations for the 3 mistakes:

    1. You don't want an array of size hi + 1 each time, when you go down the "merging tree", you have to merge array of lower size, so you need to use hi - lo + 1. Actually, using hi + 1 should not be an issue in this case, but you are allocating more memory than you really need.

    2. You should not start at lo but rather at 0, or your tArray[count] will be out-of-bound. Start at lo worked in your case because you were allocating a very big array (of size 20 or hi + 1), but it won't work with an array of size hi - lo + 1, so you should be careful.

    3. Maybe the biggest mistake - You were always replacing the first count cell of your original array... No! You want to replace the cell between lo and hi, not between 0 and count.

    You need to remember that when you call Merge (array, lo, mid, hi), you want to merge the array[lo:mid] and array[mid+1:hi] into array[lo:hi].


    Here are some detailed explanation:

    Let's start with the following array [90, 99, 63, 82, 93, 76, 81, 76], at each call to mergeSort, you are dividing the array in 2 equal parts:

    First time you get [90, 99, 63, 82] (lo = 0, hi = 3), and [93, 76, 81, 76] (lo = 4 and hi = 7), and you keep dividing until you have 8 arrays of size 1. I will write (array, lo, hi), e.g. ([90], 0, 0) for simplicity.

    You should get to a point where you have ([90], 0, 0), ([99], 1, 1), and so on... You will merge these 2 by 2, so you are (for instance) going to merge ([93], 4, 4) and ([76], 5, 5). When you merge these 2 arrays, you get a call of Merge like the following:

    Merge (array, 4, 4, 5) // mid is (4 + 5) / 2 = 4
    

    So, what happens in the Merge function?

    1. You should have noticed that since you are merging two arrays of size 1 ([93] and [76]), you need a temporary array of size 2. If you use tArray[hi + 1], you allocate an array of size hi + 1 = 6, which is a lot bigger than what you actually need (may not make a big difference here, but imagine if you have an original array of size one billion!). So allocating an array of size hi - lo + 1 = 5 - 4 + 1 = 2 is sufficient.

    2. You are actually is the range 4 to 5 of your initial array, so you want to work on that part, you don't want to erase the part 0 to 1 that you have already sorted! But what happens if you do array[i] = tArray[i] in your loop? Well, i starts at 0 and goes to count (which should be 2 in this case), so you are replacing array[0] and array[1] instead of array[4] and array[5]. This can be fixed by doing array[lo + i] instead of array[i].