Search code examples
calgorithmpointersmemory-managementdynamic-memory-allocation

How to properly manage memory in the following C code?


I recently started learning C programming (I have a bit of experience in Python around intermediate level(?)). I saw this problem named "Median of two sorted arrays" on LeetCode and thought I might try implementing the code for that using C. The code consists of three functions and since the problem lies with the entirety of the code, in particular with memory management(? please do help if there are any others), I am going to post the entire thing.

int* insert(int* arr, int n, int x)
{
    int *ret = (int *) calloc(n+1,sizeof(int)), cont = 0;
    for(int i = 0; i<n && *(arr+i) < x; i++)
    {
        *(ret + i) = *(arr + i);
        cont++;
    }
    *(ret+cont) = x;
    for(int i = cont; i < n; i++)
    {
        *(ret+i+1) = *(arr + i);
    }
    return ret;
}
int* merge(int* arr1, int len1, int* arr2, int len2)
{
    int *ret = (int *) calloc(len1 + len2, sizeof(int)), **temp = (int **) malloc((len2+1)*sizeof(int *));
    for(int i = 0; i < len1; i++)
    {
        *(ret+i) = *(arr1+i);
    }
    *temp = ret;
    for(int i = 0; i < len2; i++)
    {
        *(temp+i+1) = insert(*(temp+i),len1+i, *(arr2+i));
    }
    for(int i = 0; i < len1+len2; i++)
    {
        *(ret + i) = *(*(temp+len2-1)+ i);
    }
    for(int i = 0; i < len2; i++)
    {
        free(*(temp+i));
    }
    free(temp);
    return ret;
}
double findMedianSortedArrays(int* nums1, int nums1Size, int* nums2, int nums2Size)
{
    int *arr = merge(nums1, nums1Size, nums2, nums2Size), len = nums1Size+ nums2Size;
    double mid;
    if(len % 2 == 1)
    {
        mid = *(arr + len/2);
    }
    else
    {
        mid = (*(arr+len/2)+*(arr+len/2+1))/2.0;
    }
    free(arr);
    return mid;
}

The merge and insert algorithms I believe implementation of insertion sort(?), I know that the code can be improved alot efficiency wise. But my problem lies elsewhere (?) as the following error message pops up on LeetCode:

=================================================================
==22==ERROR: AddressSanitizer: heap-use-after-free on address 0x602000000054 at pc 0x56200c01d257 bp 0x7fffc24d3340 sp 0x7fffc24d3330
READ of size 4 at 0x602000000054 thread T0
    #2 0x7f623c602082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
0x602000000054 is located 4 bytes inside of 12-byte region [0x602000000050,0x60200000005c)
freed by thread T0 here:
    #0 0x7f623d010537 in __interceptor_free ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:127
    #4 0x7f623c602082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
previously allocated by thread T0 here:
    #0 0x7f623d010a57 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #4 0x7f623c602082 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x24082)
Shadow bytes around the buggy address:
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 fa fa fa 04 fa fa fa[fd]fd fa fa fd fd
  0x0c047fff8010: fa fa 00 04 fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==22==ABORTING

which I think is pointing to memory leak, though I'm not versed at all so can't say for sure. Please look into the code and error and suggest how to overcome this.

I have tried numerous times to correct what I think of as the problem (for example initially it was merely *temp instead of **temp = (int **) malloc((len2+1)*sizeof(int *)) which was impossible to deallocate. But the problem persists, and I don't know what to do or search. Any help will be appreciated. Thanks!


Solution

  • *(ret + i) = *(*(temp + len2 - 1) + i); in your merge function accesses the array out of bounds. It does the same as:

    *( temp[len2 - 1] + i )
    

    That is, it takes the value from the temp array at position len2 - 1 and adds i to that address which is then dereferenced. Since it's out of bounds, it has undefined behavior and your program could crash, or worse.

    A suggested fix that does the merging directly instead of having to call insert:

    int* merge(int arr1[], size_t len1, int arr2[], size_t len2) {
        int * const ret = malloc((len1 + len2) * sizeof(int));
        int *wr = ret;
    
        // merge the two arrays while they both have values
        for(;len1 && len2; ++wr) {
            if(*arr1 < *arr2) {            
                *wr = *arr1++;
                --len1;
            } else {
                *wr = *arr2++;
                --len2;
            }
        }
        // here either len1 or len2 or both are zero
    
        // fill with any residue from arr1
        while(len1--) *wr++ = *arr1++;
        
        // fill with any residue from arr2
        while(len2--) *wr++ = *arr2++;
        
        return ret;
    }
    

    I also would like to promote pointer[index] over *(pointer + index) in general. Example:

    double findMedianSortedArrays(int nums1[], size_t nums1Size,
                                  int nums2[], size_t nums2Size) 
    {
        int *arr = merge(nums1, nums1Size, nums2, nums2Size);
        size_t len = nums1Size + nums2Size;
    
        double mid;
        if (len % 2 == 1) {
            mid = arr[len / 2];
        } else {
            mid = (arr[len / 2] + arr[len / 2 - 1]) / 2.0;
        }
        free(arr);
        return mid;
    }
    

    Note: You need - 1 in the even case. For two elements, 2 / 2 == 1 and you want element 0 and 1.

    Demo


    You may also want to take a shortcut if one array is empty. In that case, you do not need to allocate any memory at all and can lookup the median value directly from the array that does have values. Example:

    // may only be called if both arrays have length > 0 or else UB
    int *merge(int arr1[], size_t len1, int arr2[], size_t len2) {
        int *const ret = malloc((len1 + len2) * sizeof(int));
        int *wr = ret;
    
        // merge the two arrays
        for (;;) {
            if (*arr1 < *arr2) {
                *wr++ = *arr1++;
                if (--len1 == 0) {
                    // fill with the rest from arr2
                    while (len2--) *wr++ = *arr2++;
                    break;
                }
            } else {
                *wr++ = *arr2++;
                if (--len2 == 0) {
                    // fill with the rest from arr1
                    while (len1--) *wr++ = *arr1++;
                    break;
                }
            }
        }
        return ret;
    }
    

    Note that the merge function now requires both arrays to have values. In findMedianSortedArrays you then check if a merge is needed:

    #include <math.h> // NAN
    
    double getMedian(int arr[], size_t len) {
        return (len % 2 == 1) ? (double)arr[len / 2]
                              : (arr[len / 2] + arr[len / 2 - 1]) / 2.0;
    }
    
    double findMedianSortedArrays(int nums1[], size_t nums1Size, int nums2[],
                                  size_t nums2Size)
    {
        if (nums1Size && nums2Size) { // both arrays have values, merge them
            int *arr = merge(nums1, nums1Size, nums2, nums2Size);
            size_t len = nums1Size + nums2Size;
            double mid = getMedian(arr, len);
            free(arr);
            return mid;
        } else if (nums1Size) { // only nums1 have values
            return getMedian(nums1, nums1Size);
        } else if (nums2Size) { // only nums2 have values
            return getMedian(nums2, nums2Size);
        }
        return NAN; // both arrays have length 0
    }
    

    Demo