Search code examples
calgorithmsumintervals

Execution Time Out (12000 ms) for the task "Sum of intervals" Codewars


The site codewars.com has a task "Sum of intervals". https://www.codewars.com/kata/52b7ed099cdc285c300001cd

The bottom line is to find the sum of the intervals, taking into account the overlap. For example:

sum_intervals((const struct interval[]){
   {1,4},
   {7, 10},
   {3, 5}
}, 3); /* => 7 */

The sum of the numbers on the intervals {1,4}, {7,10}, {3,5} is equal to 7. It should be taken into account that the intervals {1,4} and {3,5} overlap. I'm doing this task in C:

struct interval {
    int first;
    int second;
};
int sum_intervals(const struct interval* intervals, const size_t ints_size)
{
    int seq_size = 0;
    for (unsigned int i = 0; i < ints_size; i++)
        seq_size += intervals[i].second - intervals[i].first;

    int* sequence = malloc(seq_size * sizeof(int));
    int iter = 0;
    
    for (unsigned int i= 0; i < ints_size; i++) {
        int k = intervals[i].second;
        for (int j = intervals[i].first; j < k; j++) {
            sequence[iter] = j;
            iter++;
        }
    }
    int unq_seq_size = seq_size;
    qsort(sequence, seq_size, sizeof(int), compare);

    for (int i = 0; i < seq_size - 1; i++)
    if (sequence[i] == sequence[i + 1]) unq_seq_size--;
    
    free(sequence);
    return unq_seq_size;
}
int compare(const void* x1, const void* x2) {
    return (*(int*)x1 - *(int*)x2); 
}

First, I determine what size array is needed to store all the numbers in the intervals by calculating int seq_size. Then I allocate memory for the int*sequency array, after which I fill it with numbers between the boundaries of each of the intervals. Next, I sort the array, after which, to find the sum, it will be sufficient to compare neighboring elements for equality and, in case of equality, reduce the sum int unq_seq_size.

The code satisfies the tests, but is further considered a failure because "Execution Timed Out (12000 ms)". Help me optimize the code, or suggest another approach?

I calculated the execution time of the function using the following construct:

float startTime = (float) clock()/CLOCK_PER_SEC;
/* Do work */
float endTime = (float) clock()/CLOCK_PER_SEC;
float timeElapsed = endTime - startTime;

As a result, int timeElapsed is equal to 0.004000. Next, I applied this construction to individual blocks and got that all this time is spent on sorting:

float startTime = (float)clock() / CLOCKS_PER_SEC;
qsort(sequence, seq_size, sizeof(int), compare);
float endTime = (float)clock() / CLOCKS_PER_SEC;
float timeElapsed = endTime - startTime;
printf("%f",timeElapsed ); //0.004000

Also, at the end of the assignment there is a similar text: "Random tests" Up to 32 intervals from the range [-10^9, 10^9]. Can these 0.004000 at the interval [-10^9, 10^9] give "Execution Timed Out (12000 ms)"?


Solution

  • You solution is too slow effectively, as it is related to the range of data, which may be huge.

    If n is the number of intervals, here is a O(n logn) solution.

    1. Sort the intervals according to the start of them, and if equal to the end of them

    2. Perform an iterative linear examination of the intervals as follows:

      • sum = 0
      • current_start = interval[0].first
      • current_end = interval[0].second
      • Do i = 1 to n-1
        • if (interval[i].first > current_end) then
          • sum += current_end - current_start
          • current_start = interval[i].first
          • current_end = interval[i].second
        • else
          • current_end = max (current_end, interval[i].second)
      • sum += current_end - current_start