I implemented finding the maximum value in an array two ways: max, which is O(n), and mergeMax, which I expected to be O(lg n) because it uses a divide-and-conquer approach. I was expecting mergeMax to win out as n increased, but I was surprised that it was beaten every time by max. Am I wrong in saying that mergeMax is O(lg n)? Here's the code in C:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int maxValue = 0;

void mergeMax(int items[], int low, int high)
{
    if (high == low)
    {
        if (items[low] > maxValue) maxValue = items[low];
        return;
    }
    if (high - low == 1)
    {
        if (items[low] > items[high])
        {
            if (items[low] > maxValue) maxValue = items[low];
        }
        else
        {
            if (items[high] > maxValue) maxValue = items[high];
        }
        return;
    }
    int middleValue = (low + high) / 2;
    mergeMax(items, low, middleValue);
    mergeMax(items, middleValue + 1, high);
}

int max(const int items[], int itemCount)
{
    int max = 0;
    for (int i = 0; i < itemCount; i++)
    {
        if (items[i] > max) max = items[i];
    }
    return max;
}

int main(void)
{
    int itemCount = 2000000;
    printf("Generating...\n");
    /* Allocate on the heap: 2,000,000 ints as a stack VLA risks overflowing the stack. */
    int *items = malloc(itemCount * sizeof *items);
    if (items == NULL) return 1;
    srand(time(0));
    for (int i = 0; i < itemCount; i++)
    {
        items[i] = rand() % ((4294967295 / 2) + 1);
    }
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    mergeMax(items, 0, itemCount - 1);
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    uint64_t delta_us = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_nsec - start.tv_nsec) / 1000;
    printf("O(lg n) time in microseconds: %" PRIu64 ". Result: %d\n", delta_us, maxValue);
    clock_gettime(CLOCK_MONOTONIC_RAW, &start);
    int maximum = max(items, itemCount);
    clock_gettime(CLOCK_MONOTONIC_RAW, &end);
    delta_us = (end.tv_sec - start.tv_sec) * 1000000 + (end.tv_nsec - start.tv_nsec) / 1000;
    printf("O(n) time in microseconds: %" PRIu64 ". Result: %d\n", delta_us, maximum);
    free(items);
}
Consider the case where there are 1024 items. When mergeMax is called with 1024 items, it divides them into two segments and calls mergeMax twice, each with 512 items. Those two calls each call mergeMax twice more, so four calls in total, each with 256 items. The pattern continues: 8 calls with 128 items each, then 16 calls with 64 items, 32 calls with 32 items, 64 calls with 16 items, 128 calls with 8 items, 256 calls with 4 items, and finally 512 calls with 2 items each.
At 2 items, mergeMax handles the two items directly and does not generate any more calls. In each of those 512 calls, 2 items are considered for the maximum, so the amount of work being done to find the maximum is still 1024 tests. Nothing was saved. In recurrence terms, the work is T(n) = 2T(n/2) + O(1), which solves to Θ(n), not Θ(lg n): splitting the array does not reduce the number of elements that must be examined.
Furthermore, there were 512 final calls, 256 before that, 128 before that, then 64, 32, 16, 8, 4, 2, and 1. In total, that is 1023 calls to mergeMax, 1022 more than the single call to max. So, in addition to the 1024 tests, you added the overhead of 1022 extra function calls. Of course mergeMax is slower.
There is nothing logarithmic about mergeMax except the call depth, that is, the number of nested calls active at any one time.