c++ memcmp

What does memcmp do if you pass two identical pointers as inputs?


I am comparing two byte arrays with memcmp (or rather a library function does that). The arrays can become relatively large, and they can actually be the same array in many cases.

Would it make sense to write something like this, or will memcmp already do that internally?

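#include <cstddef>  // std::size_t
#include <cstring>  // std::memcmp

int memcmp_wrapper(const void* lhs, const void* rhs, std::size_t count) {
    if (lhs == rhs)  // same buffer: skip the byte-by-byte comparison
        return 0;
    return std::memcmp(lhs, rhs, count);
}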

Solution

  • What does memcmp do if you pass two identical pointers as inputs?

    It will return 0.
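
    As a minimal sketch of that behaviour: the helper below (the name compare_with_itself is made up for illustration) passes the same pointer as both arguments. Every byte compares equal to itself, so the result is 0 whether or not the implementation checks the pointers first. The empty-buffer guard is an extra precaution, since data() on an empty vector may be null.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: compares a buffer against itself, i.e. passes the
// same pointer as both arguments to std::memcmp.
int compare_with_itself(const std::vector<unsigned char>& buffer) {
    if (buffer.empty())
        return 0;  // avoid handing a possibly null data() pointer to memcmp
    return std::memcmp(buffer.data(), buffer.data(), buffer.size());
}
```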

    will memcmp already [return early if pointers are equal]?

    It is not specified by the standard. A version of glibc that I checked, for example, does not.

    Would it make sense to write something like this

    Potentially, if the array is large enough.

    What would you consider large enough,

    I would consider the array to be large enough when you have measured memcmp_wrapper to be faster than memcmp by a factor that is statistically significant compared to the variance of the measurements.

    Some considerations, among many, for measuring are:

    • The size threshold can differ between systems, depending on the CPU, cache, memory and so on. See What is a "cache-friendly" code? for an in-depth discussion.

    • If the optimiser can prove at compile time that the pointers are equal, it may be smart enough to optimise the memcmp away entirely, and you may end up measuring two programs that do nothing. Design your test harness with care.
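
    One possible shape for such a harness is sketched below. It is only an illustration, not a rigorous benchmark: time_compare, plain_memcmp and compare_fn are names invented here, and routing the pointers and the result through volatile objects is just one assumed way to stop the optimiser from proving the pointers equal or dropping the call.

```cpp
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// The wrapper from the question.
int memcmp_wrapper(const void* lhs, const void* rhs, std::size_t count) {
    if (lhs == rhs)
        return 0;
    return std::memcmp(lhs, rhs, count);
}

// Thin forwarding function so both candidates share one signature.
int plain_memcmp(const void* lhs, const void* rhs, std::size_t count) {
    return std::memcmp(lhs, rhs, count);
}

using compare_fn = int (*)(const void*, const void*, std::size_t);

// Hypothetical helper: average nanoseconds per call of `fn`.
double time_compare(compare_fn fn, const void* lhs, const void* rhs,
                    std::size_t count, int iterations) {
    // Reading the pointers through volatile objects hides their values
    // from the optimiser, so it cannot prove lhs == rhs at compile time.
    const void* volatile opaque_lhs = lhs;
    const void* volatile opaque_rhs = rhs;
    // Writing the result to a volatile object keeps the call from being
    // removed as dead code.
    volatile int sink = 0;

    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = fn(opaque_lhs, opaque_rhs, count);
    }
    const auto stop = std::chrono::steady_clock::now();

    return std::chrono::duration<double, std::nano>(stop - start).count() /
           iterations;
}
```

    To use it, call time_compare with both plain_memcmp and memcmp_wrapper over a range of buffer sizes, with equal and with distinct pointers, and repeat each measurement enough times to estimate the variance before drawing conclusions.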

    and why does it only make sense for that size?

    The branch is not free. The time that you may save by not comparing the array must overcome the expense of the added check.

    Since the cost of comparing the array increases (linear asymptotic complexity) with the size of the array, there must be some length beyond which comparing the whole array costs more than the branch.
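
    The break-even argument can be written as a rough cost model. All symbols are assumptions introduced for illustration: p is the fraction of calls where both pointers are equal, k the per-byte cost of comparing, n the array length, c_b the cost of the added branch, and T_diff the average cost of memcmp when the pointers differ. The model also assumes that memcmp itself has no pointer check, so it scans all n bytes when the pointers are equal.

```latex
\begin{align*}
T_{\text{plain}}   &= p\,k\,n + (1-p)\,T_{\text{diff}} \\
T_{\text{wrapper}} &= c_b + (1-p)\,T_{\text{diff}} \\
T_{\text{wrapper}} < T_{\text{plain}}
  &\iff c_b < p\,k\,n
   \iff n > \frac{c_b}{p\,k}
\end{align*}
```

    Under these assumptions the threshold grows when equal pointers are rare (small p) and shrinks when the branch is cheap, which is why it has to be measured rather than guessed.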