Tags: c, language-lawyer, undefined-behavior, qsort

qsort with size 0 undefined?


I have a report, unconfirmed by me but from a reliable source, that the code

qsort(a, n, sizeof *a, cmpfunc);

is compiled by a modern version of gcc as if it had been written

if(n == 0)
    __builtin_trap();
qsort(a, n, sizeof *a, cmpfunc);

Evidently it is believed that calling qsort with n == 0 is undefined behavior.

[Edit: The whole premise here was found to be false; see "Update 2" below.]

It has been pointed out that POSIX explicitly blesses the n == 0 case, but evidently no extant version of the C Standard does.

So the obvious questions are:

  1. Is calling qsort with n = 0 actually undefined behavior in C?
  2. Is every program which ever calls qsort with arbitrary n truly obliged to check for n == 0 and not call qsort in that case?
  3. Why would gcc perform this "optimization"? Even if you believe that calling qsort with n == 0 is undefined, this would seem to marginally slow down every program whose behavior is well defined.

Textbook implementations of quicksort (which, I know, qsort is not required to be) can hardly fail to handle n == 0 correctly. I wonder if gcc's behavior here is meant to guard against a qsort implementation that somehow does something much worse than a __builtin_trap when the initial call has n == 0?


Update: Thanks for the responses so far. It sounds like gcc is in the wrong here. As I said, I haven't confirmed this result myself, but I'm trying to find out which version of gcc and with which optimization flags the issue was observed.


Update 2: The original report I referred to was in error. Two key clarifications:

  1. gcc was in fact checking for a == 0, not n == 0. This is a completely different kettle of fish: as this thread (and others) has confirmed, calling qsort on a null pointer is considerably more problematic, and almost certainly formally undefined.
  2. The compilation in question included the -fsanitize=undefined and -fsanitize-undefined-trap-on-error flags, so of course gcc was being stringent about checking for inadvertent null pointers (and even at a cost to efficiency).

Sorry for the misinformation and runaround. I'm afraid this question is now in the realm of "not reproducible or was caused by a typo", and I have put one close vote in the hopper on that basis.

For what it's worth, the gcc version was 12.2.1.


Solution

    1. Is calling qsort with n = 0 actually undefined behavior in C?

    It is well-defined behavior in every version of the language.

    2. Is every program which ever calls qsort with arbitrary n truly obliged to check for n == 0 and not call qsort in that case?

    The application programmer's source need not perform any such check. As for the behavior of the generated program, qsort must not invoke the comparison function when nmemb is zero, so the call is effectively a no-op, equivalent to not calling qsort at all.

    3. Why would gcc perform this "optimization"? Even if you believe that calling qsort with n == 0 is undefined, this would seem to marginally slow down every program whose behavior is well defined.

    Because n == 0 is a special, well-defined case, a compiler may legitimately optimize around it (for example, by skipping the call entirely). Of course, an extra branch on every call is not necessarily an optimization.


    Sources:

    C17 7.22.5.2

    void qsort(void *base, size_t nmemb, size_t size,
               int (*compar)(const void *, const void *));
    

    C17 7.22.5 emphasis mine:

    These utilities make use of a comparison function to search or sort arrays of unspecified type. Where an argument declared as size_t nmemb specifies the length of the array for a function, nmemb can have the value zero on a call to that function; the comparison function is not called, a search finds no matching element, and sorting performs no rearrangement. Pointer arguments on such a call shall still have valid values, as described in 7.1.4.