Search code examples
cgcctail-recursion

GCC doesn't tail call optimize recursive function


I have written a hexadecimal-parsing function for my uint128 structure, which is internally just two uint64_t's - hi and lo. Here's the function in question:

uint128_t parse_from_hex(const char *string, uint128_t previous_value, const size_t num_digits) {
    if (string == NULL || string[0] == '\0')
        return previous_value;

    if (num_digits + 1 > 2 * SIZEOF_INT128) { // SIZEOF_INT128 is 16 - meaning 16 bytes
        return UINT128_MAX; // the maximum value of 128bit uint, if we overflow
    }

    int64_t current_digit = parse_hex_digit(string[0]);
    if (current_digit < 0)
        return UINT128_ZERO; // a global variable which I use multiple times which represents a 0
    return parse_from_hex(string + 1,
                          uint128_or_uint64(uint128_shift_left(previous_value, 4), (uint64_t)current_digit),
                          num_digits + 1);
}

For some reason, gcc does not optimize the function even though the recursive call is clearly made a single time at the end of the function. The other functions used in the parsing function don't have any side effects and return a new value, so I do not think that the problem is with them. I have tried making the uint128_t struct members non-const (originally they were non-const) as well as the function arguments non-const, but that didn't help either. Originally compiled with Ofast, but also tried with O3 and O2 - no luck. Could anyone who knows better on the subject please help? I thought I understood it quite well but clearly I'm missing something.


Solution

  • As it has been pointed out by @BillLynch in the comments - it's clang which doesn't optimize the function for some reason, not GCC. On my PC GCC 10.2.0 optimizes the function properly, so there's no problem here.