
Inconveniences of using uint64_t


I have a highly portable library (it compiles and works well everywhere, even without a kernel), and I would like it to remain as portable as possible. So far I have avoided 64-bit data types, but I might need to use them now. To be precise, I need a 64-bit bitmask.

I have never really thought about it, and I am not much of a hardware expert (especially concerning embedded systems), but I am wondering now: what are the inconveniences of using uint64_t (or, equivalently, uint_least64_t)? I can think of two aspects to my question:

  1. Actual portability: Are all microcontrollers, including 8-bit CPUs, able to deal with 64-bit integers?
  2. Performance: How slowly will an 8-bit CPU perform bitwise operations on a 64-bit integer compared to a 32-bit integer? The function I am designing will have only one 64-bit variable, but it will perform a lot of bitwise operations on it (i.e. in a loop).

Solution

  • There are various minimum requirements on a conforming C compiler. The C language allows two forms of compilers: hosted and freestanding. Hosted is meant to run on top of an OS, and freestanding runs without an OS. Most embedded systems compilers are freestanding implementations.

    Freestanding compilers have some leeway: they do not need to support the whole standard library, but they must support a minimum subset of its headers. This includes stdint.h (see C17 4/6), which in turn requires the compiler to implement the following (C17 7.20.1.2/3):

    The following types are required:

    int_least8_t int_least16_t int_least32_t int_least64_t
    uint_least8_t uint_least16_t uint_least32_t uint_least64_t

    So a microcontroller compiler does not need to support uint64_t, but it must (oddly enough) support uint_least64_t. In practice, this means the compiler might as well support uint64_t too, since on such a target the two are the same type.
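    As a minimal sketch of what relying only on the mandatory type could look like, the helpers below build a 64-bit bitmask on top of uint_least64_t and the UINT64_C constant macro, both of which stdint.h always provides. The names mask64, mask_set, mask_clear and mask_test are hypothetical and chosen only for this illustration; they are not part of any library.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type name for this sketch. uint_least64_t is mandatory
 * in stdint.h; the exact-width uint64_t is optional. */
typedef uint_least64_t mask64;

/* Return m with bit `bit` (0..63) set. */
static mask64 mask_set(mask64 m, unsigned bit)
{
    assert(bit < 64u);
    return m | (UINT64_C(1) << bit);
}

/* Return m with bit `bit` (0..63) cleared. */
static mask64 mask_clear(mask64 m, unsigned bit)
{
    assert(bit < 64u);
    return m & ~(UINT64_C(1) << bit);
}

/* Return 1 if bit `bit` (0..63) is set in m, otherwise 0. */
static int mask_test(mask64 m, unsigned bit)
{
    assert(bit < 64u);
    return (int)((m >> bit) & 1u);
}
```

    Code that really needs the exact-width type can check whether the UINT64_MAX macro is defined, since stdint.h only defines it when uint64_t exists.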

    As for what an 8-bit MCU supports: it supports 8-bit arithmetic through the instruction set and, in some special cases, a few 16-bit operations using index registers. In general, though, it must rely on software routines whenever a type larger than 8 bits is used.

    So if you attempt 32-bit arithmetic on an 8-bitter, the compiler will pull its software support routines into the code, and the result can be hundreds of assembler instructions, making such code very inefficient and memory-consuming. 64-bit will be even worse.

    The same goes for floating-point numbers on MCUs that lack an FPU: these too generate horribly inefficient code through software floating-point libraries.


    To illustrate, take a look at this non-optimized code for a very simple 64-bit addition on an 8-bit AVR (gcc): https://godbolt.org/z/ezbKjY
    The compiler does support uint64_t, but it spewed out an enormous amount of overhead code, some 100 instructions, and in the middle of it there is a call to the internal compiler support function __adddi3.

    If we enable optimizations, we get

    add64:
            push r10
            push r11
            push r12
            push r13
            push r14
            push r15
            push r16
            push r17
            call __adddi3
            pop r17
            pop r16
            pop r15
            pop r14
            pop r13
            pop r12
            pop r11
            pop r10
            ret
    

    We would have to dig through the library source or single-step the assembly live to see how much code there is inside __adddi3. I would guess that it is still not a trivial function.
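    To give a rough idea of the work such a helper has to do, here is a sketch of a 64-bit addition carried out one byte at a time with an explicit carry, which is conceptually what an 8-bit CPU must do. This is not the actual source of __adddi3; the function name add64_bytes and the little-endian byte-array layout are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NBYTES 8 /* 8 bytes of 8 bits each = 64 bits */

/* Hypothetical helper: sum = a + b (mod 2^64).
 * All three numbers are stored as little-endian byte arrays. */
static void add64_bytes(uint_least8_t sum[NBYTES],
                        const uint_least8_t a[NBYTES],
                        const uint_least8_t b[NBYTES])
{
    unsigned carry = 0;

    assert(sum != NULL && a != NULL && b != NULL);

    for (size_t i = 0; i < NBYTES; i++) {
        /* Add one byte pair plus the carry from the previous byte. */
        unsigned t = (unsigned)(a[i] & 0xFFu) + (unsigned)(b[i] & 0xFFu) + carry;
        sum[i] = (uint_least8_t)(t & 0xFFu); /* low 8 bits are the result byte */
        carry = t >> 8;                      /* bit 8 is the carry out */
    }
    /* The carry out of the last byte is dropped: wrap-around modulo 2^64. */
}
```

    Every 64-bit addition therefore costs at least eight byte additions plus the loads and stores around them, which is consistent with the register pushes and pops seen in the optimized listing.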

    So, as you can hopefully tell, doing 64-bit arithmetic on an 8-bit CPU is a very bad idea.
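    For the specific case of a 64-bit bitmask, one possible way around this (a sketch only, not something measured here) is to store the mask as eight separate bytes, so that setting, clearing or testing a single bit touches just one 8-bit value. The type bytemask64 and the bm_* functions are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: a 64-bit mask stored as 8 bytes, bit 0 in bytes[0]. */
typedef struct {
    uint_least8_t bytes[8];
} bytemask64;

/* Set bit `bit` (0..63): bit >> 3 selects the byte, bit & 7 the bit in it. */
static void bm_set(bytemask64 *m, unsigned bit)
{
    assert(m != NULL && bit < 64u);
    m->bytes[bit >> 3] |= (uint_least8_t)(1u << (bit & 7u));
}

/* Clear bit `bit` (0..63). */
static void bm_clear(bytemask64 *m, unsigned bit)
{
    assert(m != NULL && bit < 64u);
    m->bytes[bit >> 3] &= (uint_least8_t)~(1u << (bit & 7u));
}

/* Return 1 if bit `bit` (0..63) is set, otherwise 0. */
static int bm_test(const bytemask64 *m, unsigned bit)
{
    assert(m != NULL && bit < 64u);
    return (int)((m->bytes[bit >> 3] >> (bit & 7u)) & 1u);
}
```

    Whether this is actually faster than a plain uint_least64_t depends on the compiler and the target, so it is worth comparing the generated assembly for both variants before committing to either.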