System Tick rollover on STM32 32-bit ARM architecture

I'm having trouble understanding what happens when the 32-bit system tick on an STM32 MCU rolls over when using the ST-supplied HAL.

Suppose the MCU has been running until HAL_GetTick() returns its maximum of 2^32 - 1 = 0xFFFFFFFF = 4,294,967,295. With a 1 ms tick, that is 4,294,967,295 / 1000 / 60 / 60 / 24 = approximately 49.7 days, which is the longest duration that can be measured. What happens if you have a timer that is running across the rollover point?

Example code creating 100ms delay on a rollover event:

uint32_t start = HAL_GetTick();  // start = 0xFFFF FFFF (in this example)

--> Interrupt increments systick which rolls it over to 0 at this point

while ((HAL_GetTick() - start) < 100);

So when the expression in the loop is first evaluated, HAL_GetTick() = 0x0000 0000 and start = 0xFFFF FFFF. Hence 0x0000 0000 - 0xFFFF FFFF = ? (This number doesn't exist, as it's negative and we are doing unsigned arithmetic.)

However, when I run the following code on my STM32, compiled with GCC for ARM:

   uint32_t a = 0xFFFFFFFFUL;
   uint32_t b = 0x00000000UL;
   uint32_t c = b - a;
   printf("a =%lu b=%lu c=%lu\r\n", a, b, c);

The output is: a =4294967295 b=0 c=1

The fact that c=1 is good from the point of view of the code functioning properly across the overflow but I don't understand what is actually happening here at the low level. How does 0 - 4294967295 = 1 ?? How would I calculate this on paper to show what the arithmetic logic unit inside the MCU is doing when this situation is encountered?

Solution

This is a characteristic of modular arithmetic: modulo wrapping is what happens when an unsigned integer overflows.

When working with a fixed number of digits/bits, arithmetic operations can overflow the fixed number of digits. The overflow portion cannot be represented in the fixed number of digits/bits and is basically masked away. What gets discarded is always a whole multiple of the modulus (2^32 for a 32-bit value), and the portion that fits within the fixed number of digits/bits is the remainder. Because only multiples of the modulus are dropped, the remainder stays congruent to the true result after the operation that caused the overflow.

The best way to understand is to do a few operations with a pen on paper. Choose a base. Hexadecimal is great but it works for decimal, binary, and every base. Choose a fixed number of digits/bits. For uint32_t you have 8 hex digits or 32 bits. Choose two values that will overflow the fixed number of digits when you add them. Do the math on paper and include any overflow into an extra digit. Now perform the modulo operation by covering the overflow with your hand. Your CPU does this modulo operation automatically by virtue of having a fixed number of digits (i.e., uint32_t). Repeat this with different numbers and repeat with a subtraction/underflow. Eventually you'll start to trust that it works.
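The pen-and-paper exercise above can also be sketched in C. This is only an illustration, not anything from the HAL, and the helper names are made up for the example. The addition is done in a 64-bit variable so the "extra digit" is visible, then the overflow is masked off and compared with what plain uint32_t arithmetic produces.

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values with room for the carry,
 * i.e. the "extra digit" you would write down on paper. */
uint64_t add_with_extra_digit(uint32_t a, uint32_t b)
{
    return (uint64_t)a + (uint64_t)b;
}

/* Hypothetical helper: "cover the overflow with your hand" by keeping
 * only the low 32 bits (8 hex digits). */
uint32_t cover_overflow(uint64_t wide)
{
    return (uint32_t)(wide & 0xFFFFFFFFULL);
}

/* What the CPU gives you directly when the operands are uint32_t:
 * the sum wraps modulo 2^32, which is well defined for unsigned types. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
    return a + b;
}
```

For example, 0xFFFF_FFF0 + 0x0000_0020 is 0x1_0000_0010 with the extra digit and 0x0000_0010 once the overflow is covered, which is also what add_u32 returns.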

You do have to be careful when setting up this operation. Use unsigned types and subtract the start ticks value from the current ticks value, as is done in your example code. (Do not, for example, add the delay to start ticks and compare with the current ticks.) Raymond Chen's article, "Using modular arithmetic to avoid timing overflow problems", has more information.
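To make the difference between the two approaches concrete, here is a minimal sketch. The function names are hypothetical, and the tick values are passed in as parameters rather than read from HAL_GetTick() so the rollover case is easy to try on a PC.

```c
#include <stdbool.h>
#include <stdint.h>

/* Rollover-safe: the unsigned subtraction wraps modulo 2^32, so the
 * result is the true elapsed tick count even if 'now' has wrapped
 * past zero (as long as fewer than 2^32 ticks have really passed). */
bool delay_elapsed(uint32_t now, uint32_t start, uint32_t delay)
{
    return (uint32_t)(now - start) >= delay;
}

/* NOT rollover-safe: start + delay can wrap to a small number, and
 * then the comparison is true immediately. */
bool delay_elapsed_broken(uint32_t now, uint32_t start, uint32_t delay)
{
    return now >= (uint32_t)(start + delay);
}
```

With start = 0xFFFF_FFF0 and delay = 100, one tick later (now = 0xFFFF_FFF1) delay_elapsed reports false, as it should, while delay_elapsed_broken already reports true because start + delay has wrapped to 0x54. The safe version first reports true at now = 0x0000_0054, exactly 100 ticks after start.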

How does 0 - 4294967295 = 1 ?? How would I calculate this on paper to show what the arithmetic logic unit inside the MCU is doing when this situation is encountered?

First write it in hex like this:

     0000_0000
 -   FFFF_FFFF
 _____________

Then realize that you can add the modulus 0x1_0000_0000 to the first value (the minuend) without changing the result, because in modular arithmetic 0x0_0000_0000 and 0x1_0000_0000 are congruent modulo 0x1_0000_0000. In effect you borrow from a digit that isn't stored. Then it should become obvious that the difference is 1.

   1_0000_0000
 - 0_FFFF_FFFF
 _____________
   0_0000_0001
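As for what the ALU itself does: a binary ALU typically performs a subtraction as an addition, computing a - b as a + ~b + 1 (that is, adding the two's complement of b) and dropping the carry out of the top bit. The following is a rough sketch of that idea in C. The function name is made up, and the 64-bit intermediate is only there to make the discarded carry explicit.

```c
#include <stdint.h>

/* Hypothetical helper: subtract the way an adder-based ALU does,
 * by adding the bitwise inverse of b plus 1 and dropping the carry. */
uint32_t sub_via_adder(uint32_t a, uint32_t b)
{
    uint64_t wide = (uint64_t)a + (uint64_t)(uint32_t)~b + 1u;
    return (uint32_t)wide;   /* keep the low 32 bits only */
}
```

With a = 0 and b = 0xFFFF_FFFF, ~b is 0, so the sum is 0 + 0 + 1 = 1, which matches the printf output in the question.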