
c++ casting to byte (uint8_t) during subtraction won't force underflow like I expect; output is int16_t; why?


Note that byte is an 8-bit type (uint8_t) and unsigned int is a 16-bit type (uint16_t).

The following doesn't produce the results that I expect. I expect it to underflow and the result to always be a uint8_t, but it becomes a signed int (int16_t) instead! Why?

Focus on the following line of code in particular: (byte)seconds - tStart. I expect its output to ALWAYS be an unsigned 8-bit value (uint8_t), but it is instead a signed 16-bit value: int16_t.

How do I get the result of the subtraction to always be a uint8_t type?

while (true)
{
  static byte tStart = 0;
  static unsigned int seconds = 0;
  seconds++;

  //Print output from microcontroller
  typeNum((byte)seconds); typeString(", "); typeNum(tStart); typeString(", "); 
  typeNum((byte)seconds - tStart); typeString("\n");

  if ((byte)seconds - tStart >= (byte)15)
  {
    typeString("TRUE!\n");
    tStart = seconds; //update (implicitly truncated from unsigned int to byte)
  }
}

Sample output:

Column 1 is (byte)seconds, Column 2 is tStart, and Column 3 is Column 1 minus Column 2 ((byte)seconds - tStart). Notice that Column 3 becomes negative (a signed int16_t) once Column 1 overflows from 255 to 0. I expect (and want) it to remain a positive (unsigned) 8-bit value by underflowing instead.

196, 195, 1
197, 195, 2
198, 195, 3
199, 195, 4
200, 195, 5
201, 195, 6
202, 195, 7
203, 195, 8
204, 195, 9
205, 195, 10
206, 195, 11
207, 195, 12
208, 195, 13
209, 195, 14
210, 195, 15
TRUE!
211, 210, 1
212, 210, 2
213, 210, 3
214, 210, 4
215, 210, 5
216, 210, 6
217, 210, 7
218, 210, 8
219, 210, 9
220, 210, 10
221, 210, 11
222, 210, 12
223, 210, 13
224, 210, 14
225, 210, 15
TRUE!
226, 225, 1
227, 225, 2
228, 225, 3
229, 225, 4
230, 225, 5
231, 225, 6
232, 225, 7
233, 225, 8
234, 225, 9
235, 225, 10
236, 225, 11
237, 225, 12
238, 225, 13
239, 225, 14
240, 225, 15
TRUE!
241, 240, 1
242, 240, 2
243, 240, 3
244, 240, 4
245, 240, 5
246, 240, 6
247, 240, 7
248, 240, 8
249, 240, 9
250, 240, 10
251, 240, 11
252, 240, 12
253, 240, 13
254, 240, 14
255, 240, 15
TRUE!
0, 255, -255
1, 255, -254
2, 255, -253
3, 255, -252
4, 255, -251
5, 255, -250
6, 255, -249
7, 255, -248
8, 255, -247
9, 255, -246
10, 255, -245
11, 255, -244
12, 255, -243
13, 255, -242
14, 255, -241
15, 255, -240
16, 255, -239
17, 255, -238
18, 255, -237
19, 255, -236
20, 255, -235
21, 255, -234
22, 255, -233
23, 255, -232
24, 255, -231
25, 255, -230
26, 255, -229
27, 255, -228
28, 255, -227
29, 255, -226
30, 255, -225
31, 255, -224
32, 255, -223
33, 255, -222
34, 255, -221
35, 255, -220

Here is the typeNum function from above:

//--------------------------------------------------------------------------------------------
//typeNum (overloaded)
//-see AVRLibC int to string functions: http://www.nongnu.org/avr-libc/user-manual/group__avr__stdlib.html
//--------------------------------------------------------------------------------------------
//UNSIGNED:
void typeNum(uint8_t myNum)
{
  char buffer[4]; //3 for the number (up to 2^8 - 1, or 255 max), plus 1 char for the null terminator 
  utoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}
void typeNum(uint16_t myNum)
{
  char buffer[6]; //5 for the number (up to 2^16 - 1, or 65535 max), plus 1 char for the null terminator 
  utoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}
void typeNum(uint32_t myNum)
{
  char buffer[11]; //10 chars for the number (up to 2^32 - 1, or 4294967295 max), plus 1 char for the null terminator 
  ultoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}

//SIGNED:
void typeNum(int8_t myNum)
{
  char buffer[5]; //4 for the number (down to -128), plus 1 char for the null terminator 
  itoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}
void typeNum(int16_t myNum)
{
  char buffer[7]; //6 for the number (down to -32768), plus 1 char for the null terminator 
  itoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}
void typeNum(int32_t myNum)
{
  char buffer[12]; //11 chars for the number (down to -2147483648), plus 1 char for the null terminator 
  ltoa(myNum, buffer, 10); //base 10 number system 
  typeString(buffer);
}

Solution

  • So I figured it out:

    The answer is very simple, but the understanding behind it is not.

    Answer:

    (How to fix it):
    Instead of using (byte)seconds - tStart, use (byte)((byte)seconds - tStart). That's it! Problem solved! All you need to do is cast the output of the mathematical operation (a subtraction in this case) to a byte as well, and it's fixed! Otherwise it returns as a signed int, which produces the errant behavior.

    So, why does this happen?

    Answer:
    In C, C++, and C#, there is no such thing as a mathematical operation directly on a byte! The language rules (the "integer promotions") require that any operand of a type narrower than int is first implicitly converted (promoted) to an int before the operation is conducted; the mathematical operation is then conducted on ints, and when it is completed, it returns an int too!

    So, this code (byte)seconds - tStart is implicitly cast (promoted in this case) by the compiler as follows: (int)(byte)seconds - (int)tStart, and it returns an int too. Confusing, eh? I certainly thought so!

    Now let's look at some real C++ examples:

    Here is a full C++ program you can compile and run to test expressions, to see what the return type is and whether it has been implicitly cast by the compiler to something you don't intend:

    #include <iostream>
    
    using namespace std;
    
    //----------------------------------------------------------------
    //printTypeAndVal (overloaded function)
    //----------------------------------------------------------------
    //UNSIGNED:
    void printTypeAndVal(uint8_t myVal)
    {
      cout << "uint8_t = " << (int)myVal << endl; //(int) cast is required to prevent myVal from printing as a char
    }
    void printTypeAndVal(uint16_t myVal)
    {
      cout << "uint16_t = " << myVal << endl;
    }
    void printTypeAndVal(uint32_t myVal)
    {
      cout << "uint32_t = " << myVal << endl;
    }
    void printTypeAndVal(uint64_t myVal)
    {
      cout << "uint64_t = " << myVal << endl;
    }
    //SIGNED:
    void printTypeAndVal(int8_t myVal)
    {
      cout << "int8_t = " << (int)myVal << endl; //(int) cast is required to prevent myVal from printing as a char
    }
    void printTypeAndVal(int16_t myVal)
    {
      cout << "int16_t = " << myVal << endl;
    }
    void printTypeAndVal(int32_t myVal)
    {
      cout << "int32_t = " << myVal << endl;
    }
    void printTypeAndVal(int64_t myVal)
    {
      cout << "int64_t = " << myVal << endl;
    }
    //FLOATING TYPES:
    void printTypeAndVal(float myVal)
    {
      cout << "float = " << myVal << endl;
    }
    void printTypeAndVal(double myVal)
    {
      cout << "double = " << myVal << endl;
    }
    void printTypeAndVal(long double myVal)
    {
      cout << "long double = " << myVal << endl;
    }
    
    //----------------------------------------------------------------
    //main
    //----------------------------------------------------------------
    int main()
    {
      cout << "Begin\n\n";
    
      //Test variables
      uint8_t u1 = 0;
      uint8_t u2 = 1;
    
      //Test cases:
    
      //for a single byte, explicit cast of the OUTPUT from the mathematical operation is required to get desired *unsigned* output
      cout << "uint8_t - uint8_t:" << endl;
      printTypeAndVal(u1 - u2); //-1 (bad)
      printTypeAndVal((uint8_t)u1 - (uint8_t)u2); //-1 (bad)
      printTypeAndVal((uint8_t)(u1 - u2)); //255 (fixed!)
      printTypeAndVal((uint8_t)((uint8_t)u1 - (uint8_t)u2)); //255 (fixed!)
      cout << endl;
    
      //for unsigned 2-byte types, explicit casting of the OUTPUT is required too to get desired *unsigned* output
      cout << "uint16_t - uint16_t:" << endl;
      uint16_t u3 = 0;
      uint16_t u4 = 1;
      printTypeAndVal(u3 - u4); //-1 (bad)
      printTypeAndVal((uint16_t)(u3 - u4)); //65535 (fixed!)
      cout << endl;
    
      //for larger standard unsigned types, explicit casting of the OUTPUT is ***NOT*** required to get the desired *unsigned* output: since unsigned int is not narrower than int, NO implicit promotion to a larger *signed* type occurs, and the subtraction wraps as expected.
      cout << "unsigned int - unsigned int:" << endl;
      unsigned int u5 = 0;
      unsigned int u6 = 1;
      printTypeAndVal(u5 - u6); //4294967295 (good--no fixing is required)
      printTypeAndVal((unsigned int)(u5 - u6)); //4294967295 (good--no fixing was required)
      cout << endl;
    
      return 0;
    }
    

    You can also run this program online here: http://cpp.sh/6kjgq

    Here is the output. Notice that both the single-byte uint8_t - uint8_t case and the two-byte uint16_t - uint16_t case were implicitly cast (promoted) by the C++ compiler to a 4-byte signed int32_t (int). This is the behavior you need to notice: the result of those subtractions is negative, which is what originally confused me, since I had anticipated it would instead underflow to the unsigned type's maximum value (since we are doing 0 - 1). To achieve the desired underflow, I had to explicitly cast the output result of the subtraction to the desired unsigned type, not just the inputs. For the unsigned int case, however, this explicit cast of the result was NOT required.

    Begin

    uint8_t - uint8_t:
    int32_t = -1
    int32_t = -1
    uint8_t = 255
    uint8_t = 255

    uint16_t - uint16_t:
    int32_t = -1
    uint16_t = 65535

    unsigned int - unsigned int:
    uint32_t = 4294967295
    uint32_t = 4294967295

    Here's another brief program example showing that single unsigned byte (unsigned char) variables are promoted to signed int when operated upon.

    #include <stdio.h>
    
    int main(int argc, char **argv) 
    {
      unsigned char x = 130;
      unsigned char y = 130;
      unsigned char z = x + y;
    
      printf("%u\n", x + y); // Prints 260.
      printf("%u\n", z);     // Prints 4.
    }
    

    Output:

    260
    4

    Test here: http://cpp.sh/84eo