32b multiplication in 8b uC with avr-g++ vs 32b multiplication on X86 with gcc

PROBLEM:

I'm writing a fixed-point C++ class to run a closed-loop control system on an 8b microcontroller.
I wrote a C++ class to encapsulate the PID and tested the algorithm on an X86 desktop with a modern gcc compiler. All good.
When I compiled the same code for an 8b microcontroller with a modern avr-g++ compiler, I got weird artefacts. After some debugging, I found that the 16b*16b multiplication was truncated to 16b. Below is some minimal code to show what I'm trying to do.

I used -O2 optimization on the desktop system and -Os optimization on the embedded system, with no other optimization flags.

#include <cstdio>
#include <stdint.h>

#define TEST_16B    true
#define TEST_32B    true

int main( void )
{
    if (TEST_16B)
    {
        int16_t op1 = 9000;
        int16_t op2 = 9;
        int32_t res;
        //This operation gives the correct result on X86 gcc (81000)
        //This operation gives the wrong result on AVR avr-g++ (15464)
        res = (int32_t)0 +op1 *op2;
        //int32_t is long on AVR, so res is printed with %ld
        printf("op1: %d | op2: %d | res: %ld\n", op1, op2, (long)res );
    }

    if (TEST_32B)
    {
        int16_t op1 = 9000;
        int16_t op2 = 9;
        int32_t res;
        //Promote first operand
        int32_t promoted_op1 = op1;
        //This operation gives the correct result on X86 gcc (81000)
        //This operation gives the correct result on AVR avr-g++ (81000)
        res = promoted_op1 *op2;
        //int32_t is long on AVR, so both 32b values are printed with %ld
        printf("op1: %ld | op2: %d | res: %ld\n", (long)promoted_op1, op2, (long)res );
    }

    return 0;
}

SOLUTION:

Just promoting one operand to 32b with a local variable is enough to solve the problem.

My expectation was that C++ would guarantee that a math operation is performed at the same width as the first operand, so in my mind res = (int32_t)0 +... should have told the compiler that whatever came after should be performed at int32_t resolution.
This is not what happened. The (int16_t)*(int16_t) operation got truncated to (int16_t).
gcc has an internal word width of at least 32b on an X86 machine, so that might be the reason I didn't see artefacts on my desktop.

AVR Command Line

E:\Programs\AVR\7.0\toolchain\avr8\avr8-gnu-toolchain\bin\avr-g++.exe$(QUOTE) -funsigned-char -funsigned-bitfields -DNDEBUG -I"E:\Programs\AVR\7.0\Packs\atmel\ATmega_DFP\1.3.300\include" -Os -ffunction-sections -fdata-sections -fpack-struct -fshort-enums -Wall -pedantic -mmcu=atmega4809 -B "E:\Programs\AVR\7.0\Packs\atmel\ATmega_DFP\1.3.300\gcc\dev\atmega4809" -c -std=c++11 -fno-threadsafe-statics -fkeep-inline-functions -v -MD -MP -MF "$(@:%.o=%.d)" -MT"$(@:%.o=%.d)" -MT"$(@:%.o=%.o)" -o "$@" "$<"

QUESTION:

Is this the actual expected behaviour of a compliant C++ compiler, meaning I did it wrong, or is this a quirk of the avr-g++ compiler?

UPDATE:

Debugger output of various solutions (image: Cast Comparison)

Solution

This is expected behavior of the compiler.

When you write A + B * C, that is equivalent to A + (B * C) because of operator precedence. The B * C term is evaluated on its own, without regard to how it is going to be used later. (Otherwise, it would be really hard to look at C/C++ code and understand what is actually going to happen.)
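One way to see this for yourself is to ask the compiler for the type of the sub-expression. The following is a minimal sketch, not part of the original code: the names product_t, sum_t and product_bits are made up for illustration. It should compile with both gcc and avr-g++ in C++11 mode, and it checks that the product of two int16_t values has type int no matter what it is later added to.

```cpp
#include <climits>
#include <cstdint>
#include <type_traits>

// Type of op1 * op2 when both operands are int16_t.
// The operands are converted to int before the multiplication,
// so the product is an int on every platform.
using product_t = decltype(int16_t{} * int16_t{});
static_assert(std::is_same<product_t, int>::value,
              "int16_t * int16_t is computed as int");

// Adding an int32_t on the left only widens the sum.
// The product inside it has already been computed as int.
using sum_t = decltype(int32_t{} + int16_t{} * int16_t{});
static_assert(sizeof(sum_t) >= sizeof(int32_t),
              "the sum is at least 32 bits wide");

// Width in bits of the type used for the 16b * 16b product
// (hypothetical helper, only here to make the width visible).
constexpr unsigned product_bits()
{
    return static_cast<unsigned>(sizeof(product_t) * CHAR_BIT);
}
```

product_bits() is simply the width of int on the target, so I would expect it to differ between the desktop build and the AVR build even though the source is identical.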

There are integer promotion rules in the C/C++ standards that sometimes help you out by promoting B and C to be of type int or maybe unsigned int before performing the multiplication. That is why you get the expected result on x86 gcc, where an int has 32 bits. However, since an int in avr-gcc only has 16 bits, the integer promotion is not good enough for you. So you need to cast either B or C to an int32_t to ensure the result of the multiplication will be an int32_t as well. For example, you can do:

A + (int32_t)B * C
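If the multiplication shows up in several places in the fixed-point class, it may be worth wrapping the cast in a small helper so it cannot be forgotten. This is only a sketch under that assumption: mul_s16 and low16 are hypothetical names, not something from your code or from any library. low16 is there just to show which value is left when only the low 16 bits of the exact product are kept.

```cpp
#include <cstdint>

// Hypothetical helper: widen the first operand to int32_t so that the
// multiplication itself is carried out with at least 32 bits.
constexpr int32_t mul_s16(int16_t a, int16_t b)
{
    return static_cast<int32_t>(a) * b;
}

// Low 16 bits of a 32-bit value. Conversion to an unsigned type is
// well defined and wraps modulo 65536.
constexpr uint16_t low16(int32_t value)
{
    return static_cast<uint16_t>(value);
}
```

With the numbers from the question, mul_s16(9000, 9) gives 81000, and low16(81000) gives 15464, which matches the wrong result seen on the AVR. Since mul_s16 is constexpr, it can also be checked with static_assert at compile time on both toolchains.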