Tags: c++, c++builder, c++builder-2009

Understanding behavior of old C++ code


I am migrating some parts of old C++ code, originally compiled with CodeGear C++Builder® 2009 Version 12.0.3170.16989.

The following code, a minimal version of a bigger piece, outputs -34 with any modern compiler, but on the original platform it outputs 84:

char Key[4];    
Key[0] = 0x1F;
Key[1] = 0x01;
Key[2] = 0x8B;
Key[3] = 0x55;

for(int i = 0; i < 2; i++) {
    Key[i] = Key[2*i] ^ Key[2*i + 1];
}

std::cout << (int) Key[1] << std::endl;

The following code outputs -34 with both old and new compilers:

for(int i = 0; i < 2; i++) {
    char a = Key[2*i];
    char b = Key[2*i + 1];
    char c = a ^ b;
    Key[i] = c;
}

Also, manually unrolling the loop seems to work with both compilers:

Key[0] = Key[0] ^ Key[1];
Key[1] = Key[2] ^ Key[3];

It is important that I match the behavior of the old code. Can anyone please help me understand why the original compiler produces those results?


Solution

  • This seems to be a bug:

    The line

    Key[i] = Key[2*i] ^ Key[2*i + 1];
    

    generates the following code:

    00401184 8B55F8           mov edx,[ebp-$08]
    00401187 8A4C55FD         mov cl,[ebp+edx*2-$03]
    0040118B 8B5DF8           mov ebx,[ebp-$08]
    0040118E 304C1DFC         xor [ebp+ebx-$04],cl
    

    That does not match the source: Key[2*i] is never loaded. Instead, Key[2*i + 1] is XORed directly into Key[i], which is equivalent to:

    Key[i] ^= Key[i*2 + 1];
    

    And that explains how the result came to be: 0x01 ^ 0x55 is indeed 0x54, or 84.

    It should be something like:

    mov edx,[ebp-$08]
    mov cl,[ebp+edx*2-$04]
    xor cl,[ebp+edx*2-$03]
    mov [ebp+edx-$04],cl
    

    So this is definitely a code generation bug. It still seems to be present in the "classic" (Borland) compiler as of C++Builder 10.2 Tokyo.

    But if I use the "new" (clang) compiler, it produces 222 (0xDE, the same byte that prints as -34 when char is signed). The code produced is:

    File7.cpp.12: Key[i] = Key[2*i] ^ Key[2*i + 1];
    004013F5 8B45EC           mov eax,[ebp-$14]
    004013F8 C1E001           shl eax,$01
    004013FB 0FB64405F0       movzx eax,[ebp+eax-$10]
    00401400 8B4DEC           mov ecx,[ebp-$14]
    00401403 C1E101           shl ecx,$01
    00401406 0FB64C0DF1       movzx ecx,[ebp+ecx-$0f]
    0040140B 31C8             xor eax,ecx
    0040140D 88C2             mov dl,al
    0040140F 8B45EC           mov eax,[ebp-$14]
    00401412 885405F0         mov [ebp+eax-$10],dl
    

    That doesn't look optimal to me (I used O2 and O3 with the same result), but it produces the right result.