
Possible bug with variadic arguments on mingw64-gcc


I ran into an annoying bug and reduced it to a small example, but I'm still not 100% sure whether it is a compiler problem.

Let me give you some information about the version I used first.

x86_64-w64-mingw32-g++ --version

x86_64-w64-mingw32-g++.exe (Rev1, Built by MSYS2 project) 7.2.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

I know it is not the newest version, but it is the newest you could get for MSYS.

This is the example code:

#include <cstdint>
#include <stdio.h>
#include <string.h>
#include <cstdarg>

void test1(){
    uint64_t a = 0x3333333333333333;
    uint64_t b = 1;
    uint64_t c = 2;
    uint64_t d = 3;
    printf("output should be:\n3 2 1 0 3333333333333333\n");
    printf("but output is:\n%llx %llx %llx %llx %llx\n",d,c,b,0,a);
}
void test(uint64_t x1,uint64_t x2,uint64_t x3,uint64_t x4,uint64_t x5,uint64_t x6,
uint64_t x21,uint64_t x22,uint64_t x23,uint64_t x24,uint64_t x25,uint64_t x26,
uint64_t x31,uint64_t x32,uint64_t x33,uint64_t x34,uint64_t x35,uint64_t x36,
uint64_t x41,uint64_t x42,uint64_t x43,uint64_t x44,uint64_t x45,uint64_t x46){
    printf("start\n");
}
void test_(){
        test(0x7777777777777771,0x7777777777777772,0x7777777777777773,0x7777777777777774,0x7777777777777775,0x7777777777777776,
        0x7777777777777771,0x7777777777777772,0x7777777777777773,0x7777777777777774,0x7777777777777775,0x7777777777777776,
        0x7777777777777771,0x7777777777777772,0x7777777777777773,0x7777777777777774,0x7777777777777775,0x7777777777777776,
        0x7777777777777771,0x7777777777777772,0x7777777777777773,0x7777777777777774,0x7777777777777775,0x7777777777777776);
}
int main(int argc,char** argv){
    test_();
    test1();
}

and compiled & executed it with:

x86_64-w64-mingw32-g++ -O0 test.cpp && ./a.exe

Now comes the surprising part, the output is:

start
output should be:
3 2 1 0 3333333333333333
but output is:
3 2 1 7777777700000000 3333333333333333

In the example above I use printf to produce and visualize the problem.

It could happen with any other function that takes variadic arguments, not just printf.

For example: void blah(a,b,...)
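A minimal sketch of what such a function could look like. The names `sum_u64` and `sum_example` are hypothetical and not taken from the original code. The point it illustrates is that `va_arg` reads back exactly the type named in the call, so the function depends entirely on the caller passing arguments of that type:

```cpp
#include <cstdarg>
#include <cstdint>

// Hypothetical stand-in for blah(a, b, ...): sums `count` trailing uint64_t values.
uint64_t sum_u64(int count, ...) {
    va_list ap;
    va_start(ap, count);
    uint64_t total = 0;
    for (int i = 0; i < count; ++i) {
        // va_arg performs no conversion; it reads 8 bytes for each argument.
        total += va_arg(ap, uint64_t);
    }
    va_end(ap);
    return total;
}

// Example caller that gives every variadic argument an explicit 64-bit type.
uint64_t sum_example() {
    uint64_t a = 0x3333333333333333;
    return sum_u64(3, a, uint64_t{0}, uint64_t{1});
}
```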

For some reason the compiler does this unexpected thing. Searching on Google sadly didn't point me in the right direction.

This leads me to the question of whether this is really a compiler problem (Linux didn't show this behavior) or a programming mistake (like forgetting to cast the 0).

Taking a look at the disassembled code shows me the part that produces the problem:

objdump -M intel -S ./a.exe|egrep -A 30 'test1.+:'
0000000000401570 <_Z5test1v>:
  401570:       55                      push   rbp
  401571:       48 89 e5                mov    rbp,rsp
  401574:       48 83 ec 50             sub    rsp,0x50
  401578:       48 b8 33 33 33 33 33    movabs rax,0x3333333333333333
  40157f:       33 33 33
  401582:       48 89 45 f8             mov    QWORD PTR [rbp-0x8],rax
  401586:       48 c7 45 f0 01 00 00    mov    QWORD PTR [rbp-0x10],0x1
  40158d:       00
  40158e:       48 c7 45 e8 02 00 00    mov    QWORD PTR [rbp-0x18],0x2
  401595:       00
  401596:       48 c7 45 e0 03 00 00    mov    QWORD PTR [rbp-0x20],0x3
  40159d:       00
  40159e:       48 8d 0d 5b 7a 00 00    lea    rcx,[rip+0x7a5b]        # 409000 <.rdata>
  4015a5:       e8 a6 66 00 00          call   407c50 <_Z6printfPKcz>
  4015aa:       4c 8b 45 f0             mov    r8,QWORD PTR [rbp-0x10]
  4015ae:       48 8b 4d e8             mov    rcx,QWORD PTR [rbp-0x18]
  4015b2:       48 8b 45 e0             mov    rax,QWORD PTR [rbp-0x20]
  4015b6:       48 8b 55 f8             mov    rdx,QWORD PTR [rbp-0x8]
  4015ba:       48 89 54 24 28          mov    QWORD PTR [rsp+0x28],rdx
  4015bf:       c7 44 24 20 00 00 00    mov    DWORD PTR [rsp+0x20],0x0
  4015c6:       00
  4015c7:       4d 89 c1                mov    r9,r8
  4015ca:       49 89 c8                mov    r8,rcx
  4015cd:       48 89 c2                mov    rdx,rax
  4015d0:       48 8d 0d 59 7a 00 00    lea    rcx,[rip+0x7a59]        # 409030 <.rdata+0x30>
  4015d7:       e8 74 66 00 00          call   407c50 <_Z6printfPKcz>
  4015dc:       90                      nop
  4015dd:       48 83 c4 50             add    rsp,0x50
  4015e1:       5d                      pop    rbp
  4015e2:       c3                      ret

and I have absolutely no clue why it uses that DWORD store at address 4015bf. Maybe somebody can shed some light on my problem or test it with a newer MinGW version.

(I already tried a "bionic beaver" Docker image of Ubuntu, but sadly with the same result. Then again, it ships the same version of x86_64-w64-mingw32-g++ anyway.)


Solution

  • You have an argument type mismatch:

     printf("but output is:\n%llx %llx %llx %llx %llx\n",d,c,b,0,a);
    

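    The value 0 has type int, but the %llx format specifier expects an argument of type unsigned long long int. Passing an argument whose type does not match its format specifier invokes undefined behavior.

    This also explains the DWORD store at 4015bf. Under the Windows x64 calling convention only the first four arguments travel in registers (rcx, rdx, r8, r9), so the fifth argument, the int 0, goes on the stack at [rsp+0x20]. Because it is an int, the compiler writes only 32 bits, and the upper 32 bits of that 8-byte slot keep whatever was there before, here most likely leftovers from the earlier call to test() (0x77777777). On Linux the same argument lands in a register, and a 32-bit register write zero-extends to 64 bits, which is why the mistake stays hidden there.

    Because printf is a variadic function, the compiler can't automatically convert this value to the proper type. So you need to either use the correct format specifier: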

     printf("but output is:\n%llx %llx %llx %d %llx\n",d,c,b,0,a);
    

    Or cast the argument in question:

     printf("but output is:\n%llx %llx %llx %llu %llx\n",d,c,b,(unsigned long long)0,a);
    

    Or (in the case of a constant) use the proper type suffix:

     printf("but output is:\n%llx %llx %llx %llu %llx\n",d,c,b,0ULL,a);
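Since the variables in the question are uint64_t rather than unsigned long long, another option is the PRIx64 macro from <cinttypes>, which expands to the matching length modifier for the platform. The following is a minimal sketch, not the only way to do it. The helper name `format_hex5` is hypothetical, and it writes into a string through std::snprintf instead of printing. Because its parameters are ordinary uint64_t parameters and not variadic ones, a plain 0 at the call site is converted to 64 bits before it ever reaches the variadic call:

```cpp
#include <cassert>
#include <cinttypes>  // PRIx64
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: formats five 64-bit values as space-separated hex.
std::string format_hex5(uint64_t v1, uint64_t v2, uint64_t v3,
                        uint64_t v4, uint64_t v5) {
    char buf[128];  // 5 values * 16 hex digits + 4 spaces + terminator fits easily
    int n = std::snprintf(buf, sizeof buf,
                          "%" PRIx64 " %" PRIx64 " %" PRIx64 " %" PRIx64 " %" PRIx64,
                          v1, v2, v3, v4, v5);
    // snprintf returns the number of characters it wanted to write.
    assert(n > 0 && static_cast<std::size_t>(n) < sizeof buf);
    return std::string(buf);
}
```

Calling `format_hex5(d, c, b, 0, a)` with the values from the question should then yield `3 2 1 0 3333333333333333`. Independently of this, compiling with -Wall (which enables GCC's -Wformat) typically reports this kind of printf argument mismatch at compile time.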