c++ assembly x86 cpu-registers calling-convention

Why is __fastcall assembler code larger than the __stdcall one in MS C++?


I have disassembled two variants of a Swap function (a simple swap of the values behind two pointers).

1). __fastcall http://pastebin.com/ux5LMktz

2). __stdcall (as I understand it, a function without an explicit calling-convention modifier is __stdcall by default with the MS C++ compiler for Windows) http://pastebin.com/eGR6VUjX

As far as I know, __fastcall is implemented differently depending on the compiler, but basically it puts the first two arguments (left to right) into the ECX and EDX registers. The stack is used only for arguments that do not fit in those registers.

But in the first listing you can see that the value is pushed and then popped into the ECX register, and there is no real difference between the two variants of the swap function.

The __fastcall variant additionally contains:

00AA261F  pop         ecx  
00AA2620  mov         dword ptr [ebp-14h],edx
00AA2623  mov         dword ptr [ebp-8],ecx

These instructions do not appear in the __stdcall version.

So it doesn't look any more optimized (as __fastcall should be, by definition).

I'm new to assembly language and calling conventions, so I'm asking for advice. Maybe __fastcall really is faster in my sample and I just don't see it?

Thanks!


Solution

  • Try turning on optimization, then comparing the results. Your fastcall version has many redundant operations because it's not optimized.

    Here's the output of VS 2010 with /Ox.

    fastcall:

    ; _firstValue$ = ecx
    ; _secondValue$ = edx
    ?CallMe1@@YIXPAH0@Z PROC                ; CallMe1
        mov eax, DWORD PTR [ecx]
        push    esi
        mov esi, DWORD PTR [edx]
        cmp eax, esi
        je  SHORT $LN1@CallMe1
        mov DWORD PTR [ecx], esi
        mov DWORD PTR [edx], eax
    $LN1@CallMe1:
        pop esi
        ret 0
    ?CallMe1@@YIXPAH0@Z ENDP                ; CallMe1
    

    stdcall:

    _firstValue$ = 8                    ; size = 4
    _secondValue$ = 12                  ; size = 4
    ?CallMe2@@YGXPAH0@Z PROC                ; CallMe2
        mov edx, DWORD PTR _firstValue$[esp-4]
        mov eax, DWORD PTR [edx]
        push    esi
        mov esi, DWORD PTR _secondValue$[esp]
        mov ecx, DWORD PTR [esi]
        cmp eax, ecx
        je  SHORT $LN1@CallMe2
        mov DWORD PTR [edx], ecx
        mov DWORD PTR [esi], eax
    $LN1@CallMe2:
        pop esi
        ret 8
    ?CallMe2@@YGXPAH0@Z ENDP                ; CallMe2
    

    cdecl (what you mistakenly call stdcall in your example):

    _firstValue$ = 8                    ; size = 4
    _secondValue$ = 12                  ; size = 4
    ?CallMe3@@YAXPAH0@Z PROC                ; CallMe3
        mov edx, DWORD PTR _firstValue$[esp-4]
        mov eax, DWORD PTR [edx]
        push    esi
        mov esi, DWORD PTR _secondValue$[esp]
        mov ecx, DWORD PTR [esi]
        cmp eax, ecx
        je  SHORT $LN1@CallMe3
        mov DWORD PTR [edx], ecx
        mov DWORD PTR [esi], eax
    $LN1@CallMe3:
        pop esi
        ret 0
    ?CallMe3@@YAXPAH0@Z ENDP                ; CallMe3
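
The C++ source behind these listings is not shown, so the following is only a sketch reconstructed from the assembly: each function loads both values, compares them, and stores them back swapped only if they differ. The names CallMe1, CallMe2, CallMe3, firstValue and secondValue come from the listings; the body, the temp variable and the SWAP_* macros are assumptions made for illustration. The macros exist only so the sketch also compiles outside 32-bit MSVC, where the convention keywords are either unavailable or ignored.

```cpp
// Hypothetical wrappers: the convention keywords are MSVC extensions that only
// matter on 32-bit x86, so everywhere else these macros expand to nothing.
#if defined(_MSC_VER) && defined(_M_IX86)
#define SWAP_FASTCALL __fastcall
#define SWAP_STDCALL  __stdcall
#define SWAP_CDECL    __cdecl
#else
#define SWAP_FASTCALL
#define SWAP_STDCALL
#define SWAP_CDECL
#endif

// __fastcall: both pointers arrive in ECX and EDX, so there is nothing to
// clean off the stack (ret 0 in the listing).
void SWAP_FASTCALL CallMe1(int* firstValue, int* secondValue)
{
    if (*firstValue != *secondValue) {
        int temp = *firstValue;
        *firstValue = *secondValue;
        *secondValue = temp;
    }
}

// __stdcall: both pointers are passed on the stack and the callee removes
// them (ret 8 in the listing).
void SWAP_STDCALL CallMe2(int* firstValue, int* secondValue)
{
    if (*firstValue != *secondValue) {
        int temp = *firstValue;
        *firstValue = *secondValue;
        *secondValue = temp;
    }
}

// __cdecl: both pointers are passed on the stack and the caller removes
// them (ret 0 in the listing). This is the MSVC default for free functions.
void SWAP_CDECL CallMe3(int* firstValue, int* secondValue)
{
    if (*firstValue != *secondValue) {
        int temp = *firstValue;
        *firstValue = *secondValue;
        *secondValue = temp;
    }
}
```

To get a comparable listing, compiling a file like this as 32-bit code with something like cl /c /Ox /FA should write an .asm file next to the object file. The exact instructions will likely differ between compiler versions, but the pattern should hold: the fastcall version skips the two loads from the stack that the stdcall and cdecl versions need to fetch their arguments.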