c assembly root inline-assembly sqrt

Fastest assembly code for finding the square root. Explanation needed


I'm currently writing a program in C that needs to compute billions of square roots. I looked up which known code finds the square root fastest and came across this code, which is seemingly the fastest: https://www.codeproject.com/Articles/69941/Best-Square-Root-Method-Algorithm-Function-Precisi

double inline __declspec (naked) __fastcall sqrt(double n)
{
    _asm fld qword ptr[esp + 4]
    _asm fsqrt
    _asm ret 8
}

I don't know much about assembly language so can someone please explain what this code does algorithmically and what those keywords mean?


Solution

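  • This is a Microsoft-specific naked, __fastcall replacement for the standard sqrt function. There is no algorithm written out in software: fld loads the double argument from the stack onto the x87 floating-point stack, fsqrt replaces it with its square root using the hardware instruction, and ret 8 returns to the caller while popping the 8-byte argument. The result is left on the x87 stack, which is where 32-bit x86 code returns a double.

    For details on the keywords, see the Microsoft documentation.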

    The naked storage-class attribute is a Microsoft-specific extension to the C language. For functions declared with the naked storage-class attribute, the compiler generates code without prolog and epilog code. You can use this feature to write your own prolog/epilog code sequences using inline assembler code. Naked functions are particularly useful in writing virtual device drivers. See: Naked functions.

    The __fastcall calling convention specifies that arguments to functions are to be passed in registers, when possible. This calling convention only applies to the x86 architecture. Take a look at: __fastcall

    __fastcall was introduced a long time ago by Microsoft. Typically, fastcall calling conventions pass one or more arguments in registers, which reduces the number of memory accesses required for the call. With on-chip caching, passing arguments in registers is not as much of a gain as it used to be, and __stdcall may actually be faster now.