
How to use VC++ intrinsic functions without the run-time library


I'm involved in one of those challenges where you try to produce the smallest possible binary, so I'm building my program without the C or C++ run-time libraries (RTL). I don't link to the DLL version or the static version. I don't even #include the header files. I have this working fine.

Some RTL functions, like memset(), can be useful, so I tried adding my own implementation. It works fine in Debug builds (even for those places where the compiler generates an implicit call to memset()). But in Release builds, I get an error saying that I cannot define an intrinsic function. You see, in Release builds, intrinsic functions are enabled, and memset() is an intrinsic.
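As an illustration of such an implicit call, zero-initializing a large aggregate is one common trigger. This is a minimal sketch, assuming the hypothetical names `Big` and `sum_zeroed` (neither is part of my actual program); whether the compiler really emits a call to memset() here depends on the compiler version and the optimization settings.

```cpp
// Hypothetical illustration: there is no explicit memset() call below, yet a
// compiler may lower the zero-initialization of `b` to a call to memset().
struct Big {
    int values[64];
};

int sum_zeroed() {
    Big b = {};  // zero-initializes all 64 ints; may become memset(&b, 0, sizeof(b))
    int total = 0;
    for (int i = 0; i < 64; ++i) {
        total += b.values[i];
    }
    return total;  // every element is zero, so the sum is zero
}
```

If the linker reports an unresolved `_memset` for code like this, the reference came from the compiler rather than from anything written in the source.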

I would love to use the intrinsic for memset() in my release builds, since it's probably inlined and smaller and faster than my implementation. But I seem to be in a catch-22. If I don't define memset(), the linker complains that it's undefined. If I do define it, the compiler complains that I cannot define an intrinsic function.

Does anyone know the right combination of definition, declaration, #pragma, and compiler and linker flags to get an intrinsic function without pulling in RTL overhead?

Visual Studio 2008, x86, Windows XP+.

To make the problem a little more concrete:

extern "C" void * __cdecl memset(void *, int, size_t);

#ifdef IMPLEMENT_MEMSET
void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {
    char *p = reinterpret_cast<char *>(pTarget);
    while (cbTarget > 0) {
        *p++ = static_cast<char>(value);
        --cbTarget;
    }
    return pTarget;
}
#endif

struct MyStruct {
    int foo[10];
    int bar;
};

int main() {
    MyStruct blah;
    memset(&blah, 0, sizeof(blah));
    return blah.bar;
}

And I build like this:

cl /c /W4 /WX /GL /Ob2 /Oi /Oy /Gs- /GF /Gy intrinsic.cpp
link /SUBSYSTEM:CONSOLE /LTCG /DEBUG /NODEFAULTLIB /ENTRY:main intrinsic.obj

If I compile with my implementation of memset(), I get a compiler error:

error C2169: 'memset' : intrinsic function, cannot be defined

If I compile this without my implementation of memset(), I get a linker error:

error LNK2001: unresolved external symbol _memset

Solution

  • I think I finally found a solution:

    First, in a header file, declare memset() with a pragma, like so:

    extern "C" void * __cdecl memset(void *, int, size_t);
    #pragma intrinsic(memset)
    

    That allows your code to call memset(). In most cases, the compiler will inline the intrinsic version.

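    Second, in a separate implementation file, provide an implementation. The trick to keeping the compiler from complaining about redefining an intrinsic function is to precede the definition with #pragma function(memset), which tells the compiler to generate real calls to memset() in that file instead of treating it as an intrinsic: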

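    #pragma function(memset)
    // extern "C" keeps the symbol name unmangled even if this file does not include the header.
    extern "C" void * __cdecl memset(void *pTarget, int value, size_t cbTarget) {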
        unsigned char *p = static_cast<unsigned char *>(pTarget);
        while (cbTarget-- > 0) {
            *p++ = static_cast<unsigned char>(value);
        }
        return pTarget;
    }
    

    This provides an implementation for those cases where the optimizer decides not to use the intrinsic version.
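To sanity-check a fallback like this outside the no-RTL build, the same loop can be exercised under a different name. This is a minimal sketch, assuming the hypothetical names `fill_bytes` and `check_fill` (neither appears in my actual code; the rename only avoids clashing with the real memset() in a hosted test build).

```cpp
#include <cstddef>  // std::size_t (only needed in a hosted test build)

// Hypothetical stand-in for the fallback memset() above: same loop, new name.
void *fill_bytes(void *pTarget, int value, std::size_t cbTarget) {
    unsigned char *p = static_cast<unsigned char *>(pTarget);
    while (cbTarget-- > 0) {
        *p++ = static_cast<unsigned char>(value);  // only the low byte is stored
    }
    return pTarget;
}

// Returns true if fill_bytes() behaves like memset() on a small buffer.
bool check_fill() {
    unsigned char buf[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    void *result = fill_bytes(buf, 0x1FF, 6);  // 0x1FF truncates to 0xFF
    if (result != buf) return false;           // must return the original pointer
    for (int i = 0; i < 6; ++i) {
        if (buf[i] != 0xFF) return false;      // first six bytes are overwritten
    }
    return buf[6] == 7 && buf[7] == 8;         // bytes past the count are untouched
}
```

The check covers the three properties callers tend to rely on: the return value is the original pointer, the fill value is truncated to a single byte, and nothing past the requested count is written.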

    The remaining drawback is that you have to disable whole-program optimization (/GL on the compiler and /LTCG on the linker). I'm not sure why. If someone finds a way to do this without disabling global optimization, please chime in.