c++ low-latency prefetch

How can I prefetch infrequently used code?


I want to prefetch some code into the instruction cache. The code path is used infrequently but I need it to be in the instruction cache or at least in L2 for the rare cases that it is used. I have some advance notice of these rare cases. Does _mm_prefetch work for code? Is there a way to get this infrequently used code in cache? For this problem I don't care about portability so even asm would do.


Solution

  • The answer depends on your CPU architecture.

    That said, if you are using gcc or clang, you can use the __builtin_prefetch builtin function to ask the compiler for a prefetch instruction. On Pentium 3 and later x86-type architectures, this generates a PREFETCHh instruction, which requests a load into the data cache hierarchy rather than the instruction cache. Since these architectures have unified L2 and higher caches, prefetching code this way may still help.

    The function looks like this:

    void __builtin_prefetch(const void *addr, int rw, int locality);
    

    The last two arguments are optional. rw must be a compile-time constant, either 0 (prepare for a read, the default) or 1 (prepare for a write); for code you want 0. locality must be a compile-time constant in the range 0...3 and defaults to 3. Assuming locality maps directly to the h part of the PREFETCHh instruction, you want to pass 1 or 2, which ask for the data to be loaded into the L2 and higher caches. See the PREFETCHh entry in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B: Instruction Set Reference, M-Z, page 4-277.

    If you're using another compiler that doesn't have __builtin_prefetch, see whether it has the _mm_prefetch function. You may need to include a header file to get that function. For example, on OS X, that function and the constants for its hint argument (such as _MM_HINT_T1) are declared in xmmintrin.h.
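
To make this concrete, here is a minimal sketch of how you might issue the hints over a block of code. Everything named here is an assumption made for illustration: rare_path stands in for your infrequently used function, prefetch_code_range and PREFETCH_HINT are hypothetical helpers, the 64-byte line size is a guess that fits most current x86 parts, and the byte count is something you would have to estimate yourself (for example from a linker map), since C++ has no way to ask for the size of a function. Converting a function pointer to const void * is only conditionally supported by the standard, though gcc and clang accept it. Whether any of this actually shortens the first call is something to measure on your hardware.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

#if defined(__GNUC__) || defined(__clang__)
// gcc and clang: rw = 0 (read), locality = 2 (moderate temporal locality).
#define PREFETCH_HINT(p) __builtin_prefetch((p), 0, 2)
#else
// Other x86 compilers: fall back to the SSE intrinsic.
#include <xmmintrin.h>
#define PREFETCH_HINT(p) _mm_prefetch((p), _MM_HINT_T1)
#endif

// Assumed cache line size; check your CPU's documentation.
constexpr std::size_t kAssumedLineSize = 64;

// Hypothetical helper: issues one prefetch hint per assumed cache line
// for `bytes` bytes starting at `start`. Returns the number of hints issued.
// These are data-side prefetches, so at best they pull the bytes into the
// unified cache levels; nothing is placed in the L1 instruction cache.
inline std::size_t prefetch_code_range(const void* start, std::size_t bytes) {
    assert(start != nullptr);
    const std::uintptr_t base = reinterpret_cast<std::uintptr_t>(start);
    std::size_t hints = 0;
    for (std::size_t off = 0; off < bytes; off += kAssumedLineSize) {
        PREFETCH_HINT(reinterpret_cast<const char*>(base + off));
        ++hints;
    }
    return hints;
}

// Hypothetical stand-in for the rarely executed code path.
int rare_path(int x) { return 2 * x + 1; }
```

When you get your advance notice, you would call something like prefetch_code_range(reinterpret_cast<const void*>(&rare_path), 256), where 256 is a made-up size estimate, and then call rare_path as usual once the rare case arrives. Issuing the hints too early risks the lines being evicted again before they are needed, so the timing is worth tuning as well.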