llvm, llvm-clang, llvm-ir

What is the purpose of the intrinsics cvta_shared_yes, cvta_shared_yes_64, cvta_to_shared_yes_64, etc. in LLVM?


In the LLVM source tree we can see the intrinsics cvta_shared_yes, cvta_shared_yes_64, and cvta_to_shared_yes_64, with similar ones for other memory types such as global, local, and constant. What is their purpose? Do they define the behavior of the memory types? If so, can we add a new intrinsic?


Solution

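  • These intrinsics are used by the NVPTX backend to emit the PTX cvta (convert address) instructions, which convert pointers from the global, local, shared, or constant address space to the generic address space and back. The cvta_<space> forms convert to a generic address, the cvta_to_<space> forms convert from one, and the _64 suffix marks the 64-bit pointer variants. They are specific to the NVPTX backend and reflect the memory hierarchy of an Nvidia (CUDA) GPU.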

    If you want to add intrinsics to LLVM, have a look at the llvm/include/llvm/IR/Intrinsics*.td TableGen files. TableGen processes these files to generate the IDs, names, type signatures, and attributes LLVM needs for each intrinsic. For example:

    def int_memcpy : Intrinsic<[],
                               [llvm_anyptr_ty, llvm_anyptr_ty, llvm_anyint_ty,
                                llvm_i32_ty, llvm_i1_ty],
                               [IntrReadWriteArgMem, NoCapture<0>, NoCapture<1>,
                                ReadOnly<1>]>;
    

    defines the llvm.memcpy intrinsic, which a backend can lower to a call to the memcpy function of a specific system.

    However, keep in mind that the backend has to support your new intrinsic somehow. You can look at ./llvm/lib/Target/X86/X86ISelLowering.cpp to see how the X86 backend handles the llvm.memcpy intrinsic.
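
As a rough illustration of the steps above, a pair of address-space conversion intrinsics for a new target might be declared along the following lines. This is only a sketch under stated assumptions: the target prefix "mytarget", the "scratch" memory space, and both intrinsic names are hypothetical and do not exist in LLVM. Only Intrinsic, llvm_anyptr_ty, IntrNoMem, and TargetPrefix are existing TableGen names.

```tablegen
// Hypothetical sketch: "mytarget", "scratch", and both intrinsic names are
// invented for illustration; none of them exist in LLVM.
let TargetPrefix = "mytarget" in {
  // Converts a pointer in the (hypothetical) scratch address space to a
  // generic pointer. llvm_anyptr_ty makes the intrinsic overloaded on the
  // pointer type; IntrNoMem states that the conversion itself does not
  // read or write memory.
  def int_mytarget_ptr_scratch_to_gen
      : Intrinsic<[llvm_anyptr_ty],   // result: generic pointer
                  [llvm_anyptr_ty],   // operand: scratch pointer
                  [IntrNoMem]>;

  // The reverse direction: generic pointer to scratch pointer.
  def int_mytarget_ptr_gen_to_scratch
      : Intrinsic<[llvm_anyptr_ty], [llvm_anyptr_ty], [IntrNoMem]>;
}
```

With the default naming rule, the first record would be exposed in IR as llvm.mytarget.ptr.scratch.to.gen. The declaration only makes the intrinsic known to LLVM. The backend would still need an instruction-selection pattern or custom lowering code that turns a call to it into real target instructions.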