Search code examples
cudanvidiaraycastingoptix

Is Hardware Accelerated Min/Max Ray Casting Available with Cuda/Optix?


I am wondering if it is possible in Cuda or Optix to accelerate the computation of the minimum and maximum value along a line/ray casted from one point to another in a 3D volume.

If not, is there any special hardware on Nvidia GPU's that can accelerate this function (particularly on Volta GPUs or Tesla K80's)?


Solution

  • The short answer to the title question is: yes, hardware accelerated ray casting is available in CUDA & OptiX. The longer question has multiple interpretations, so I'll try to outline the different possibilities.

    The different axes of your question that I'm seeing are: CUDA vs OptiX, pre-RTX GPUs vs RTX GPUs (e.g., Volta vs Ampere), min ray queries vs max ray queries, and possibly surface representations vs volume representations.

    pre-RTX vs RTX GPUs:

    To perhaps state the obvious, a K80 or a GV100 GPU can be used to accelerate ray casting compared to a CPU, due to the highly parallel nature of the GPU. However, these pre-RTX GPUs don't have any hardware that is specifically dedicated to ray casting. There are bits of somewhat special purpose hardware not dedicated to ray casting that you could probably leverage in various ways, so up to you to identify and design these kinds of hardware acceleration hacks.

    The RTX GPUs starting with the Turing architecture do have specialized hardware dedicated to ray casting, so they accelerate ray queries even further than the acceleration you get from using just any GPU to parallelize the ray queries.

    CUDA vs OptiX:

    CUDA can be used for parallel ray tracing on any GPUs, but it does not currently (as I write this) support access to the specialized RTX hardware for ray tracing. When using CUDA, you would be responsible for writing all the code to build an acceleration structure (e.g. BVH) & traverse rays through the acceleration structure, and you would need to write the intersection and shading or hit-processing programs.

    OptiX, Direct-X, and Vulkan all allow you to access the specialized ray-tracing hardware in RTX GPUs. By using these APIs, one can achieve higher speeds with lower power requirements, and they also require much less effort because the intersections and ray traversal through an acceleration structure are provided for you. These APIs also provide other commonly needed features for production-level ray casting, things like instancing, transforms, motion blur, as well as a single-threaded programming model for processing ray hits & misses.

    Min vs Max ray queries:

    OptiX has built-in functionality to return the surface intersection closest to the ray origin, i.e. a 'min query'. OptiX does not provide a similar single query for the furthest intersection (which is what I assume you mean by "max"). To find the maximum distance hit, or the closest hit to a second point on your ray, you would need to track through multiple hits and keep track of the hit that you want.

    In CUDA you're on your own for detecting both min and max queries, so you can do whatever you want as long as you can write all the code.

    Surfaces vs Volumes:

    Your question mentioned a "3D volume", which has multiple meanings, so just to clarify things:

    OptiX (+ DirectX + Vulkan) are APIs for ray tracing of surfaces, for example triangles meshes. The RTX specialty hardware is dedicated to accelerating ray tracing of surface based representations.

    If your "3D volume" is referring to a volumetric representation such as voxel data or a tetrahedral mesh, then surface-based ray tracing might not be the fastest or most appropriate way to cast ray queries. In this case, you might want to use "ray marching" techniques in CUDA, or look at volumetric ray casting APIs for GPUs like NanoVDB.