In a past question, DirectX12 Upload Synchronization D3D12_HEAP_TYPE_UPLOAD, I got into trouble unmapping an upload resource, using it in a command list and executing, then mapping again and overwriting the data before the GPU had used it.
I must have assumed that mapping a second time would give me different memory to write into if the GPU hadn't finished using the unmapped data.
So if that's not the case, what is the point of unmapping in DirectX 12?
Chuck Walbourn said:
Take data from the CPU and copy it into the 'intermediate' resource (unmapping it when complete since there's no need to keep the virtual memory address assignment around).
I guess I don't even know whether that virtual memory is in CPU or GPU memory. Maybe it's not in standard CPU or GPU memory at all; it could be in some special memory on the GPU, or it could be device-dependent, hence the vagueness about what the virtual memory actually is.
First, I think it's worth addressing the remapping issue.
In DX11, the driver does all the heavy lifting, so when you map a resource (write/discard), the driver is doing a bunch of work under the hood: specifically, allocating a new buffer and returning you its address (referred to as "resource renaming"). The driver tracks when the GPU is done with a particular bit of memory, and manages when unused memory can be reused.
For modern APIs (both DX12 and Vulkan), when you create a resource, the resource is explicitly bound to a location in memory. It's a much thinner layer (you're closer to the metal). When you map, you get a pointer. You can keep the resource mapped forever; the returned pointer will always be valid, and will always point to the address in memory the GPU will read from. The advantage here is that, since your application knows how it'll be using these resources, you can optimize for your specific use case. For example, if you have a constant buffer for view-dependent data that updates once a frame, and you're buffering 3 frames, you can just create 3 resources, map them all, and round-robin through them, saving the overhead of extra API calls.
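Here's a CPU-side sketch of that round-robin idea, with the GPU fence simulated by a plain completed-frame counter and byte arrays standing in for the mapped upload-heap pointers. All the names here (`FrameRing`, `kFrameCount`, etc.) are illustrative, not part of any API:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kFrameCount = 3;   // frames in flight
constexpr std::size_t kBufferSize = 256; // bytes per per-frame buffer

struct FrameRing {
    // Stand-ins for the three persistently-mapped upload buffers.
    std::array<std::array<std::uint8_t, kBufferSize>, kFrameCount> buffers{};
    std::uint64_t nextFrame = 0;      // frames the CPU has started
    std::uint64_t completedFrame = 0; // frames the (simulated) GPU has finished

    // Hand out this frame's buffer. A real renderer would wait on the fence
    // here; this sketch just asserts the GPU isn't too far behind, i.e. the
    // slot we're about to write is no longer in flight.
    std::uint8_t* beginFrame() {
        assert(nextFrame < completedFrame + kFrameCount &&
               "slot still in flight: writing now would corrupt GPU reads");
        return buffers[nextFrame % kFrameCount].data();
    }
    void endFrame()  { ++nextFrame; }       // CPU submitted the frame
    void gpuSignal() { ++completedFrame; }  // GPU finished one more frame
};
```

Each frame writes only into slot `nextFrame % kFrameCount`, so as long as the fence guarantees the GPU is fewer than `kFrameCount` frames behind, the CPU never overwrites mapped memory the GPU is still reading; that's exactly the invariant my original code was violating.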
On the virtual memory front: when you map, the pointer you get back is a virtual memory address mapped to somewhere in physical, CPU-side memory. So the mapping is definitely to physical CPU memory. How that memory is made visible to the GPU is device/system dependent, but I believe in most cases the memory lives in CPU-side memory and is read by the GPU over the PCIe bus (which is why you copy the data onward rather than just letting the GPU read from that resource directly).
Given that most apps these days are built for 64-bit architectures, we're generally not that limited on virtual address space, but it's still not a bad idea to unmap if you're not going to use the mapping again, since it's still consuming resources (page-table entries for the virtual memory mapping, etc.).
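Concretely, the map-once pattern uses D3D12's actual Map/Unmap signatures roughly like this (a sketch only, not a complete program; `buffer` is assumed to be an already-created resource on a D3D12_HEAP_TYPE_UPLOAD heap, and error handling is omitted):

```cpp
// Map once, up front. The read range is empty (Begin == End) to tell the
// runtime the CPU will only write through this pointer, never read back.
UINT8* mapped = nullptr;
D3D12_RANGE readRange{ 0, 0 };
buffer->Map(0, &readRange, reinterpret_cast<void**>(&mapped));

// ... keep 'mapped' around and write into it each frame,
//     synchronized with a fence so the GPU isn't still reading it ...

// Only when you're done with the resource for good, release the virtual
// address assignment (nullptr = "may have written anywhere").
buffer->Unmap(0, nullptr);
```

The pointer stays valid between Map and Unmap, so there's no correctness reason to unmap per frame; unmapping is just the cleanup step once the mapping is no longer needed.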