Direct3D virtual GPU address


Here is code that creates a constant buffer for uploading data to GPU memory:

    void BoxApp::BuildConstantBuffers()
    {
        mObjectCB = std::make_unique<UploadBuffer<ObjectConstants>>(md3dDevice.Get(), 1, true);

        UINT objCBByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof(ObjectConstants));

        D3D12_GPU_VIRTUAL_ADDRESS cbAddress = mObjectCB->Resource()->GetGPUVirtualAddress();

        D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc;
        cbvDesc.BufferLocation = cbAddress;
        cbvDesc.SizeInBytes = objCBByteSize;

        md3dDevice->CreateConstantBufferView(
            &cbvDesc,
            mCbvHeap->GetCPUDescriptorHandleForHeapStart());
    }

Where UploadBuffer is:

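    template<typename T>
    class UploadBuffer
    {
    public:
        UploadBuffer(ID3D12Device* device, UINT elementCount, bool isConstantBuffer) :
            mIsConstantBuffer(isConstantBuffer)
        {
            mElementByteSize = sizeof(T);

            // Constant buffer elements are padded to the size the hardware requires.
            if(isConstantBuffer)
                mElementByteSize = d3dUtil::CalcConstantBufferByteSize(sizeof(T));

            // Named locals avoid taking the address of a temporary,
            // which conforming compilers reject.
            CD3DX12_HEAP_PROPERTIES heapProps(D3D12_HEAP_TYPE_UPLOAD);
            CD3DX12_RESOURCE_DESC bufferDesc =
                CD3DX12_RESOURCE_DESC::Buffer(mElementByteSize*elementCount);

            ThrowIfFailed(device->CreateCommittedResource(
                &heapProps,
                D3D12_HEAP_FLAG_NONE,
                &bufferDesc,
                D3D12_RESOURCE_STATE_GENERIC_READ,
                nullptr,
                IID_PPV_ARGS(&mUploadBuffer)));

            // Keep the buffer mapped; mMappedData is the CPU address used for writes.
            ThrowIfFailed(mUploadBuffer->Map(0, nullptr, reinterpret_cast<void**>(&mMappedData)));
        }

        // The rest of the class (including Resource()) is not shown in the question.

    private:
        // Members are not shown in the excerpt; types are inferred from their usage above.
        Microsoft::WRL::ComPtr<ID3D12Resource> mUploadBuffer;
        BYTE* mMappedData = nullptr;
        UINT mElementByteSize = 0;
        bool mIsConstantBuffer = false;
    };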
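
The helper d3dUtil::CalcConstantBufferByteSize is not shown in the question. A minimal sketch of what such a helper presumably does, assuming it rounds the size up to the 256-byte multiple that D3D12 requires for constant buffer views, is given below. The names AlignConstantBufferSize and ElementGpuAddress are hypothetical, and the code uses plain integers rather than any D3D12 types so that it compiles on its own:

```cpp
#include <cassert>
#include <cstdint>

// D3D12 requires constant buffer view sizes and locations to be
// multiples of 256 bytes.
constexpr std::uint32_t kConstantBufferAlignment = 256;

// Hypothetical stand-in for d3dUtil::CalcConstantBufferByteSize:
// round byteSize up to the next multiple of 256 by adding 255 and
// clearing the low 8 bits.
constexpr std::uint32_t AlignConstantBufferSize(std::uint32_t byteSize)
{
    return (byteSize + (kConstantBufferAlignment - 1)) &
           ~(kConstantBufferAlignment - 1);
}

// Hypothetical helper: GPU virtual address of element `index` in a buffer
// whose first byte is at baseGpuAddress (the value a call such as
// GetGPUVirtualAddress would return). std::uint64_t matches the width of
// D3D12_GPU_VIRTUAL_ADDRESS.
inline std::uint64_t ElementGpuAddress(std::uint64_t baseGpuAddress,
                                       std::uint32_t elementByteSize,
                                       std::uint32_t index)
{
    // Each element must start on a 256-byte boundary to be usable as a CBV.
    assert(elementByteSize % kConstantBufferAlignment == 0);
    return baseGpuAddress + static_cast<std::uint64_t>(elementByteSize) * index;
}

// A 64-byte struct (for example one 4x4 float matrix) is padded to 256 bytes.
static_assert(AlignConstantBufferSize(64) == 256, "64 bytes pads to 256");
```

The same element offset applies on the CPU side: adding it to the mapped pointer gives the place to write element `index`, while adding it to the GPU virtual address gives the value for BufferLocation.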

CreateConstantBufferView uses two addresses: 1) the heap start (the handle returned by mCbvHeap->GetCPUDescriptorHandleForHeapStart()), and 2) the GPU virtual address passed in the BufferLocation field.

Where will the constant buffer physically be created? Why does this method use two different addresses?


Solution

  • The CPU address is used when the CPU is accessing the memory. In the case of the data in D3D12_HEAP_TYPE_UPLOAD, that address is used to write data into the resource because it's in some kind of 'shared memory' that both the CPU & GPU can access. The CPU address is a virtual memory address mapped to the correct physical location for the kind of access that is required.

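    The GPU address is used when the GPU is accessing the memory, typically when it's used by the input assembler for geometry (vertex and index buffers, VB/IB) or inside the sampler/texture descriptor heaps. For D3D12_HEAP_TYPE_DEFAULT, the memory is only accessible to the GPU so there's not really a CPU address. The GPU can directly read D3D12_HEAP_TYPE_UPLOAD resources as well. The GPU address is a virtual address that's specific to the GPU's memory architecture. In the code above, BufferLocation is this GPU address of the buffer data; the handle from GetCPUDescriptorHandleForHeapStart is not a second address of the buffer but the CPU-side location of the descriptor slot into which the view is written.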

    For Unified Memory Architecture (UMA) systems like Xbox One, the CPU and GPU addresses are often the same virtual memory address.

    You load data into a D3D12_HEAP_TYPE_DEFAULT resource by first copying it into a D3D12_HEAP_TYPE_UPLOAD resource via Map on the CPU, then recording a copy command on a command list so the GPU actually copies the data from there to the D3D12_HEAP_TYPE_DEFAULT resource.

    In the case of constants, these are usually in D3D12_HEAP_TYPE_UPLOAD heaps. While you can place VBs and IBs in upload heaps as well, that is really only useful for "usage dynamic" style resources that are updated every frame. On most GPU architectures, it's faster to keep static geometry as a "usage static" style resource in D3D12_HEAP_TYPE_DEFAULT. Since constants usually change every frame, it doesn't make sense to put them in D3D12_HEAP_TYPE_DEFAULT.

    Remember that you, the application developer, are responsible for the CPU/GPU synchronization via fences, so you need to make sure you don't change the memory on the CPU while the GPU still needs it. Except for very simple cases (like in this sample where it basically creates one constant buffer resource per backbuffer frame), you usually need some kind of linear allocator memory manager for constants. For an example, see GraphicsMemory in the DirectX Tool Kit for DX12.

    One final issue is that the render target usually has to be in GPU accessible memory that is not accessible by the CPU for performance reasons even on UMA systems. In some cases the GPU actually works in 'tiles' which also has implications for the render target buffer. The purpose of a D3D12_HEAP_TYPE_READBACK heap is to optimize the case where you want to have the GPU write the data from a render target once to a place that the CPU can read but not write.
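
The point above about fences and a linear allocator for constants can be sketched as a CPU-only model. This is not the DirectX Tool Kit's GraphicsMemory and makes no D3D12 calls. The class FrameLinearAllocator and its methods are hypothetical, and the completedFence argument stands in for the value an application would read from its fence. The sketch assumes one upload buffer split into one region per in-flight frame:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model: one upload buffer divided into `frameCount` regions.
// Each region is filled linearly during a frame and may only be reused once
// the GPU has passed the fence value recorded for that region.
class FrameLinearAllocator
{
public:
    FrameLinearAllocator(std::size_t bytesPerFrame, std::size_t frameCount)
        : mBytesPerFrame(bytesPerFrame), mFrameFence(frameCount, 0)
    {
        assert(frameCount > 0);
    }

    // Returns false if the GPU may still be reading the current region;
    // the caller would then wait on the fence and try again.
    bool BeginFrame(std::uint64_t completedFence)
    {
        if (mFrameFence[mFrame] > completedFence)
            return false;
        mOffset = 0; // region is free, start filling it from the beginning
        return true;
    }

    // Reserves byteSize bytes, padded to 256, and writes the byte offset
    // from the start of the whole buffer to outOffset. Returns false when
    // the current region is full.
    bool TryAllocate(std::size_t byteSize, std::size_t& outOffset)
    {
        const std::size_t aligned = (byteSize + 255) & ~static_cast<std::size_t>(255);
        if (mOffset + aligned > mBytesPerFrame)
            return false;
        outOffset = mFrame * mBytesPerFrame + mOffset;
        mOffset += aligned;
        return true;
    }

    // Records the fence value signaled after this frame's commands were
    // submitted, then moves on to the next region.
    void EndFrame(std::uint64_t signaledFence)
    {
        mFrameFence[mFrame] = signaledFence;
        mFrame = (mFrame + 1) % mFrameFence.size();
    }

private:
    std::size_t mBytesPerFrame;
    std::vector<std::uint64_t> mFrameFence; // fence value guarding each region
    std::size_t mFrame = 0;                 // index of the region being filled
    std::size_t mOffset = 0;                // next free byte in that region
};
```

In a real application the returned offset would be added both to the mapped CPU pointer (to write the constants) and to the buffer's GPU virtual address (to bind them), and BeginFrame returning false is the point at which the CPU must wait for the GPU.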