I am trying to figure out how to use ray tracing in Vulkan. What I have so far is a rasterized triangle (mostly the same as in the Vulkan tutorials, but refactored into my own code and utility functions). Recreating the swapchain after window resizing, minimizing, etc. works correctly.
Then I added the creation of a bottom-level acceleration structure. Since then I get Device Lost errors, but not on every run. Here is a skeleton of my code:
mainLoop:
- acquireNextImage
- renderWorld
- render gui
- present
acquireNextImage:
- signaling a semaphore "Image Available"
render gui:
- wait for fence "Gui"
- begin command buffer (use graphics queue)
- begin render pass
- bind pipeline (graphics)
- set viewport
- set scissor
- draw (only 2 triangles in my GUI so far)
- end render pass
- end command buffer
- reset fence "Gui"
- submit command buffer (waiting for semaphore "Image Available", signaling semaphore "Rendering Done", checking fence "Gui")
present:
- queue present (waiting for semaphore "Rendering Done")
renderWorld:
(actually this does not render anything yet; it just creates a vertex buffer and an index buffer, builds the bottom-level acceleration structure for them, and caches it for later frames, so this code runs only once at startup!)
- == vertexbuffer ==
- create buffer (usage transfer dst)
- get buffer memory requirements
- allocate memory (device local)
- bind buffer memory
- create buffer (usage transfer src) (this is my staging buffer)
- get buffer memory requirements
- allocate memory (host visible, host coherent) (this is my staging memory)
- bind buffer memory
- wait for fence "Transfer"
- begin command buffer (use dedicated transfer queue)
- copy buffer
- end command buffer
- reset fence "Transfer"
- submit command buffer (checking fence "Transfer")
- wait queue idle
- free staging memory
- destroy staging buffer
- == indexbuffer ==
- create buffer (usage transfer dst)
- get buffer memory requirements
- allocate memory (device local)
- bind buffer memory
- create buffer (usage transfer src) (this is my staging buffer)
- get buffer memory requirements
- allocate memory (host visible, host coherent) (this is my staging memory)
- bind buffer memory
- wait for fence "Transfer"
- begin command buffer (use dedicated transfer queue)
- copy buffer
- end command buffer
- reset fence "Transfer"
- submit command buffer (checking fence "Transfer")
- wait queue idle
- free staging memory
- destroy staging buffer
- == bottom acceleration structure ==
- get acceleration structure build sizes info
- create buffer (acceleration structure storage, shader device address)
- create acceleration structure (bottom level)
- create buffer (acceleration structure storage, shader device address) (this is my scratch buffer)
- wait for fence "BuildAcc"
- begin command buffer (use dedicated compute queue)
- pipeline barrier (transfer write -> acceleration structure write), don't know if I need it here
- build acceleration structures
- pipeline barrier (acceleration structure write -> shader read), don't know if I need it here
- end command buffer
- reset fence "BuildAcc"
- submit command buffer (checking fence "BuildAcc")
- wait queue idle <------- here device lost
- destroy scratch buffer
So, if I comment out "renderWorld", everything works fine. If I leave renderWorld in, I get a device lost error (see the last "wait queue idle" line), but not on every run of the program.
If I put a breakpoint on the "wait queue idle" line and continue after the program stops there, everything is fine. Also, if I comment out the "build acceleration structures" command, everything works (of course there is no acceleration structure then, but there is no device lost either).
So I don't know where the problem is. My guess is that I need to synchronize something, because with the breakpoint the program runs and works, so the code itself should be OK as far as I understand it.
There are also no validation errors except for missing shader bindings, but since I do not use them yet (I don't render with ray tracing yet), that should not be a problem.
After the device lost error, the other semaphores are not reset and my GUI render path no longer works either.
Can somebody tell me whether I missed a synchronization somewhere, and how I can add it? I have not copied my whole code here because it would be tons of code. If you need a specific piece, ask for it and I will paste it here.
I've found the problem! This post was the right hint and seems to describe the same problem as mine:
https://www.reddit.com/r/vulkan/comments/fqu90h/submitting_command_buffer_to_compute_queue_for/
I had not allocated memory for my acceleration structure buffer, this one:
- == bottom acceleration structure ==
- get acceleration structure build sizes info
- create buffer (acceleration structure storage, shader device address) <---- no memory
Before I found the post on Reddit I had already allocated memory for my scratch buffer, because I assumed it was needed since I query the device address from it. But somewhere on the web I had read that the acceleration structure buffer does not need any memory allocation and that this is done behind the scenes. That is not the case :)
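In other words, the acceleration structure buffer needs the same steps as the vertex and index buffers: get buffer memory requirements, allocate memory, bind buffer memory. In real Vulkan code those are vkGetBufferMemoryRequirements, vkAllocateMemory and vkBindBufferMemory, and because the buffer is created with shader device address usage, the allocation also needs a VkMemoryAllocateFlagsInfo with VK_MEMORY_ALLOCATE_DEVICE_ADDRESS_BIT chained in. What follows is only a minimal sketch of the memory type selection part of the "allocate memory (device local)" step. The name findMemoryType and the k... constants are made up for illustration, and plain integers stand in for the Vulkan types so that the snippet compiles without the SDK.

```cpp
#include <cstdint>
#include <vector>

// Stand-in constants (assumption: used here instead of including vulkan.h).
// The numeric values match the VK_MEMORY_PROPERTY_*_BIT enums.
constexpr std::uint32_t kDeviceLocal  = 0x1; // VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT
constexpr std::uint32_t kHostVisible  = 0x2; // VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT
constexpr std::uint32_t kHostCoherent = 0x4; // VK_MEMORY_PROPERTY_HOST_COHERENT_BIT

// Hypothetical helper, not part of the Vulkan API.
// typeBits  : the memoryTypeBits value from the buffer's memory requirements
// wanted    : property flags the memory must have (for example kDeviceLocal)
// typeFlags : property flags of each memory type the physical device reports
// Returns the index of the first suitable memory type, or -1 if there is none.
std::int32_t findMemoryType(std::uint32_t typeBits,
                            std::uint32_t wanted,
                            const std::vector<std::uint32_t>& typeFlags)
{
    for (std::uint32_t i = 0; i < typeFlags.size() && i < 32; ++i) {
        const bool allowedForBuffer = (typeBits & (1u << i)) != 0;
        const bool hasAllWanted = (typeFlags[i] & wanted) == wanted;
        if (allowedForBuffer && hasAllWanted) {
            return static_cast<std::int32_t>(i);
        }
    }
    return -1; // no memory type fits; the caller has to handle this
}
```

The returned index would go into the memoryTypeIndex field of VkMemoryAllocateInfo. The resulting memory then has to be bound to the acceleration structure buffer before the buffer is passed to the acceleration structure creation and build.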
Now my code works and I can continue learning and build the top-level acceleration structure next :)