c++ opengl opengl-3 vertex-array-object geometry-instancing

Inconsistent behavior in instance rendering with glDrawElementsInstanced, sometimes no rendering with no errors


I've been working on a project using OpenGL. Particles are rendered using instanced draw calls.

The issue is that glDrawElementsInstanced sometimes renders nothing, and no errors are reported. Other models and effects render fine, but none of the particles in my particle system render. The draw call looks something like this:

ec(glBindVertexArray(vao));
ec(glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo));
ec(glDrawElementsInstanced(GL_TRIANGLES, triangleElementIndices.size(), GL_UNSIGNED_INT, reinterpret_cast<void*>(0), instanceCount));

ec is a macro used to error-check OpenGL calls. It effectively does this:

while (GLenum error = glGetError()){
    // std::hex is sticky, so switch back to std::dec before printing the line number
    std::cerr << "OpenGLError:" << std::hex << error << std::dec << " " << functionName << " " << file << " " << line << std::endl;
}
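
For anyone who wants to try the error check outside my project, here is a minimal self-contained sketch of how such a macro could be put together. It is not the exact macro from my source. pendingErrors, pollError and drainErrors are hypothetical names; pollError stands in for glGetError so the sketch runs without a GL context.

```cpp
#include <deque>
#include <iostream>
#include <sstream>
#include <string>

using ErrorCode = unsigned int;

// Hypothetical stand-in for the driver's error queue, so the sketch runs
// without a GL context.
inline std::deque<ErrorCode>& pendingErrors() {
    static std::deque<ErrorCode> queue;
    return queue;
}

// Hypothetical stand-in for glGetError(): pops one queued error, or returns 0
// (the value of GL_NO_ERROR) when nothing is pending.
inline ErrorCode pollError() {
    if (pendingErrors().empty()) return 0;
    const ErrorCode error = pendingErrors().front();
    pendingErrors().pop_front();
    return error;
}

// Drains all pending errors and formats one line per error.
inline std::string drainErrors(const char* functionName, const char* file, int line) {
    std::ostringstream out;
    while (ErrorCode error = pollError()) {
        out << "OpenGLError:" << std::hex << error << std::dec
            << " " << functionName << " " << file << " " << line << "\n";
    }
    return out.str();
}

// Runs the wrapped call, then reports any errors with the call text and location.
#define ec(...) \
    do { \
        __VA_ARGS__; \
        std::cerr << drainErrors(#__VA_ARGS__, __FILE__, __LINE__); \
    } while (0)
```

In a real build, pollError would simply call glGetError. The variadic form of the macro is a design choice so that calls containing top-level commas still expand correctly.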

The issue is more prevalent in Release mode than in Debug mode, but it occurs in both: about 8 times out of 10 in Release mode and 1 time out of 10 in Debug mode.

Below is the rendering process for particles. For each instanced draw call:

  1. bind a shared vertex buffer object (VBO)
  2. put data into that VBO
  3. iterate over many vertex array objects (VAOs), associate the VBO with each one and set up vertex attributes
  4. render each VAO

All of the objects share the same VBO, but they are rendered sequentially. The entire application is currently single-threaded, so that shouldn't be an issue.

A given frame for particles A (two VAOs) and B (one VAO) would look like:

  • buffer A's data into the vertex buffer named VBO
  • bind A_vao1
  • set up A's instance vertex attributes
  • bind A_vao2
  • set up A's instance vertex attributes
  • render A_vao1
  • render A_vao2
  • buffer B's data into the vertex buffer named VBO (no glGenBuffers, this is the same buffer)
  • bind B_vao1
  • set up B's instance vertex attributes
  • render B_vao1

Is there an obvious problem with that approach?

The source below has been simplified, but I left most of the relevant parts. Unlike what I have above, it actually uses 2 shared vertex buffer objects (VBOs), one for matrix4s, and one for vector4s.

GLuint instanceMat4VBO = ...     //valid created vertex buffer objects
GLuint instanceVec4VBO = ...     //valid created vertex buffer objects

//iterate over all the instances; data is stored in class EffectInstanceData
for(EffectInstanceData& eid : instancedEffectsData) 
{
    if (eid.numInstancesThisFrame > 0) 
    {
        // ---- BUFFER data ---- before binding it to all VAOs (models may have multiple meshes, each with their own VAO)
        ec(glBindBuffer(GL_ARRAY_BUFFER, instanceMat4VBO)); //BUFFER MAT4 INSTANCE DATA
        ec(glBufferData(GL_ARRAY_BUFFER, sizeof(glm::mat4) * eid.mat4Data.size(), &eid.mat4Data[0], GL_STATIC_DRAW));

        ec(glBindBuffer(GL_ARRAY_BUFFER, instanceVec4VBO)); //BUFFER VEC4 INSTANCE DATA
        ec(glBufferData(GL_ARRAY_BUFFER, sizeof(glm::vec4) * eid.vec4Data.size(), &eid.vec4Data[0], GL_STATIC_DRAW));

        //meshes may have multiple VAO's that need rendering, set up buffers with instance data for each VAO before instance rendering is done
        for (GLuint effectVAO : eid.effectData->mesh->getVAOs())
        {
            ec(glBindVertexArray(effectVAO));

            { //set up mat4 buffer

                ec(glBindBuffer(GL_ARRAY_BUFFER, instanceMat4VBO));
                GLsizei numVec4AttribsInBuffer = 4 * eid.numMat4PerInstance;
                size_t packagedVec4Idx_matbuffer = 0;

                //pass built-in data into instanced array vertex attribute
                {
                    //mat4 (these take 4 separate vec4s)
                    {
                        //model matrix
                        ec(glEnableVertexAttribArray(8));
                        ec(glEnableVertexAttribArray(9));
                        ec(glEnableVertexAttribArray(10));
                        ec(glEnableVertexAttribArray(11));

                        ec(glVertexAttribPointer(8, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(9, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(10, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));
                        ec(glVertexAttribPointer(11, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_matbuffer++ * sizeof(glm::vec4))));

                        ec(glVertexAttribDivisor(8, 1));
                        ec(glVertexAttribDivisor(9, 1));
                        ec(glVertexAttribDivisor(10, 1));
                        ec(glVertexAttribDivisor(11, 1));
                    }
                }
            }

            { //set up vec4 buffer
                ec(glBindBuffer(GL_ARRAY_BUFFER, instanceVec4VBO));

                GLsizei numVec4AttribsInBuffer = eid.numVec4PerInstance; 
                size_t packagedVec4Idx_v4buffer = 0;
                {
                    //package built-in vec4s
                    ec(glEnableVertexAttribArray(7));
                    ec(glVertexAttribPointer(7, 4, GL_FLOAT, GL_FALSE, numVec4AttribsInBuffer * sizeof(glm::vec4), reinterpret_cast<void*>(packagedVec4Idx_v4buffer++ * sizeof(glm::vec4))));
                    ec(glVertexAttribDivisor(7, 1));
                }
            }
        }

        //activate shader
        ... code setting uniforms on shaders, does not appear to be issue...

        //instanced render
        for (GLuint vao : eid.effectData->mesh->getVAOs()) //this actually results in function calls to a mesh class instances, but effectively is doing this loop
        {
            ec(glBindVertexArray(vao));
            ec(glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, ebo));
            ec(glDrawElementsInstanced(GL_TRIANGLES, triangleElementIndices.size(), GL_UNSIGNED_INT, reinterpret_cast<void*>(0), instanceCount));
        }

        //clear data for next frame
        eid.clearFrameData();
    }
}


ec(glBindVertexArray(0));//unbind VAO's
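
To make the stride and offset arithmetic in the glVertexAttribPointer calls above easier to check, here is a small sketch that computes the same values without touching OpenGL. AttribLayout, mat4AttribLayout and kVec4Bytes are hypothetical names that are not in my source, and the sketch assumes a vec4 is four tightly packed floats, as glm::vec4 is.

```cpp
#include <array>
#include <cstddef>

// Assumption: a vec4 is four tightly packed floats, as with glm::vec4.
constexpr std::size_t kVec4Bytes = 4 * sizeof(float);

struct AttribLayout {
    unsigned location;   // vertex attribute index
    std::size_t stride;  // bytes between consecutive instances
    std::size_t offset;  // byte offset of this column inside one instance
};

// Layout of the mat4 at position mat4Index within each instance's block.
// A mat4 occupies four consecutive attribute locations, one per vec4 column.
inline std::array<AttribLayout, 4> mat4AttribLayout(unsigned firstLocation,
                                                    std::size_t mat4Index,
                                                    std::size_t numMat4PerInstance) {
    const std::size_t stride = 4 * numMat4PerInstance * kVec4Bytes;
    std::array<AttribLayout, 4> columns{};
    for (unsigned c = 0; c < 4; ++c) {
        columns[c] = AttribLayout{firstLocation + c, stride,
                                  (4 * mat4Index + c) * kVec4Bytes};
    }
    return columns;
}
```

With one mat4 per instance, mat4AttribLayout(8, 0, 1) gives locations 8 through 11, a stride of four vec4s, and offsets of zero, one, two and three vec4s, which matches what the loop above passes for the model matrix.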

Is any of this visibly wrong? I've debugged with RenderDoc, and when the issue is not present, the draw call shows up in the event browser, as in this image:

[Image: RenderDoc event browser with the draw call present]

But when the issue does happen, the draw call does not appear in RenderDoc at all, as in the following image:

[Image: RenderDoc event browser with the draw call missing]

This seems very strange to me. I've verified with the debugger that the draw call is being executed. But it seems to silently fail.

I've tried debugging with NVIDIA Nsight, but I cannot reproduce the issue when the application is launched through Nsight.

I've verified:

  • the instance VBO size doesn't change or grow too large; its size is stable
  • uniforms are correctly finding their values
  • VAO binding appears to happen in the correct order

System specs: Windows 10; OpenGL 3.3; 8 GB memory; i7-8700K; NVIDIA GeForce GTX TITAN X

I also observed the issue on my laptop, with roughly the same reproduction rates. It has an Intel graphics chip.

GitHub link to the actual source. If anyone tries to compile it, let me know; you need to replace the hidden .suo with the copy I made to automatically fill out the linker settings. The relevant function is ParticleSystem::handlePostRender.


Solution

  • It turns out this isn't an issue with instancing. I implemented a non-instanced version and had the same issue. The real issue is with my rendering systems. The buffer swap and the particle rendering were both listening to the same delegate (event), and occasionally the buffer swap ran first when the event was broadcast. So the ordering was:

    1. clear screen
    2. render scene
    3. swap buffers
    4. render particles
    5. clear screen
    6. render scene
    7. swap buffers
    8. render particles

    So, the particles were never visible because they were immediately cleared at what was supposed to be the start of the next frame.
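
One possible way to stop depending on subscription order is to give each listener an explicit priority, so the buffer swap always runs last. The following is only a sketch under that assumption, not the code from my project. OrderedDelegate, simulateFrame, kParticlePriority and kSwapPriority are hypothetical names, and the render steps are logged as strings instead of issuing real GL calls.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical delegate whose listeners run in ascending priority order,
// regardless of the order in which they subscribed.
class OrderedDelegate {
public:
    void subscribe(int priority, std::function<void()> callback) {
        listeners_.push_back(Entry{priority, std::move(callback)});
        // stable_sort keeps subscription order among equal priorities.
        std::stable_sort(listeners_.begin(), listeners_.end(),
                         [](const Entry& a, const Entry& b) {
                             return a.priority < b.priority;
                         });
    }

    void broadcast() const {
        for (const Entry& entry : listeners_) entry.callback();
    }

private:
    struct Entry {
        int priority;
        std::function<void()> callback;
    };
    std::vector<Entry> listeners_;
};

// Hypothetical priorities: particles draw early, the swap is always last.
constexpr int kParticlePriority = 0;
constexpr int kSwapPriority = 100;

// Simulates one frame and returns the order in which the steps ran.
inline std::vector<std::string> simulateFrame(bool swapSubscribesFirst) {
    std::vector<std::string> log;
    OrderedDelegate postRender;
    auto swapBuffers = [&log] { log.push_back("swap buffers"); };
    auto renderParticles = [&log] { log.push_back("render particles"); };

    if (swapSubscribesFirst) {
        postRender.subscribe(kSwapPriority, swapBuffers);
        postRender.subscribe(kParticlePriority, renderParticles);
    } else {
        postRender.subscribe(kParticlePriority, renderParticles);
        postRender.subscribe(kSwapPriority, swapBuffers);
    }

    log.push_back("clear screen");
    log.push_back("render scene");
    postRender.broadcast();
    return log;
}
```

Whichever listener subscribes first, the frame runs as clear screen, render scene, render particles, swap buffers. Splitting the work into two separate events (one for post-scene rendering, one for presenting the frame) would be an alternative with the same effect.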