Search code examples
openglvertex-buffervertex-array-object

How to minimize glVertexAttribPointer calls when using Instanced Arrays?


I have OpenGL code using one VAO for all model data and two VBOs. The first for standard vertex attributes like position and normal and the second for the model matrices. I am using instanced draw, so I load the model matrices as instanced arrays (which are basically vertex attributes).

First I load the standard vertex attributes to a VBO and setup everything once with glVertexAttribPointer. Then I load the model matrices to another VBO. Now I have to call glVertexAttribPointerin the draw loop. Can I somehow prevent this?

The code looks like this:

// vertex data of all models in one array
GLfloat myvertexdata[myvertexdatasize];

// matrix data of all models in one array
// (one model can have multiple matrices)
GLfloat mymatrixdata[mymatrixsize];

GLuint vao;
glGenVertexArrays(1, &vao);
glBindVertexArray(vao);
GLuint vbo;
glGenBuffers(1, &vbo);
glBindBuffer(GL_ARRAY_BUFFER, vbo);
glBufferData(GL_ARRAY_BUFFER, myvertexdatasize*sizeof(GLfloat), myvertexdata, GL_STATIC_DRAW);

glVertexAttribPointer(
          glGetAttribLocation(myprogram, "position"),
          3,
          GL_FLOAT,
          GL_FALSE,
          24,
          (GLvoid*)0
);
glEnableVertexAttribArray(glGetAttribLocation(myprogram, "position"));
glVertexAttribPointer(
          glGetAttribLocation(myprogram, "normal"),
          3,
          GL_FLOAT,
          GL_FALSE,
          24,
          (GLvoid*)12
);
glEnableVertexAttribArray(glGetAttribLocation(myprogram, "normal"));

GLuint matrixbuffer;
glGenBuffers(1, &matrixbuffer);
glBindBuffer(GL_ARRAY_BUFFER, matrixbuffer);
glBufferData(GL_ARRAY_BUFFER, mymatrixsize*sizeof(GLfloat), mymatrixdata, GL_STATIC_DRAW);

glUseProgram(myprogram);


draw loop:
    int vertices_offset = 0;
    int matrices_offset = 0;
    for each model i:
        GLuint loc = glGetAttribLocation(myprogram, "model_matrix_column_1");
        GLsizei matrixbytes = 4*4*sizeof(GLfloat);
        GLsizei columnbytes = 4*sizeof(GLfloat);
        glVertexAttribPointer(
              loc, 
              4, 
              GL_FLOAT, 
              GL_FALSE, 
              matrixbytes,
              (GLvoid*) (matrices_offset*matrixbytes + 0*columnbytes)
        );
        glEnableVertexAttribArray(loc);
        glVertexAttribDivisor(loc, 1); // matrices are in instanced array
        // do this for the other 3 columns too...

        glDrawArraysInstanced(GL_TRIANGLES, vertices_offset, models[i]->num_vertices(), models[i]->num_instances());

        vertices_offset += models[i]->num_vertices();
        matrices_offset += models[i]->num_matrices();

I thought of the approach of storing vertex data and matrices in one VBO. The problem is then how to set the strides correctly. I couldn't come up with a solution.

Any help would be greatly appreciated.


Solution

  • If you have access to base-instance rendering (requires GL 4.2 or ARB_base_instance), then you could do this. Put the instanced attribute stuff in the setup with the non-instanced attribute stuff:

    GLuint loc = glGetAttribLocation(myprogram, "model_matrix_column_1");
    
    for(int count = 0; count < 4; ++count, ++loc)
    {
        GLsizei matrixbytes = 4*4*sizeof(GLfloat);
        GLsizei columnbytes = 4*sizeof(GLfloat);
        glVertexAttribPointer(
              loc, 
              4, 
              GL_FLOAT, 
              GL_FALSE, 
              matrixbytes,
              (GLvoid*) (count*columnbytes)
        );
        glEnableVertexAttribArray(loc);
        glVertexAttribDivisor(loc, 1); // matrices are in instanced array
    }
    

    Then you just bind the VAO when you're ready to render these models. Your draw call becomes:

    glDrawArraysInstancedBaseInstance​(GL_TRIANGLES, vertices_offset, models[i]->num_vertices(), models[i]->num_instances(), matrix_offset);
    

    This feature is surprisingly widely available, even on pre-GL 4.x hardware (as long as it has recent drivers).

    Without base instance rendering however, there's nothing you can do. You will have to adjust the instance pointers for each new set of instances you want to render. This is in fact why base instance rendering exists.