This non-member function, drawPoly(), draws an n-sided polygon in 3D space from a list of vertices.
This function typically gets called thousands of times during normal execution, and speed is critical.
Ignoring the effects of the functions called within drawPoly(), does the allocation of the 25-element vertex array have any negative effect on speed?
void drawPoly(const meshx::Face& face, gen::Vector position,
              ALLEGRO_COLOR color, bool filled)
{
    ALLEGRO_VERTEX vertList[25];
    std::size_t k = 0;
    // ...For every vertex in the polygon...
    for(; k < face.getNumVerts(); ++k) {
        vertList[k].x = position.x + face.alVerts[k].x;
        vertList[k].y = position.y + face.alVerts[k].y;
        vertList[k].z = position.z + face.alVerts[k].z;
        vertList[k].u = 0;
        vertList[k].v = 0;
        vertList[k].color = color;
    }
    // Draw with ALLEGRO_VERTEXs and no textures.
    if(filled) {
        al_draw_prim(vertList, nullptr, nullptr,
                     0, k, ALLEGRO_PRIM_TRIANGLE_LIST);
    } else {
        al_draw_prim(vertList, nullptr, nullptr,
                     0, k, ALLEGRO_PRIM_LINE_LOOP);
    }
}
The only way to tell for sure is to measure. But what would you compare against? Allocating on the heap would almost certainly be slower. A global (or static) array holding the vertices could serve as a comparison point, though only for benchmarking purposes.
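A comparison along those lines could be sketched roughly as follows. This is only a minimal micro-benchmark under stated assumptions: `Vertex` is a hypothetical stand-in for ALLEGRO_VERTEX (assumed here to be nine floats), `fillVerts`, `withStack`, `withHeap`, `withStatic`, `nanosPerCall` and `runBenchmark` are illustrative names, and no Allegro call is made. An optimizing compiler may remove or merge some of this work, so treat the numbers as a rough guide and measure the real drawPoly() as well.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <memory>

// Hypothetical stand-in for ALLEGRO_VERTEX (assumed layout: nine floats).
struct Vertex { float x, y, z, u, v, r, g, b, a; };

constexpr std::size_t kMaxVerts = 25;

// Writes n vertices (n <= kMaxVerts) and returns a checksum that depends on them.
inline float fillVerts(Vertex* out, std::size_t n, float offset) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < n; ++k) {
        out[k] = Vertex{offset + static_cast<float>(k), offset, offset,
                        0.0f, 0.0f, 1.0f, 1.0f, 1.0f, 1.0f};
        sum += out[k].x;
    }
    return sum;
}

// Variant 1: array on the stack, as in the question.
float withStack(std::size_t n, float offset) {
    Vertex verts[kMaxVerts];
    return fillVerts(verts, n, offset);
}

// Variant 2: array allocated on the heap on every call.
float withHeap(std::size_t n, float offset) {
    std::unique_ptr<Vertex[]> verts(new Vertex[kMaxVerts]);
    return fillVerts(verts.get(), n, offset);
}

// Variant 3: one static array reused across calls (benchmark only, not thread-safe).
float withStatic(std::size_t n, float offset) {
    static Vertex verts[kMaxVerts];
    return fillVerts(verts, n, offset);
}

// Average wall-clock nanoseconds per call of f(n, offset).
template <typename F>
double nanosPerCall(F f, int iterations, std::size_t n = 4) {
    volatile float sink = 0.0f;  // keeps each result observable
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        sink = f(n, static_cast<float>(i));
    }
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}

// Prints one line per variant.
void runBenchmark(int iterations) {
    std::printf("stack:  %.2f ns/call\n", nanosPerCall(withStack, iterations));
    std::printf("heap:   %.2f ns/call\n", nanosPerCall(withHeap, iterations));
    std::printf("static: %.2f ns/call\n", nanosPerCall(withStatic, iterations));
}
```

Calling `runBenchmark(1000000)` from `main` prints the three timings. Build it with the same optimization flags as the real program, since unoptimized timings say little about release performance.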
Given that stack allocation of trivially constructible objects usually translates to a simple adjustment of the stack pointer, the allocation itself probably isn't a big deal. What could have an observable effect, though, is touching extra cache lines: the fewer cache lines the code writes, the better for performance. You can therefore experiment with splitting vertList[25] into cache-line-sized arrays and calling al_draw_prim multiple times. A benchmark would show whether there's a difference.
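To get a feel for the numbers before benchmarking, the cache-line arithmetic and the batching idea can be sketched as below. The assumptions are loud ones: 64-byte cache lines, a hypothetical `Vertex` of nine floats (36 bytes) standing in for ALLEGRO_VERTEX, an array that starts on a cache-line boundary, and a caller-supplied `draw` callback in place of al_draw_prim. `linesTouched` and `drawInBatches` are illustrative names, not part of any library.

```cpp
#include <algorithm>
#include <cstddef>

// Assumptions: 64-byte cache lines and a 36-byte vertex (nine floats).
constexpr std::size_t kCacheLine = 64;
struct Vertex { float x, y, z, u, v, r, g, b, a; };
static_assert(sizeof(Vertex) == 36, "layout assumption for this sketch");

// Cache lines spanned by the first n vertices of an array that
// starts on a cache-line boundary.
constexpr std::size_t linesTouched(std::size_t n) {
    return (n * sizeof(Vertex) + kCacheLine - 1) / kCacheLine;
}

// Fills a small reusable buffer BatchSize vertices at a time and hands
// each batch to draw(batch, count). Returns the number of draw calls.
template <std::size_t BatchSize, typename Source, typename Draw>
std::size_t drawInBatches(std::size_t total, Source source, Draw draw) {
    Vertex batch[BatchSize];
    std::size_t calls = 0;
    for (std::size_t first = 0; first < total; first += BatchSize) {
        const std::size_t count = std::min(BatchSize, total - first);
        for (std::size_t i = 0; i < count; ++i) {
            batch[i] = source(first + i);  // build vertex number first + i
        }
        draw(batch, count);
        ++calls;
    }
    return calls;
}
```

Under these assumptions a 4-vertex face writes 3 cache lines and a full 25-vertex face writes 15, and only the elements the loop actually writes are touched. Note that batching changes what is drawn unless batch boundaries respect the primitive type: a triangle list needs batch sizes that are multiples of three, and a line loop split across calls no longer closes as a single loop.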