A program that I'm writing contained 10 global function pointers. I decided to put them in a struct to see whether that would change the machine code generated for calls to two of the functions. I was surprised to see that the version that uses a struct contained two more move instructions than the version without the struct (all the other lines of the disassembly were the same). Is this some kind of strange optimization, or does the compiler not know how to eliminate the overhead of a call through a struct member? I'm using Clang 3.8 as my compiler and am compiling for x86.
Version With Struct:
#include <GLES2/gl2.h>
struct GLES2FunctionPointers {
const PFNGLCREATESHADERPROC glCreateShader;
const PFNGLSHADERSOURCEPROC glShaderSource;
};
struct GLES2FunctionPointers GLES2 = {
.glCreateShader =
(PFNGLCREATESHADERPROC)eglGetProcAddress("glCreateShader"),
.glShaderSource =
(PFNGLSHADERSOURCEPROC)eglGetProcAddress("glShaderSource"),
};
GL_APICALL GLuint GL_APIENTRY glCreateShader(GLenum type) {
return GLES2.glCreateShader(type);
}
GL_APICALL void GL_APIENTRY glShaderSource(GLuint shader, GLsizei count, const
GLchar *const*string, const GLint *length) {
GLES2.glShaderSource(shader, count, string, length);
}
Version Without Struct:
const PFNGLCREATESHADERPROC glCreateShaderPointer = (PFNGLCREATESHADERPROC)eglGetProcAddress("glCreateShader");
GL_APICALL GLuint GL_APIENTRY glCreateShader(GLenum type) {
return glCreateShaderPointer(type);
}
const PFNGLSHADERSOURCEPROC glShaderSourcePointer =
(PFNGLSHADERSOURCEPROC)eglGetProcAddress("glShaderSource");
GL_APICALL void GL_APIENTRY glShaderSource(GLuint shader, GLsizei count,
const GLchar *const*string, const GLint *length) {
glShaderSourcePointer(shader, count, string, length);
}
This is the function being disassembled:
int prepareShader(GLuint shaderType, const char * shaderCode) {
GLuint shader = glCreateShader(shaderType);
int len = strlen(shaderCode);
glShaderSource(shader, 1, &shaderCode, &len);
return shader;
}
This is the function call in main:
int vertexShader = prepareShader(GL_VERTEX_SHADER, VERTEX_SHADER);
//VERTEX_SHADER is a string in my code
Structs by themselves have no overhead. Calls through a const struct should get optimized to direct calls.
If the struct isn't const, the pointer values need to be loaded first. The object code emitted for the symbols may also differ slightly if padding is needed (if I try it with ten functions, call1 gets an xchg %ax,%ax, which is a nop, at the end).
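To make the const vs. non-const difference concrete, here is a minimal sketch. All names in it (Ops, add_one, twice, const_ops, mutable_ops, call1, call2) are placeholders made up for illustration, not taken from the question's code. It also assumes the const struct is initialized with function addresses known at compile time; pointers obtained at run time, for example from eglGetProcAddress, would presumably still have to be loaded before the call.

```c
/* Hypothetical stand-ins for the real GL entry points. */
static int add_one(int x) { return x + 1; }
static int twice(int x) { return 2 * x; }

/* A struct of function pointers, similar in shape to the one in the question. */
struct Ops {
    int (*first)(int);
    int (*second)(int);
};

/* const object with constant initializers: the compiler can see the
 * call targets, so calls through it can become direct calls (or be inlined). */
static const struct Ops const_ops = { .first = add_one, .second = twice };

/* Non-const object with external linkage: other code may overwrite the
 * members, so each call has to load the pointer first. */
struct Ops mutable_ops = { .first = add_one, .second = twice };

/* Call through the const struct. */
int call1(int x) { return const_ops.first(x); }

/* Call through the non-const struct. */
int call2(int x) { return mutable_ops.second(x); }
```

Compiling this with something like clang -O2 -c and looking at the result with objdump -d should show call1 as a direct jump or an inlined body, and call2 as a load of the pointer followed by an indirect jump. The exact instructions depend on the compiler version and target.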