GLSL - Binding Attributes to Semantics

I've seen a number of questions and answers here on Stack Overflow about this topic. From these answers, I've come up with a possible solution for binding GLSL attributes to user-defined semantics. I wanted to get some input and discussion going about it, and to check whether it is even a valid idea.

To start, let's assume we have some list of user-defined semantics:

enum VertexElementSemantic
{
  POSITION, NORMAL, AMBIENT, DIFFUSE, SPECULAR,
  TEX_COORD0, TEX_COORD1, TEX_COORD2, TEX_COORD3,
  INDICES
};

And a structure that encapsulates the data required to set up a vertex attribute pointer.

struct VertexElement
{
  unsigned int m_source;
  unsigned int m_offset;
  unsigned int m_stride;
};

Now, some RenderOperation class will contain a map of VertexElementSemantics to VertexElements. The format, size, and whether the VertexElement is normalized can be determined by its Semantic.
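To make the "determined by its Semantic" part concrete, here is a rough sketch of such a lookup. Everything beyond the enum is an assumption for illustration: VertexElementFormat, ComponentType, and FormatForSemantic are hypothetical names, ComponentType stands in for GL type enums such as GL_FLOAT so the sketch compiles without GL headers, and the particular formats (3-float positions, normalized byte colors, and so on) are just one plausible choice.

```cpp
// Semantics as listed above.
enum VertexElementSemantic
{
  POSITION, NORMAL, AMBIENT, DIFFUSE, SPECULAR,
  TEX_COORD0, TEX_COORD1, TEX_COORD2, TEX_COORD3,
  INDICES
};

// Hypothetical stand-in for GL type enums (GL_FLOAT, GL_UNSIGNED_BYTE).
enum class ComponentType { Float, UnsignedByte };

// Hypothetical: the per-attribute data a vertex attribute pointer needs
// in addition to the source, offset, and stride in VertexElement.
struct VertexElementFormat
{
  int           componentCount; // number of components, 1 to 4
  ComponentType type;           // type of each component
  bool          normalized;     // whether integer data is mapped to [0, 1]
};

// Assumed formats, chosen only for illustration.
inline VertexElementFormat FormatForSemantic(VertexElementSemantic semantic)
{
  switch (semantic)
  {
    case POSITION:
    case NORMAL:
      return {3, ComponentType::Float, false};
    case AMBIENT:
    case DIFFUSE:
    case SPECULAR:
      return {4, ComponentType::UnsignedByte, true};
    case TEX_COORD0:
    case TEX_COORD1:
    case TEX_COORD2:
    case TEX_COORD3:
      return {2, ComponentType::Float, false};
    case INDICES:
      // Assumption: four unsigned bytes per vertex.
      return {4, ComponentType::UnsignedByte, false};
  }
  return {0, ComponentType::Float, false}; // only reached for invalid values
}
```

With a table like this, the RenderOperation only has to store the three fields of VertexElement per semantic; the rest comes from the lookup.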

The last bit of information we require in order to set this pointer is the attribute location itself. Here's where we want to bind our VertexElementSemantic to a specific location.

From the first answer to this question, we learn that we can explicitly state the desired location of each attribute like so:

layout(location = 0) in vec3 position;

So we could map our semantics to these hard coded locations, but then we require this location be hard coded in each shader. Any changes to these locations require us to go through and edit every shader.

However, this value does not have to be provided by the Shader source at all. From the answer to this question, we learn that we can externally add #defines to our Shaders like so:

const char *sources[2] = { "#define FOO\n", sourceFromFile };
glShaderSourceARB(shader, 2, sources, NULL);

Using this, we could build a string that #defines variables for the desired locations of each semantic. For example, we could build a string that will end up inserting the following to the beginning of each of our Shaders:

#define POSITION_LOCATION 0
#define NORMAL_LOCATION 1
#define AMBIENT_LOCATION 2
...
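A minimal sketch of how such a string might be generated on the C++ side. The names SemanticMacroNames and BuildLocationDefines are hypothetical, and the sketch assumes the table is kept in the same order as VertexElementSemantic so that the array index doubles as the attribute location.

```cpp
#include <cstddef>
#include <string>

// Hypothetical table: macro-name stems in the same order as
// VertexElementSemantic, so index == desired attribute location.
static const char *const SemanticMacroNames[] =
{
  "POSITION", "NORMAL", "AMBIENT", "DIFFUSE", "SPECULAR",
  "TEX_COORD0", "TEX_COORD1", "TEX_COORD2", "TEX_COORD3",
  "INDICES"
};

// Builds one "#define <NAME>_LOCATION <index>" line per semantic.
std::string BuildLocationDefines()
{
  const std::size_t count =
      sizeof(SemanticMacroNames) / sizeof(SemanticMacroNames[0]);

  std::string defines;
  for (std::size_t location = 0; location < count; ++location)
  {
    defines += "#define ";
    defines += SemanticMacroNames[location];
    defines += "_LOCATION ";
    defines += std::to_string(location);
    defines += '\n';
  }
  return defines;
}
```

The resulting string would then be passed as one of the source strings when compiling each shader.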

Going back to explicitly stating our attribute locations, we should be able to state them as such now:

layout(location = POSITION_LOCATION) in vec3 position;
layout(location = NORMAL_LOCATION) in vec3 normal;
layout(location = AMBIENT_LOCATION) in vec4 ambient;

This method would allow us to set the desired attribute location of each Semantic in code. It also provides a sort of semantic binding feel to the Shaders themselves. Is a system like this a step in the right direction for solving the issue of providing meaning to attribute locations?

Solution

Let's consider the consequences of this idea.

we could build a string that #defines variables for the desired locations of each semantic. For example, we could build a string that will end up inserting the following to the beginning of each of our Shaders:

Well, this is bad on two counts. First, there's the #version issue. If you want to use any GLSL version other than 1.10, you must provide a #version declaration. And that declaration must be the first thing in the shader, outside of comments and whitespace.

If you put these #defines into your shader source (whether by string concatenation or by passing multiple strings, as you do), you have to accept certain consequences. Usually, each individual shader file has its own #version declaration, specifying which GLSL version it uses. But you can't do that here if you want to use anything besides GLSL 1.10: your C++ source code has to generate the #version line, ahead of your #defines.

This means that your shader source is now decoupled from the version it compiles under. That's doable, but the shader file on its own no longer says which GLSL version it was written for. You could communicate the version in some other way, such as with the filename (for example, lit_transform_330.vert would use version 3.30). But you'll have to devise such a system.
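As a rough sketch of what such a system could look like, assuming the filename convention from the example above (digits between the last underscore and the extension). VersionFromFilename and AssembleShaderSource are hypothetical helpers, not part of any library, and the shader body is assumed not to contain a #version line of its own.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Extracts "330" from "lit_transform_330.vert". Returns an empty string
// if the name does not follow the assumed <name>_<digits>.<ext> convention.
std::string VersionFromFilename(const std::string &filename)
{
  const std::size_t dot = filename.rfind('.');
  const std::size_t underscore = filename.rfind('_', dot);
  if (dot == std::string::npos || underscore == std::string::npos ||
      underscore + 1 >= dot)
  {
    return "";
  }

  const std::string digits =
      filename.substr(underscore + 1, dot - underscore - 1);
  for (char c : digits)
  {
    if (!std::isdigit(static_cast<unsigned char>(c)))
      return "";
  }
  return digits;
}

// Puts the #version line first, then the generated #defines, then the
// shader body read from the file.
std::string AssembleShaderSource(const std::string &version,
                                 const std::string &defines,
                                 const std::string &body)
{
  return "#version " + version + "\n" + defines + body;
}
```

The same ordering applies if you pass the pieces as separate strings to glShaderSource instead of concatenating them: the #version string goes first.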

Now that the version issue is sorted out, on to the next problem: what you are doing is redundant.

You use terms like "semantic", which have no meaning to OpenGL. It appears that you're trying to assign some form of name to a particular vertex attribute, so that you can see uses of that name in your shader and C++ code, and therefore know what attribute it is for.

That is, you want to define a mapping between "name" and "attribute index". You want it defined in one place, such that it is automagically propagated to every shader and used consistently throughout your C++ source code.

Well, we already have a mapping between a name and the attribute index. It's called "the mapping between the attribute's name and the attribute's index". Every shader must provide a name for its attributes. That's the string name you see in definitions like "in vec4 position;", where the attribute's name is "position". That's what GLSL calls the variable when it uses it.

As stated in the answer you linked to, you can associate a particular attribute name with an attribute index from C++ code before the program is linked. This is done via the glBindAttribLocation function. You can set any number of mappings you like. When the program is linked, attributes whose names match one of those mappings will be assigned the corresponding location.

All you need is a list of your "semantics" (aka: attribute indices) and the string names you require shaders to use for those attributes.

You might say, "Well, I want shaders to have the freedom to call the variable whatever they want." My response would be... what's the difference? Your suggested scheme already requires the user to adhere to a specific naming convention. It's just that the name they must use isn't the variable's name; it's the name of some tag you associate with the variable at declaration time.

So what exactly is the difference? That the writer of a shader has to adhere to a set naming scheme for vertex attribute variable names? Isn't a consistent name for the same concept across all shaders a good thing?

The one difference is that, if they mistype the "semantic" under your scheme, they get a shader compilation error (since their mistyped "semantic" name won't match any actual #defines). Whereas if they mistype the name of an attribute, they will only get a compiler error if they don't mistype that name when they use the attribute.

There are ways to catch that. It requires using program introspection to walk the list of active attributes and check them against the expected names of attributes.
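The checking half of that can be sketched without any GL calls. The function below is hypothetical; it assumes the active names have already been collected from the linked program (for example by querying GL_ACTIVE_ATTRIBUTES with glGetProgramiv and calling glGetActiveAttrib for each index) and simply reports any name that is not in your expected list.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Returns every active attribute name that is not in expectedNames.
// activeNames is assumed to come from introspection of a linked program.
std::vector<std::string> FindUnexpectedAttributes(
    const std::vector<std::string> &activeNames,
    const std::vector<std::string> &expectedNames)
{
  std::vector<std::string> unexpected;
  for (const std::string &name : activeNames)
  {
    // Built-in inputs such as gl_VertexID can appear in the active list;
    // they are not part of the naming convention, so skip them.
    if (name.compare(0, 3, "gl_") == 0)
      continue;

    if (std::find(expectedNames.begin(), expectedNames.end(), name) ==
        expectedNames.end())
    {
      unexpected.push_back(name);
    }
  }
  return unexpected;
}
```

A non-empty result means some shader is using an attribute name outside the convention, which is most likely a typo.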

You can boil this down to a very simple set of conventions. Using your "semantic" definition:

enum VertexElementSemantic
{
  POSITION, NORMAL, AMBIENT, DIFFUSE, SPECULAR,
  TEX_COORD0, TEX_COORD1, TEX_COORD2, TEX_COORD3,
  INDICES, NUM_SEMANTICS
};

//in the C++ file you use to link your shaders
const char *AttributeNames[] =
{
  "position", "normal", "ambient", "diffuse", "specular",
  "tex_coord0", "tex_coord1", "tex_coord2", "tex_coord3",
  "indices",
};

static_assert(ARRAY_COUNT(AttributeNames) == NUM_SEMANTICS, "AttributeNames needs one entry per semantic"); //Where `ARRAY_COUNT` is a macro that computes the number of elements in a static array.

GLuint CreateProgram(GLuint vertexShader, GLuint fragmentShader)
{
  GLuint prog = glCreateProgram();
  //Attach shaders
  for(int attrib = 0; attrib < NUM_SEMANTICS; ++attrib)
  {
    glBindAttribLocation(prog, attrib, AttributeNames[attrib]);
  }

  glLinkProgram(prog);

  //Detach shaders
  //Check for linking errors

  //Verify that attribute locations are as expected.
  //Left as an exercise for the reader.

  return prog;
}

Personally speaking, I would just use a number. No matter what you use, the person writing the shader is going to have to adhere to some convention. Which means when they go to write a vertex shader that takes a position, they're going to have to look up how to say "this is a position". So they're going to have to look something up in a table somewhere no matter what.

At which point, it comes down to what the most likely problem is. The most likely issues would be either someone who thinks they know the answer but is actually wrong (i.e., didn't look it up), or someone who mistyped the answer. It's really hard to mistype a number (though it certainly can happen), while it's much easier to mistype POSITION_LOCATION. The former problem could happen under either convention about equally often.

So it seems to me that you're likely to get fewer convention mismatch problems if your convention is based on numbers rather than words.