OpenGL packing vertex attributes

In OpenGL, is it better to keep vertex attributes separate:

layout(location = 0) in vec4 v_Position;
layout(location = 1) in vec3 v_Normal;
layout(location = 2) in vec3 v_Tanget;
layout(location = 3) in vec3 v_Bitanget;
layout(location = 4) in vec2 v_UV;

Or is packing them like so:

layout(location = 0) in vec4 v_Position;
layout(location = 1) in vec3 v_Normal;
layout(location = 2) in vec4 v_TangetAndU;
layout(location = 3) in vec4 v_BitangetAndV;

...commonly used as a performance optimization? I was under the impression that if your performance is bound by the amount of geometry, you may be able to push about 20% more vertices with the "packed" version. Is this correct?

Solution

The principal cost of vertex fetching is the cost of reading memory: the bigger your data is, the longer it takes to read. Assuming every component is stored as a 32-bit float, both of your layouts carry the same 15 floats (60 bytes) per vertex, so this kind of packing is not particularly helpful. It would ultimately be better to shrink the data itself by using normalized integers in your vertex formats.

You can usually get away with using 16-bit unsigned normalized integers for your texture coordinates. This makes your texture coordinates take up 4 bytes per vertex:

glVertexAttribFormat(4, 2, GL_UNSIGNED_SHORT, GL_TRUE, ...);
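As a rough sketch of the CPU side of that conversion (the helper name pack_unorm16 is made up for illustration, and it assumes your UVs stay within [0, 1]; coordinates that tile past 1 would need a different format or a scale factor):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a float to a 16-bit unsigned normalized
   integer. Inputs outside [0, 1] are clamped to that range. */
static uint16_t pack_unorm16(float x)
{
    assert(x == x);              /* reject NaN; casting NaN to an integer is undefined */
    if (x < 0.0f) x = 0.0f;
    if (x > 1.0f) x = 1.0f;
    /* Scale to [0, 65535] and round to nearest by adding 0.5 before truncating. */
    return (uint16_t)(x * 65535.0f + 0.5f);
}
```

With GL_TRUE as the normalized flag, the shader reads 65535 back as 1.0 and 0 as 0.0, so the vec2 input in the shader does not need to change.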

Your normals/tangents/bitangents should use GL_INT_2_10_10_10_REV, which packs an entire vector into 32 bits. X, Y, and Z get 10 bits each, with the last 2 bits going to a W component that you won't use. So the normal, tangent, and bitangent together take up 12 bytes per vertex:

glVertexAttribFormat(1, 4, GL_INT_2_10_10_10_REV, GL_TRUE, ...);
glVertexAttribFormat(2, 4, GL_INT_2_10_10_10_REV, GL_TRUE, ...);
glVertexAttribFormat(3, 4, GL_INT_2_10_10_10_REV, GL_TRUE, ...);
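Here is a minimal sketch of how you might build that 32-bit value on the CPU. The function names are hypothetical. The bit layout (X in the lowest 10 bits, then Y, then Z, with W in the top 2) and the scale of 511 follow the signed normalized conversion used by OpenGL 4.2 and later, so treat the scale as an assumption if you target older versions:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: convert a float in [-1, 1] to a 10-bit signed
   normalized value, returned in the low 10 bits as two's complement. */
static uint32_t pack_snorm10(float x)
{
    assert(x == x);              /* reject NaN */
    if (x < -1.0f) x = -1.0f;
    if (x >  1.0f) x =  1.0f;
    /* 511 is the largest positive value a 10-bit signed integer can hold. */
    float scaled = x * 511.0f;
    /* Round to nearest, away from zero on ties. */
    int32_t i = (int32_t)(scaled >= 0.0f ? scaled + 0.5f : scaled - 0.5f);
    return (uint32_t)i & 0x3FFu; /* keep only the low 10 bits */
}

/* Pack a unit vector into the GL_INT_2_10_10_10_REV layout.
   The 2-bit W field is left as zero because it is unused here. */
static uint32_t pack_2_10_10_10_rev(float x, float y, float z)
{
    return pack_snorm10(x)
         | (pack_snorm10(y) << 10)
         | (pack_snorm10(z) << 20);
}
```

The shader-side inputs can stay vec3; the extra W component is simply dropped.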

Even if you leave your position as 3 floats (no need to pass a fourth; a missing W defaults to 1.0), the total size of a vertex will be 12 + 12 + 4 = 28 bytes, a substantial improvement over either version of your original code. If you use 16-bit floats for the position, you can get it down to 24 bytes per vertex: the three half-floats take 6 bytes, padded to 8 because attributes should always start on 4-byte boundaries.
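To make the arithmetic concrete, here is one possible interleaved layout written as a C struct. The struct and field names are placeholders, and it assumes a platform where float and uint32_t are 4 bytes with 4-byte alignment (true of the usual desktop targets):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical interleaved vertex using the formats described above. */
typedef struct PackedVertex {
    float    position[3]; /* offset  0, 12 bytes, three 32-bit floats  */
    uint32_t normal;      /* offset 12,  4 bytes, GL_INT_2_10_10_10_REV */
    uint32_t tangent;     /* offset 16,  4 bytes, GL_INT_2_10_10_10_REV */
    uint32_t bitangent;   /* offset 20,  4 bytes, GL_INT_2_10_10_10_REV */
    uint16_t uv[2];       /* offset 24,  4 bytes, two normalized shorts */
} PackedVertex;

/* Fail at compile time if the compiler inserts unexpected padding. */
static_assert(sizeof(PackedVertex) == 28, "vertex should be 28 bytes");
static_assert(offsetof(PackedVertex, uv) == 24, "uv should start at byte 24");
```

The offsetof values are what you would pass as the relativeoffset argument of glVertexAttribFormat, and sizeof(PackedVertex) is the stride you would give to glBindVertexBuffer.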

Note that trying to pack the UV into the tangent/bitangent wouldn't work with the 10/10/10/2 format, since 2 bits is hardly enough for a texture coordinate.

Packing such data, particularly the 10/10/10/2 format, requires some care, but overall, this will be far better in the long run than playing games with the in-shader attributes.