
Confusion between C++ and OpenGL matrix order (row-major vs column-major)


I'm getting thoroughly confused over matrix definitions. I have a matrix class, which holds a float[16] which I assumed is row-major, based on the following observations:

float matrixA[16] = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15 };
float matrixB[4][4] = { { 0, 1, 2, 3 }, { 4, 5, 6, 7 }, { 8, 9, 10, 11 }, { 12, 13, 14, 15 } };

matrixA and matrixB both have the same linear layout in memory (i.e. all numbers are in order). According to http://en.wikipedia.org/wiki/Row-major_order this indicates a row-major layout.

matrixA[0] == matrixB[0][0];
matrixA[3] == matrixB[0][3];
matrixA[4] == matrixB[1][0];
matrixA[7] == matrixB[1][3];

Therefore, matrixB[0] = row 0, matrixB[1] = row 1, etc. Again, this indicates row-major layout.
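
The observation above can be checked mechanically. This is only a minimal sketch: the helper names flatIndex, sameMemoryLayout and checkIndexMapping are made up for illustration and are not part of any library.

```cpp
#include <cassert>
#include <cstring>

// Position of element (row, col) in a flat 16-element array that stores
// a 4x4 matrix row after row (hypothetical helper, not a library call).
constexpr int flatIndex(int row, int col) { return row * 4 + col; }

// Returns true when the flat array and the 2D array contain identical bytes,
// i.e. when both declarations produce the same linear layout in memory.
bool sameMemoryLayout(const float (&flat)[16], const float (&grid)[4][4]) {
    static_assert(sizeof(float[16]) == sizeof(float[4][4]),
                  "both arrays must occupy 16 floats");
    return std::memcmp(flat, grid, sizeof(float[16])) == 0;
}

// Asserts flat[row * 4 + col] == grid[row][col] for every element.
void checkIndexMapping(const float (&flat)[16], const float (&grid)[4][4]) {
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            assert(flat[flatIndex(row, col)] == grid[row][col]);
}
```

Note that this only shows how C++ lays out a float[4][4]. It says nothing about whether the first index should be read as a row or as a column of the mathematical matrix.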

My problem / confusion comes when I create a translation matrix which looks like:

1, 0, 0, transX
0, 1, 0, transY
0, 0, 1, transZ
0, 0, 0, 1

This is laid out in memory as { 1, 0, 0, transX, 0, 1, 0, transY, 0, 0, 1, transZ, 0, 0, 0, 1 }.

Then when I call glUniformMatrix4fv, I need to set the transpose flag to GL_FALSE, indicating that it's column-major, else transforms such as translate / scale etc don't get applied correctly:

If transpose is GL_FALSE, each matrix is assumed to be supplied in column major order. If transpose is GL_TRUE, each matrix is assumed to be supplied in row major order.

Why does my matrix, which appears to be row-major, need to be passed to OpenGL as column-major?


Solution

  • The matrix notation used in the OpenGL documentation does not describe the in-memory layout of OpenGL matrices.

    I think it'll be easier if you drop/forget about the entire "row/column-major" thing. That's because, in addition to the row/column-major notation, the programmer can also decide how to lay out the matrix in memory (whether adjacent elements form rows or columns), which adds to the confusion.

    OpenGL matrices have the same memory layout as DirectX matrices.

    x.x x.y x.z 0
    y.x y.y y.z 0
    z.x z.y z.z 0
    p.x p.y p.z 1
    

    or

    { x.x x.y x.z 0 y.x y.y y.z 0 z.x z.y z.z 0 p.x p.y p.z 1 }
    
    • x, y, z are 3-component vectors describing the axes of the matrix's coordinate system (the local coordinate system, expressed relative to the global coordinate system).

    • p is a 3-component vector describing the origin of the matrix's coordinate system.

    Which means that the translation matrix should be laid out in memory like this:

    { 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, transX, transY, transZ, 1 }.
    

    Leave it at that, and the rest should be easy.
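
    As a minimal sketch of that layout (the type alias Mat4 and the functions makeIdentity, makeTranslation, transposed and checkAgainstQuestionLayout are hypothetical names chosen for this example), a translation matrix can be built by writing the offset into elements 12, 13 and 14 of the array, counting from 0:

    ```cpp
    #include <array>
    #include <cassert>

    using Mat4 = std::array<float, 16>;

    // Identity matrix: ones on the diagonal (elements 0, 5, 10, 15).
    Mat4 makeIdentity() {
        Mat4 m{};  // value-initialized, so all elements start at zero
        m[0] = m[5] = m[10] = m[15] = 1.0f;
        return m;
    }

    // Translation matrix in the layout shown above: the offset occupies
    // elements 12, 13, 14 (0-based), i.e. the 13th to 15th values.
    Mat4 makeTranslation(float tx, float ty, float tz) {
        Mat4 m = makeIdentity();
        m[12] = tx;
        m[13] = ty;
        m[14] = tz;
        return m;
    }

    // Swaps rows and columns of a flat 4x4 matrix.
    Mat4 transposed(const Mat4& m) {
        Mat4 t{};
        for (int r = 0; r < 4; ++r)
            for (int c = 0; c < 4; ++c)
                t[c * 4 + r] = m[r * 4 + c];
        return t;
    }

    // Self-check: the array written out in the question is the transpose
    // of the layout produced by makeTranslation.
    void checkAgainstQuestionLayout(float tx, float ty, float tz) {
        const Mat4 questionLayout = {1, 0, 0, tx, 0, 1, 0, ty, 0, 0, 1, tz, 0, 0, 0, 1};
        assert(transposed(questionLayout) == makeTranslation(tx, ty, tz));
    }
    ```

    The last function makes the mismatch explicit: the array in the question, with the translation in elements 3, 7 and 11, is the transpose of this layout.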

    Citation from the old OpenGL FAQ:


    9.005 Are OpenGL matrices column-major or row-major?

    For programming purposes, OpenGL matrices are 16-value arrays with base vectors laid out contiguously in memory. The translation components occupy the 13th, 14th, and 15th elements of the 16-element matrix, where indices are numbered from 1 to 16 as described in section 2.11.2 of the OpenGL 2.1 Specification.

    Column-major versus row-major is purely a notational convention. Note that post-multiplying with column-major matrices produces the same result as pre-multiplying with row-major matrices. The OpenGL Specification and the OpenGL Reference Manual both use column-major notation. You can use any notation, as long as it's clearly stated.

    Sadly, the use of column-major format in the spec and blue book has resulted in endless confusion in the OpenGL programming community. Column-major notation suggests that matrices are not laid out in memory as a programmer would expect.


    I'm going to update this 9-year-old answer.

    A mathematical matrix is defined as an m x n matrix, where m is the number of rows and n is the number of columns. For the sake of completeness, rows are horizontal and columns are vertical. When denoting a matrix element in mathematical notation as Mij, the first index (i) is the row index and the second (j) is the column index. When two matrices are multiplied, i.e. A(m x n) * B(m1 x n1), the number of columns of the first argument (A) must match the number of rows of the second (B), so n == m1, and the resulting matrix has the number of rows of the first argument (A) and the number of columns of the second (B), i.e. it is m x n1. Clear so far, yes?
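
    The dimension rule can be sketched in code. This is only an illustration under assumed names (multiply and demoMultiply are not from any library): the shared template parameter N forces the number of columns of a to equal the number of rows of b at compile time.

    ```cpp
    #include <cassert>
    #include <cstddef>

    // Multiplies an M x N matrix by an N x P matrix into an M x P result.
    // A call with mismatched inner dimensions does not compile, because
    // N cannot be deduced consistently from both arguments.
    template <std::size_t M, std::size_t N, std::size_t P>
    void multiply(const float (&a)[M][N], const float (&b)[N][P], float (&out)[M][P]) {
        for (std::size_t i = 0; i < M; ++i) {
            for (std::size_t j = 0; j < P; ++j) {
                float sum = 0.0f;
                for (std::size_t k = 0; k < N; ++k)
                    sum += a[i][k] * b[k][j];  // row i of a times column j of b
                out[i][j] = sum;
            }
        }
    }

    // Example: a 2x3 matrix times a 3x1 matrix yields a 2x1 matrix.
    void demoMultiply() {
        const float a[2][3] = {{1, 2, 3}, {4, 5, 6}};
        const float b[3][1] = {{1}, {2}, {3}};
        float out[2][1] = {};
        multiply(a, b, out);
        assert(out[0][0] == 14.0f);  // 1*1 + 2*2 + 3*3
        assert(out[1][0] == 32.0f);  // 4*1 + 5*2 + 6*3
    }
    ```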

    Now, regarding in-memory layout. You can store a matrix in two ways: row-major and column-major. Row-major means that you effectively have rows laid out one after another, linearly. So elements go from left to right, row after row, kind of like English text. Column-major means that you effectively have columns laid out one after another, linearly. So elements start at the top left and go from top to bottom, column after column.

    Example:

    //matrix
    |a11 a12 a13|
    |a21 a22 a23|
    |a31 a32 a33|
    
    //row-major
    [a11 a12 a13 a21 a22 a23 a31 a32 a33]
    
    //column-major
    [a11 a21 a31 a12 a22 a32 a13 a23 a33]
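
    The two orders differ only in how a (row, column) pair maps to a flat index. Here is a minimal sketch of that mapping for the 3x3 example; the names rowMajorIndex, columnMajorIndex, flattenRowMajor and flattenColumnMajor are invented for this illustration, and an element such as a21 is represented by the integer 21.

    ```cpp
    #include <array>
    #include <cassert>

    constexpr int kRows = 3;
    constexpr int kCols = 3;

    // Flat index of element (row, col) when rows are stored one after another.
    inline int rowMajorIndex(int row, int col) {
        assert(row >= 0 && row < kRows && col >= 0 && col < kCols);
        return row * kCols + col;
    }

    // Flat index of element (row, col) when columns are stored one after another.
    inline int columnMajorIndex(int row, int col) {
        assert(row >= 0 && row < kRows && col >= 0 && col < kCols);
        return col * kRows + row;
    }

    // Copies a 3x3 matrix into a flat array, row after row.
    std::array<int, 9> flattenRowMajor(const int (&m)[kRows][kCols]) {
        std::array<int, 9> flat{};
        for (int row = 0; row < kRows; ++row)
            for (int col = 0; col < kCols; ++col)
                flat[rowMajorIndex(row, col)] = m[row][col];
        return flat;
    }

    // Copies a 3x3 matrix into a flat array, column after column.
    std::array<int, 9> flattenColumnMajor(const int (&m)[kRows][kCols]) {
        std::array<int, 9> flat{};
        for (int row = 0; row < kRows; ++row)
            for (int col = 0; col < kCols; ++col)
                flat[columnMajorIndex(row, col)] = m[row][col];
        return flat;
    }
    ```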
    

    Now, here's the fun part!

    There are two ways to store a 3D transformation in a matrix. As I mentioned before, a matrix in 3D essentially stores the coordinate system's basis vectors and position. So you can store those vectors in the rows or in the columns of a matrix. When they're stored as columns, you multiply the matrix with a column vector, like this:

    //convention #1
    |vx.x vy.x vz.x pos.x|   |p.x|   |res.x|
    |vx.y vy.y vz.y pos.y|   |p.y|   |res.y|
    |vx.z vy.z vz.z pos.z| x |p.z| = |res.z|
    |   0    0    0     1|   |  1|   |res.w| 
    

    However, you can also store those vectors as rows, and then you'll be multiplying a row vector with a matrix:

    //convention #2 (uncommon)
                      | vx.x  vx.y  vx.z 0|   
                      | vy.x  vy.y  vy.z 0|   
    |p.x p.y p.z 1| x | vz.x  vz.y  vz.z 0| = |res.x res.y res.z res.w|
                      |pos.x pos.y pos.z 1|   
    

    So, convention #1 often appears in mathematical texts. Convention #2 appeared in the DirectX SDK at some point. Both are valid.

    And with regard to the question: if you're using convention #1, then your matrices are column-major, and if you're using convention #2, then they're row-major. However, the memory layout is the same in both cases:

    [vx.x vx.y vx.z 0 vy.x vy.y vy.z 0 vz.x vz.y vz.z 0 pos.x pos.y pos.z 1]
    

    Which is why I said, 9 years ago, that it is easier to just memorize which element is which.
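
    To make the "same memory layout" point concrete, here is a minimal sketch under assumed names (Mat4, Vec4, mulColumnVector, mulRowVector and checkConventionsAgree are illustrative, not from any library). Both functions read the same flat array: one treats it as column-major and multiplies by a column vector, the other treats it as row-major and multiplies a row vector by it.

    ```cpp
    #include <array>
    #include <cassert>

    using Mat4 = std::array<float, 16>;
    using Vec4 = std::array<float, 4>;

    // Convention #1: read the flat array column by column (column-major)
    // and compute matrix x column vector.
    Vec4 mulColumnVector(const Mat4& m, const Vec4& p) {
        Vec4 res{};
        for (int row = 0; row < 4; ++row)
            for (int col = 0; col < 4; ++col)
                res[row] += m[col * 4 + row] * p[col];
        return res;
    }

    // Convention #2: read the same flat array row by row (row-major)
    // and compute row vector x matrix.
    Vec4 mulRowVector(const Vec4& p, const Mat4& m) {
        Vec4 res{};
        for (int col = 0; col < 4; ++col)
            for (int row = 0; row < 4; ++row)
                res[col] += p[row] * m[row * 4 + col];
        return res;
    }

    // Both conventions must give the same numbers for the same flat array.
    void checkConventionsAgree(const Mat4& m, const Vec4& p) {
        assert(mulColumnVector(m, p) == mulRowVector(p, m));
    }
    ```

    For example, with the flat array { 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 5, 6, 7, 1 } and the point (1, 2, 3, 1), both functions return (6, 8, 10, 1).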