Tags: swift, augmented-reality, scenekit, arkit, realitykit

Understanding ARKit World Transform Matrices


In ARKit, when I perform a hit-test, I get back an instance of ARHitTestResult. One of its properties is worldTransform, which I understand is a 4x4 transformation matrix (a simd_float4x4) describing the position of the object.

As someone who is very unfamiliar with linear algebra and 3D graphics, how would I edit this matrix to, say, increase its y coordinate by 0.05?

If there is a blog post or something I could look at that would help me wrap my head around this, please let me know. I am not sure what terms I should be googling.

Sorry if my question is full of misunderstandings! As you can probably tell, I am not too familiar with these concepts.

Thank you to anyone who helps.


Solution

  • EDIT: The original question is best addressed by just adding 0.05 to the y component of the node's position. However, the original answer below does address a bit about composing transformation matrices, if that is something you are interested in.
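For reference, here is a minimal sketch of that simpler approach using only the simd module. The translation of a transform is stored in its fourth column, so raising the object by 0.05 amounts to adding to that column's y component. The starting matrix is a made-up stand-in for a hit-test result's worldTransform, not real ARKit output; with a SceneKit node you would adjust the node's position instead.

```swift
import simd

// Hypothetical stand-in for ARHitTestResult.worldTransform:
// no rotation, positioned at (1, 2, 3).
var worldTransform = matrix_identity_float4x4
worldTransform.columns.3 = SIMD4<Float>(1, 2, 3, 1)

// The fourth column (index 3) holds the translation; raise it by 0.05 along y.
worldTransform.columns.3.y += 0.05

// Read the position back out of the matrix.
let t = worldTransform.columns.3
let position = SIMD3<Float>(t.x, t.y, t.z)
print(position)
```

Only the translation column changes, so any rotation or scale stored in the other three columns is left alone.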

    ======================================================================

    If you want to apply an operation to a matrix, the simplest way is to build a matrix that performs that operation, and then multiply your original matrix by it.

    For a translation, assuming you want to translate by x, y, z, you can do this:

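    import simd

    // Translation offsets; for the question's example: x = 0, y = 0.05, z = 0.
    let (x, y, z): (Float, Float, Float) = (0, 0.05, 0)
    // SIMD4<Float> is the current spelling of the older float4 alias.
    let translation = simd_float4x4(
        SIMD4<Float>(1, 0, 0, 0),
        SIMD4<Float>(0, 1, 0, 0),
        SIMD4<Float>(0, 0, 1, 0),
        SIMD4<Float>(x, y, z, 1)
    )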
    

    Note that this is just an identity matrix (1s down the diagonal) with the last column set to contain the x/y/z values. Important: the four vectors above are COLUMNS, not ROWS, even though they look like rows on the page. You can research homogeneous coordinates further, but for now think of this as simply how a translation is represented.
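To make the column layout and the role of the fourth component concrete, here is a small sketch that applies such a matrix to a point and to a direction. The sample vectors are arbitrary values chosen for illustration, and the offset is the question's 0.05 along y.

```swift
import simd

let translation = simd_float4x4(
    SIMD4<Float>(1, 0, 0, 0),    // column 0
    SIMD4<Float>(0, 1, 0, 0),    // column 1
    SIMD4<Float>(0, 0, 1, 0),    // column 2
    SIMD4<Float>(0, 0.05, 0, 1)  // column 3: the translation
)

// Columns are addressed directly; the translation is the last one.
let offset = translation.columns.3

// A point has w = 1, so the translation is applied to it.
let point = SIMD4<Float>(1, 2, 3, 1)
let movedPoint = translation * point          // y becomes roughly 2.05

// A direction has w = 0, so the translation leaves it unchanged.
let direction = SIMD4<Float>(0, 0, -1, 0)
let movedDirection = translation * direction  // still (0, 0, -1, 0)
```

That w component is what homogeneous coordinates add: it lets a single 4x4 matrix move points while leaving pure directions untouched.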

    Then, in simd, multiply the two: let newWorldTransform = translation * oldWorldTransform. The result is the old world transform translated by your x/y/z values (in your example, [x, y, z] = [0, 0.05, 0]).
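The order of the multiplication matters once the transform includes a rotation. The sketch below uses a hypothetical oldWorldTransform (rotated 90 degrees about z and placed at (1, 2, 3), not real hit-test data) to compare the two orders.

```swift
import simd

// Hypothetical stand-in for a hit-test worldTransform: rotated 90 degrees
// about the z axis and positioned at (1, 2, 3).
var oldWorldTransform = simd_float4x4(
    simd_quatf(angle: .pi / 2, axis: SIMD3<Float>(0, 0, 1))
)
oldWorldTransform.columns.3 = SIMD4<Float>(1, 2, 3, 1)

// Translation by 0.05 along y, built from the identity matrix.
var translation = matrix_identity_float4x4
translation.columns.3 = SIMD4<Float>(0, 0.05, 0, 1)

// Translation on the left: the offset is applied along the world's y axis.
let movedInWorld = translation * oldWorldTransform
// Position is approximately (1, 2.05, 3).

// Translation on the right: the offset follows the object's own rotated y axis.
let movedInLocal = oldWorldTransform * translation
// Position is approximately (0.95, 2, 3).

print(movedInWorld.columns.3, movedInLocal.columns.3)
```

With translation on the left, as in the answer, the y coordinate in world space grows by 0.05 regardless of how the hit-test result is oriented.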

    However, it may be worth exploring why you want to edit your hit test results. I cannot think of a practical use case for that, so maybe if you explain a bit more about what you are trying to do I could suggest a more intuitive way to do it.