pytorch, einops

einops rearrange vs torch view


I am currently implementing the LoFTR model and came across the following code:

feature_c0.shape
-> torch.Size([1, 256, 60, 60])

rearrange(feature_c0, 'n c h w -> n (h w) c').shape
-> torch.Size([1, 3600, 256])

feature_c0.view(1, -1, 256).shape
-> torch.Size([1, 3600, 256])

I thought I understood the functionality of both tensor.view and rearrange. The problem: the outputs of the two differ even though their shapes are the same. I don't really understand what is going on here.


Solution

  • torch.Tensor.view reshapes by reading the elements in their existing memory order, and it infers a missing dimension automatically when you pass -1.

    For example,

    x = torch.arange(24)
    x = x.view(1, 2, 3, 4)
    >
    tensor([[[[ 0,  1,  2,  3],
              [ 4,  5,  6,  7],
              [ 8,  9, 10, 11]],
    
             [[12, 13, 14, 15],
              [16, 17, 18, 19],
              [20, 21, 22, 23]]]])
    
    x_res = x.view(1, -1, 6) # x_res.shape = [1, 4, 6]
    >
    tensor([[[ 0,  1,  2,  3,  4,  5],
             [ 6,  7,  8,  9, 10, 11],
             [12, 13, 14, 15, 16, 17],
             [18, 19, 20, 21, 22, 23]]])
    
x_res = rearrange(x, 'a b c d -> a (b c) d')  # OK: shape = [1, 6, 4]; every output axis is built from named input axes
x_res = rearrange(x, 'a b c d -> a -1 6')     # raises an error: einops patterns have no -1 inference
    

    tensor.view() can still produce a last dimension of 6 because it only regroups the trailing elements in memory order, whereas rearrange() requires every dimension that is reshaped, split, or grouped to be named explicitly in the pattern.
    In your case, view(1, -1, 256) just slices the flattened c * h * w = 256 * 60 * 60 block into consecutive chunks of 256, so each row holds 256 neighbouring spatial values from (mostly) a single channel, not the 256 channel values of one pixel that 'n c h w -> n (h w) c' gives you; the channel axis is never moved to the end (see the sketch below).

    As a result, rearrange is the more explicit choice, and in your case the correct one.
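
    For completeness, here is a minimal sketch of the difference using small illustrative sizes (not the LoFTR shapes). rearrange moves the channel axis to the end before flattening, which you can reproduce in plain torch with permute followed by reshape; view only regroups elements in memory order.

    import torch
    from einops import rearrange

    n, c, h, w = 1, 3, 2, 2
    x = torch.arange(n * c * h * w).view(n, c, h, w)

    a = rearrange(x, 'n c h w -> n (h w) c')        # channel axis moved last, then flattened
    b = x.view(n, -1, c)                            # raw memory sliced into rows of length c
    d = x.permute(0, 2, 3, 1).reshape(n, h * w, c)  # plain-torch equivalent of the rearrange

    print(torch.equal(a, b))  # False -- same shape, different contents
    print(torch.equal(a, d))  # True  -- permute + reshape matches rearrange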