
Is there a difference between torch.IntTensor and torch.Tensor


When using PyTorch tensors, is there a point in initializing my data like so:

X_tensor: torch.IntTensor = torch.IntTensor(X)
Y_tensor: torch.IntTensor = torch.IntTensor(Y)

Or should I just do the 'standard':

X_tensor: torch.Tensor = torch.Tensor(X)
Y_tensor: torch.Tensor = torch.Tensor(Y)

even though I know X: list[list[int]] and Y: list[list[int]]?


Solution

  • Using torch.IntTensor() or torch.Tensor(), you end up with

    • either a tensor that holds signed integer values and requires 32 bits per value,
    • or a tensor that holds 32-bit floating-point numbers, since torch.Tensor returns (respectively, is an alias for) a torch.FloatTensor.

    Using torch.tensor(X) (with only integers in X), on the other hand, leads to a 64-bit integer tensor by default, as torch.tensor() infers the data type automatically.

    import torch
    
    X = [[1, 2], [3, 4]]
    
    x1 = torch.IntTensor(X)
    x2 = torch.Tensor(X)
    x3 = torch.tensor(X)
    
    print(x1.dtype)  # torch.int32
    print(x2.dtype)  # torch.float32
    print(x3.dtype)  # torch.int64
    

    What you need depends on what you want to do with the data. For computations in neural networks, tensors with 32-bit floating-point precision are usually used.
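
    For example, if the data is going to be fed into a neural network, a minimal sketch (reusing X from the question) would be to set the dtype explicitly when creating the tensor:

    X = [[1, 2], [3, 4]]

    x_float = torch.tensor(X, dtype=torch.float32)
    # equivalent: torch.tensor(X).float()
    print(x_float.dtype)  # torch.float32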

    That said, PyTorch automatically promotes data to the larger type if data types are mixed within a calculation. So this works:

    c = 3.1
    print(x1*c, (x1*c).dtype)  # tensor([[ 3.1000,  6.2000], [ 9.3000, 12.4000]]) torch.float32
    print(x2*c, (x2*c).dtype)  # tensor([[ 3.1000,  6.2000], [ 9.3000, 12.4000]]) torch.float32
    
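    If you want to check which dtype a mixed operation will produce without actually running it, torch.promote_types() and torch.result_type() can be used (a small sketch reusing x1 and x2 from above):

    print(torch.promote_types(torch.int32, torch.float32))  # torch.float32
    print(torch.result_type(x1, x2))                        # torch.float32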

    But this also works, though the result is "wrong": converting the float values to int16 truncates them all to 0, so it is better to start directly with the precision required.

    data_float32 = torch.tensor([0.1, 0.2, 0.3])
    data_int16 = data_float32.to(torch.short)  # torch.short is int16; the values are truncated to tensor([0, 0, 0])
    data_squared = data_float32 * data_int16
    print(data_squared, data_squared.dtype)  # tensor([0., 0., 0.]) torch.float32
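
    To avoid that kind of silent information loss, you could (as a sketch) create the tensor with the intended precision from the start:

    data = torch.tensor([0.1, 0.2, 0.3], dtype=torch.float32)
    data_squared = data * data
    print(data_squared, data_squared.dtype)  # tensor([0.0100, 0.0400, 0.0900]) torch.float32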