
TensorDataset gives error 'int' object is not callable, but all inputs are numpy arrays


I'm translating a TensorFlow project to PyTorch and have run into a strange error. When I try to create a TensorDataset like this:

train_data = TensorDataset(
    self.x_train[idx_train], 
    self.covariates_train[idx_train],
    self.y_train[idx_train]
)

PyTorch raises TypeError: 'int' object is not callable. The corresponding TensorFlow code works. I also checked the inputs, and they all have the expected shapes. Specifically, the shapes for each argument are:

Argument 1:  (200, 1, 40, 26) (180,) (180, 1, 40, 26)
Argument 2:  (200, 2) (180, 2)
Argument 3:  (200, 51) (180, 51)

These shapes match the corresponding TensorFlow code at the analogous point in the code (i.e., when I'm about to feed these same inputs into tf.data.Dataset.from_tensor_slices).


Solution

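  • torch.utils.data.TensorDataset expects torch.Tensor inputs. Its constructor calls tensor.size(0) on each argument, but on a NumPy array size is an integer attribute rather than a method, which is why you get 'int' object is not callable. Convert the inputs to torch.Tensor first: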

    import torch
    import numpy as np
    
    x_train = np.random.uniform(size=(200, 1, 40, 26))
    y_train = np.random.uniform(size=(200, 51))
    
    dataset = torch.utils.data.TensorDataset(
        torch.from_numpy(x_train),
        torch.from_numpy(y_train)
    )
    

    That's not required with tf.data.Dataset.from_tensor_slices, which accepts NumPy arrays as well as tf.Tensor and converts them itself.
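
    Applied to the three arrays from the question, a minimal sketch could look like the following. The data here is random placeholder data with the reported shapes, and idx_train is a hypothetical index array of shape (180,), since the original one isn't shown. The .float() calls are optional and only assume the model expects float32 inputs (NumPy defaults to float64).

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Placeholder data with the shapes reported in the question.
x_train = np.random.uniform(size=(200, 1, 40, 26))
covariates_train = np.random.uniform(size=(200, 2))
y_train = np.random.uniform(size=(200, 51))

# Hypothetical index array of shape (180,); the real one is not shown.
idx_train = np.random.permutation(200)[:180]

# On a NumPy array, size is a plain integer, not a method like Tensor.size().
print(type(x_train.size))

# Index with NumPy first, then convert each slice to a tensor.
# .float() casts float64 to float32 (an assumption about what the model needs).
train_data = TensorDataset(
    torch.from_numpy(x_train[idx_train]).float(),
    torch.from_numpy(covariates_train[idx_train]).float(),
    torch.from_numpy(y_train[idx_train]).float(),
)

print(len(train_data))          # 180
print(train_data[0][0].shape)   # torch.Size([1, 40, 26])
```

    Each item of train_data is a tuple (x, covariates, y), so it can be passed to torch.utils.data.DataLoader in the usual way.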