python, tensorflow, pytorch, torch

Equivalent of tf.keras.layers.Input in PyTorch


What is the equivalent of

ops = tf.keras.layers.Input(
    shape=[hilbert_size, hilbert_size, num_points * 2], name="operators"
)
inputs = tf.keras.Input(shape=(num_points,), name="inputs")

from TensorFlow in PyTorch?

I'm writing a cGAN in PyTorch, translating code I originally wrote in TensorFlow. What I did in torch is:

    ops = nn.Parameter(torch.empty(1, hilbert_size, hilbert_size, num_points * 2))
    inputs = torch.empty((1, 1296), requires_grad=True)

The outputs in tensorflow are:

In [59]: ops
Out[59]: <KerasTensor: shape=(None, 16, 16, 2592) dtype=float64 (created by layer 'operators')>

In [60]: inputs
Out[60]: <KerasTensor: shape=(None, 1296) dtype=float64 (created by layer 'inputs')>

But the code in torch gives:

In [6]: ops
Out[6]: 
Parameter containing:
tensor([[[[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.]],

         ...,

         [[0., 0., 0.,  ..., 0., 0., 0.],
          [0., 0., 0.,  ..., 0., 0., 0.],
          ...,
          [0., 0., 0.,  ..., 0., 0., 0.]]]], requires_grad=True)

In [7]: inputs
Out[7]: 
tensor([[-1.8727e+27,  4.5817e-41,  1.3275e-17,  ...,  4.5817e-41,
         -7.5189e+25,  4.5817e-41]], requires_grad=True)

Is this right? Is this the correct way to instantiate inputs for a neural network in torch? It makes sense to me that inputs is a vector of shape torch.Size([1, 1296]), but the corresponding TensorFlow code gives TensorShape([None, 1296]), and the Out[60] line above returns an object created by a layer rather than a tensor like the torch output. I'm not sure if I'm going about this the right way.


Solution

  • Your observations about the differences between TensorFlow and PyTorch are correct. More specifically:

    1. TensorFlow's Input vs. PyTorch's tensor:

      • In TensorFlow, tf.keras.layers.Input does not hold any data itself; it only declares the shape and dtype of the input a model will expect. It returns a KerasTensor, which is a symbolic placeholder for the actual data.
      • In PyTorch, tensors are multi-dimensional arrays that can hold actual data. Therefore, when you define a tensor like torch.empty((1, 1296), requires_grad=True), you're creating an actual tensor with uninitialized data of shape [1, 1296].
    2. Differences in Batch Dimension:

      • TensorFlow's Input uses None for the batch dimension to indicate that the batch size is dynamic and can change.
      • In PyTorch, if you define a tensor of shape [1, 1296], the batch size is fixed at 1. There is no placeholder mechanism: the batch dimension is simply whatever size the tensor you pass in happens to have, so the same model handles variable batch sizes during training and inference (see the sketch after this list).
    3. Instantiating Inputs in PyTorch:

      • The way you've instantiated inputs with torch.empty works, but it creates an uninitialized tensor. If you want deterministic contents, use torch.zeros (or fill it with real data) instead.
      • The nn.Parameter is generally used for tensors that should be optimized, like weights and biases in a model. If ops should be optimized during training (like model weights), then using nn.Parameter is correct.
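
    To make the batch-dimension point concrete, here is a minimal sketch (the layer sizes are illustrative, not taken from your model). A PyTorch layer fixes only the feature dimensions; the leading batch dimension is whatever the incoming tensor provides, which is what TensorFlow's None expresses symbolically:

    import torch
    import torch.nn as nn

    # The layer fixes only the feature size (1296 in, 64 out, chosen for
    # illustration); the leading batch dimension is free.
    layer = nn.Linear(1296, 64)

    print(layer(torch.randn(4, 1296)).shape)   # torch.Size([4, 64])
    print(layer(torch.randn(32, 1296)).shape)  # torch.Size([32, 64])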

    Given your context, here's a closer match to the TensorFlow code:

    # Zero-initialized stand-in for the "operators" input (batch size 1 here):
    ops = torch.zeros(1, hilbert_size, hilbert_size, num_points * 2)
    # If ops should be optimized during training:
    ops = nn.Parameter(ops)

    # Stand-in for the "inputs" vector (num_points = 1296 in your example);
    # requires_grad only if you need gradients with respect to the input itself:
    inputs = torch.zeros(1, num_points, requires_grad=True)
    

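    One subtlety worth making explicit: wrapping ops in nn.Parameter only takes effect once it is assigned as an attribute of an nn.Module, because that is what registers it with model.parameters() so the optimizer can update it. A minimal sketch (the class name is hypothetical; the sizes match your KerasTensor shapes):

    import torch
    import torch.nn as nn

    class OpsHolder(nn.Module):  # hypothetical name, for illustration only
        def __init__(self, hilbert_size, num_points):
            super().__init__()
            # Assigning an nn.Parameter as a module attribute registers it,
            # so the optimizer will see and update it.
            self.ops = nn.Parameter(
                torch.zeros(1, hilbert_size, hilbert_size, num_points * 2)
            )

    m = OpsHolder(hilbert_size=16, num_points=1296)
    print(sum(p.numel() for p in m.parameters()))  # 16 * 16 * 2592 = 663552
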
    Keep in mind:

    • In real training or inference scenarios, you'll pass actual data tensors of the appropriate shape and dtype through the model's forward method; no Input-style declaration is needed (see the sketch after this list).
    • If ops is supposed to be a parameter of the model (something you want the optimizer to update), then using nn.Parameter is correct. If not, you should treat it like any other tensor input.
    • You should ensure that tensors are initialized before use. Uninitialized tensors created with torch.empty contain arbitrary memory contents (as your Out[7] shows) and can lead to unpredictable results if they aren't subsequently overwritten with valid values.
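
    Putting it together, the forward-method pattern looks like this. The class name, layer sizes, and the way ops and inputs are combined are all assumptions for illustration, not your actual cGAN architecture:

    import torch
    import torch.nn as nn

    class Generator(nn.Module):  # hypothetical stand-in for your cGAN generator
        def __init__(self, hilbert_size, num_points):
            super().__init__()
            self.fc = nn.Linear(num_points, hilbert_size * hilbert_size)

        def forward(self, ops, inputs):
            # ops:    (batch, hilbert_size, hilbert_size, num_points * 2)
            # inputs: (batch, num_points) -- batch size is implicit, like TF's None
            h = self.fc(inputs)
            return h.view(inputs.shape[0], ops.shape[1], ops.shape[2])

    g = Generator(hilbert_size=16, num_points=1296)
    ops = torch.zeros(8, 16, 16, 2592)   # any batch size works
    inputs = torch.zeros(8, 1296)
    print(g(ops, inputs).shape)          # torch.Size([8, 16, 16])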