Tags: c++, neural-network, pytorch, reshape, libtorch

Pytorch C++ (libtorch) outputs different results if I change shape


So I'm learning neural networks right now and I noticed something really strange in my network. I have an input layer created like this:

convN1 = register_module("convN1", torch::nn::Conv2d(torch::nn::Conv2dOptions(4, 256, 3).padding(1)));

and an output layer that is a tanh function.

So it's expecting a torch::Tensor of shape {batchSize, 4, sideLength, sideLength}, and it outputs a tensor of just 1 float value.

As such for testing I have created a custom Tensor of shape {4, 15, 15}.
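The contents of customTensor aren't shown here; for the snippets below it can be pictured as any tensor of that shape, for example a hypothetical placeholder like

auto customTensor = torch::rand({ 4, 15, 15 }); // stand-in values, shape {channels, height, width}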

The really weird part is what's happening below

auto inputTensor = torch::zeros({ 1, 4, 15, 15});
inputTensor[0] = customTensor;
std::cout << network->forward(inputTensor); // Outputs something like 0.94142

inputTensor = torch::zeros({ 32, 4, 15, 15});
inputTensor[0] = customTensor;
std::cout << network->forward(inputTensor); // Outputs something like 0.1234 then 0.8543 31 times

So why does the customTensor get two different values from my network just because the batch size has changed? Am I misunderstanding some part of how tensors work?

P.S. I did check and the above block of code was operating under eval mode.

Edit: Since it's been asked, here's a more in-depth look at my network.

convN1 = register_module("convN1", torch::nn::Conv2d(torch::nn::Conv2dOptions(4, 256, 3).padding(1)));
batchNorm1 = register_module("batchNorm1", torch::nn::BatchNorm2d(torch::nn::BatchNormOptions(256)));

m_residualBatch1 = register_module(batch1Name, torch::nn::BatchNorm2d(torch::nn::BatchNormOptions(256)));
m_residualBatch2 = register_module(batch2Name, torch::nn::BatchNorm2d(torch::nn::BatchNormOptions(256)));
m_residualConv1 = register_module(conv1Name, torch::nn::Conv2d(torch::nn::Conv2dOptions(256, 256, 3).padding(1)));
m_residualConv2 = register_module(conv2Name, torch::nn::Conv2d(torch::nn::Conv2dOptions(256, 256, 3).padding(1)));

valueN1 = register_module("valueN1", torch::nn::Conv2d(256, 2, 1));
batchNorm3 = register_module("batchNorm3", torch::nn::BatchNorm2d(torch::nn::BatchNormOptions(2)));
valueN2 = register_module("valueN2", torch::nn::Linear(2 * BOARD_LENGTH, 64));
valueN3 = register_module("valueN3", torch::nn::Linear(64, 1));

And this is how the forward pass works:

torch::Tensor Net::forwadValue(torch::Tensor x)
{
    // Input convolution block
    x = convN1->forward(x);
    x = batchNorm1->forward(x);
    x = torch::relu(x);

    // Residual block
    torch::Tensor residualCopy = x.clone();
    x = m_residualConv1->forward(x);
    x = m_residualBatch1->forward(x);
    x = torch::relu(x);
    x = m_residualConv2->forward(x);
    x = m_residualBatch2->forward(x);
    x += residualCopy;
    x = torch::relu(x);

    // Value head
    x = valueN1->forward(x);
    x = batchNorm3->forward(x);
    x = torch::relu(x);
    x = valueN2->forward(x.reshape({ x.sizes()[0], 30 }));
    x = torch::relu(x);
    x = valueN3->forward(x);
    return torch::tanh(x);
}
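
The BatchNorm layers turn out to be the key to the behaviour above: a BatchNorm2d that is still in training mode normalizes each sample with statistics computed over the current batch, so the 31 extra all-zero samples change the result for the first sample. A standalone sketch illustrating the effect (not code from the question, just an assumed minimal reproduction):

#include <torch/torch.h>
#include <iostream>

int main()
{
    torch::manual_seed(0);
    torch::nn::BatchNorm2d bn(4);
    auto sample = torch::rand({ 1, 4, 15, 15 });
    auto batch = torch::zeros({ 32, 4, 15, 15 });
    batch[0] = sample[0];

    // Training mode: normalization uses the statistics of the current batch,
    // so the same sample produces different values depending on what else
    // is in the batch.
    bn->train();
    auto aloneTrain = bn->forward(sample)[0];
    auto paddedTrain = bn->forward(batch)[0];
    std::cout << (aloneTrain - paddedTrain).abs().max() << '\n'; // clearly non-zero

    // Eval mode: the stored running statistics are used instead, so the
    // output for the sample no longer depends on the rest of the batch.
    bn->eval();
    auto aloneEval = bn->forward(sample)[0];
    auto paddedEval = bn->forward(batch)[0];
    std::cout << (aloneEval - paddedEval).abs().max() << '\n'; // 0
}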

Solution

  • THANK YOU @MichaelJungo, it turns out you were right: one of my BatchNorm2d modules wasn't being set to eval mode. I was unclear on how registering modules worked in the beginning (still am, to an extent), so I overloaded the ::train() function to manually set all my modules to the necessary mode.

    In it I forgot to set one of my BatchNorm modules to the correct mode; doing so made everything consistent. Thanks a bunch!
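
    As a side note, overriding ::train() by hand shouldn't actually be necessary as far as I can tell: torch::nn::Module::train(bool) and ::eval() already recurse into every registered submodule, so calling eval() on the top-level network flips all registered BatchNorm layers at once. A minimal sketch (network standing in for the actual module instance from the question):

    // eval() (i.e. train(false)) propagates to all registered submodules,
    // so every BatchNorm2d switches to its running statistics.
    network->eval();
    std::cout << network->forward(inputTensor);

    // Switch back to training mode before the next optimisation step.
    network->train();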