neural-network conv-neural-network pytorch convolution torch

Performing Convolution (NOT cross-correlation) in pytorch

I have a network that I am trying to implement in pytorch, and I cannot seem to figure out how to implement "pure" convolution. In tensorflow it could be accomplished like this:

def conv2d_flipkernel(x, k, name=None):
    return tf.nn.conv2d(x, flipkernel(k), name=name,
                        strides=(1, 1, 1, 1), padding='SAME')

With the flipkernel function being:

def flipkernel(kern):
      return kern[(slice(None, None, -1),) * 2 + (slice(None), slice(None))]

How can something similar be done in pytorch?

Solution

TLDR Use the convolution from the functional toolbox, torch.nn.fuctional.conv2d, not torch.nn.Conv2d, and flip your filter around the vertical and horizontal axis.

torch.nn.Conv2d is a convolutional layer for a network. Because weights are learned, it does not matter if it is implemented using cross-correlation, because the network will simply learn a mirrored version of the kernel (Thanks @etarion for this clarification).

torch.nn.fuctional.conv2d performs convolution with the inputs and weights provided as arguments, similar to the tensorflow function in your example. I wrote a simple test to determine whether, like the tensorflow function, it is actually performing cross-correlation and it is necessary to flip the filter for correct convolutional results.

import torch
import torch.nn.functional as F
import torch.autograd as autograd
import numpy as np

#A vertical edge detection filter. 
#Because this filter is not symmetric, for correct convolution the filter must be flipped before element-wise multiplication
filters = autograd.Variable(torch.FloatTensor([[[[-1, 1]]]]))

#A test image of a square
inputs = autograd.Variable(torch.FloatTensor([[[[0,0,0,0,0,0,0], [0, 0, 1, 1, 1, 0, 0], 
                                             [0, 0, 1, 1, 1, 0, 0], [0, 0, 1, 1, 1, 0, 0],
                                            [0,0,0,0,0,0,0]]]]))
print(F.conv2d(inputs, filters))

This outputs

Variable containing:
(0 ,0 ,.,.) = 
  0  0  0  0  0  0
  0  1  0  0 -1  0
  0  1  0  0 -1  0
  0  1  0  0 -1  0
  0  0  0  0  0  0
[torch.FloatTensor of size 1x1x5x6]

This output is the result for cross-correlation. Therefore, we need to flip the filter

def flip_tensor(t):
    flipped = t.numpy().copy()

    for i in range(len(filters.size())):
        flipped = np.flip(flipped,i) #Reverse given tensor on dimention i
    return torch.from_numpy(flipped.copy())

print(F.conv2d(inputs, autograd.Variable(flip_tensor(filters.data))))

The new output is the correct result for convolution.

Variable containing:
(0 ,0 ,.,.) = 
  0  0  0  0  0  0
  0 -1  0  0  1  0
  0 -1  0  0  1  0
  0 -1  0  0  1  0
  0  0  0  0  0  0
[torch.FloatTensor of size 1x1x5x6]