I'm building a PyTorch model to estimate impulse responses. Currently I am calculating the loss from the real and estimated impulse responses. I would like to convolve both the estimated and real impulse responses with a signal and then calculate the loss from those.
The pyroomacoustics package uses SciPy's fftconvolve to convolve the impulse response with a given signal. I cannot use this since it would break PyTorch's computation graph. PyTorch's conv1d uses cross-correlation. From this answer it seems that by flipping the filter, conv1d can be used for convolution.
I am confused as to why the following code gives a different result for conv1d and convolve, and what must be changed to get the outputs to be equal.
import torch
from scipy.signal import convolve
a = torch.tensor([.1, .2, .3, .4, .5])
b = torch.tensor([.0, .1, .0])
a1 = a.view(1, 1, -1)
b1 = torch.flip(b, (0,)).view(1, 1, -1)
print(torch.nn.functional.conv1d(a1, b1).view(-1))
# >>> tensor([0.0200, 0.0300, 0.0400])
print(convolve(a, b))
# >>> [0. 0.01 0.02 0.03 0.04 0.05 0. ]
Take a look at the mode parameter of scipy.signal.convolve. Use mode='valid' to match PyTorch's conv1d:
In [19]: import numpy as np
In [20]: from scipy.signal import convolve
In [21]: a = np.array([.1, .2, .3, .4, .5])
In [22]: b = np.array([.0, .1, .0])
In [23]: convolve(a, b, mode='valid')
Out[23]: array([0.02, 0.03, 0.04])
To modify the call of PyTorch's conv1d to give the same output as the default behavior of scipy.signal.convolve (i.e. to match mode='full') for this example, set padding=2 in the call to conv1d. More generally, for a given convolution kernel b, set the padding to len(b) - 1.
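As a minimal sketch of the padding approach, the snippet below wraps the flip and the padding in a helper and compares the result with SciPy's default mode='full'. The name full_conv1d is hypothetical (it is not part of PyTorch or SciPy), and setting requires_grad=True on b is only an assumption to illustrate that gradients still flow through conv1d.

```python
import numpy as np
import torch
import torch.nn.functional as F
from scipy.signal import convolve


def full_conv1d(signal: torch.Tensor, kernel: torch.Tensor) -> torch.Tensor:
    """Full linear convolution of two 1-D tensors via conv1d (hypothetical helper)."""
    # conv1d computes cross-correlation, so flip the kernel to get convolution.
    flipped = torch.flip(kernel, (0,)).view(1, 1, -1)
    # Zero-pad len(kernel) - 1 samples on each side to match mode='full'.
    out = F.conv1d(signal.view(1, 1, -1), flipped, padding=kernel.numel() - 1)
    return out.view(-1)


a = torch.tensor([.1, .2, .3, .4, .5])
# requires_grad=True is an assumption, used only to show the graph is preserved.
b = torch.tensor([.0, .1, .0], requires_grad=True)

torch_full = full_conv1d(a, b)
scipy_full = convolve(a.numpy(), b.detach().numpy())  # mode='full' is the default

print(torch_full)
print(scipy_full)
# Compare with a small absolute tolerance, since both run in float32.
print(np.allclose(torch_full.detach().numpy(), scipy_full, atol=1e-6))

# Backpropagate through the convolution to confirm gradients reach the kernel.
torch_full.sum().backward()
print(b.grad)
```

The output has len(a) + len(b) - 1 = 7 samples, the same length as SciPy's full convolution. Because the whole computation stays inside PyTorch, it should be usable in a loss on the convolved signals.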