I have a few PyTorch examples that confuse me, and I'm hoping to clear them up.
Firstly, as per the PyTorch broadcasting page, I would expect these examples to work just as their numpy equivalents do. The first example is very intuitive. These are compatible for broadcasting:
Image (3d array): 256 x 256 x 3
Scale (1d array): 3
Result (3d array): 256 x 256 x 3
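In numpy this shape combination does broadcast out of the box; a quick sketch of the table above (the array names are illustrative, not from the original page):

```python
import numpy as np

image = np.random.rand(256, 256, 3)   # Image (3d array): 256 x 256 x 3
scale = np.array([1.0, 2.0, 3.0])     # Scale (1d array): 3

# Trailing dimensions match (3 == 3); the missing leading dims of
# `scale` are treated as size 1 and stretched to 256 x 256.
result = image / scale
print(result.shape)                   # (256, 256, 3)
```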
Take for instance these:
torch.Tensor([[1,2,3]])/torch.Tensor([1,2,3])
Out[5]:
1 1 1
[torch.FloatTensor of size 1x3]
torch.Tensor([[[1,2,3]]])/torch.Tensor([1,2,3])
Out[7]:
(0 ,.,.) =
1 1 1
[torch.FloatTensor of size 1x1x3]
However, here is what the PyTorch version of the numpy example above produces:
torch.randn(256,256,3)/torch.Tensor([1,2,3])
Traceback (most recent call last):
File "<ipython-input-12-4c328d453e24>", line 1, in <module>
torch.randn(256,256,3)/torch.Tensor([1,2,3])
File "/home/nick/anaconda3/lib/python3.6/site-packages/torch/tensor.py", line 313, in __div__
return self.div(other)
RuntimeError: inconsistent tensor size at /opt/conda/conda-bld/pytorch_1501971235237/work/pytorch-0.1.12/torch/lib/TH/generic/THTensorMath.c:873
Here is an excerpt which says this ought to work:
Two tensors are “broadcastable” if the following rules hold:
- Each tensor has at least one dimension.
- When iterating over the dimension sizes, starting at the trailing dimension, the dimension sizes must either be equal, one of them is 1, or one of them does not exist.
If one converts the tensors to numpy arrays the arithmetic works as expected.
What's going wrong? Have I misinterpreted the docs and if so what syntax results in the same results?
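The numpy round-trip mentioned above can be sketched as follows (a minimal sketch; `torch.from_numpy` and `.numpy()` are assumed available, which they were in 0.1.x):

```python
import torch

a = torch.randn(256, 256, 3)
b = torch.Tensor([1, 2, 3])

# Dividing the underlying numpy arrays broadcasts as expected;
# the result is then wrapped back into a tensor.
result = torch.from_numpy(a.numpy() / b.numpy())
print(result.size())  # torch.Size([256, 256, 3])
```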
Verify that you are using the correct version of PyTorch. It must be at least 0.2.0, which is when broadcasting was introduced; the traceback above shows you are running 0.1.12.
In[2]: import torch
In[3]: torch.__version__
Out[3]: '0.2.0_4'
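If you cannot upgrade, the same result can be produced by expanding the smaller tensor explicitly. A minimal sketch, assuming `unsqueeze` and `expand_as` (both present before 0.2.0):

```python
import torch

a = torch.randn(256, 256, 3)
b = torch.Tensor([1, 2, 3])

# Manually replicate broadcasting: add leading singleton dims to get
# a 1x1x3 view, then expand (a zero-copy view) to a's full shape.
result = a / b.unsqueeze(0).unsqueeze(0).expand_as(a)
print(result.size())  # torch.Size([256, 256, 3])
```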