Tags: python, numpy, pytorch, numpy-ndarray

Numpy indexing with ndarray and PyTorch tensor


I find that NumPy array indexing works differently with an ndarray index and a PyTorch tensor index of shape (1,), and I want to know why. Please see the case below:

import numpy as np
import torch as th

x = np.arange(10)

y = x[np.array([1])]
z = x[th.tensor([1])]
print(y, z)

Here y is array([1]) while z is just 1. What exactly is the difference?


Solution

  • Note that integer tensors of a single element can be converted to an index:

    >>> torch.tensor([1]).__index__()
    1
    >>> torch.tensor([1, 2]).__index__()
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: only integer tensors of a single element can be converted to an index
    

    When the index passed in is a tensor, NumPy does not recognize it as an ndarray, so it attempts to call the tensor's __index__ method. If that conversion succeeds, the index is treated as a plain integer:

    if (PyLong_CheckExact(obj) || !PyArray_Check(obj)) {
        // it calls PyNumber_Index() internally
        npy_intp ind = PyArray_PyIntAsIntp(obj);
    
        if (error_converting(ind)) {
            PyErr_Clear();
        }
        else {
            index_type |= HAS_INTEGER;
            indices[curr_idx].object = NULL;
            indices[curr_idx].value = ind;
            indices[curr_idx].type = HAS_INTEGER;
            used_ndim += 1;
            new_ndim += 0;
            curr_idx += 1;
            continue;
        }
    }
    

    Source code from NumPy.
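
    For intuition, here is a rough Python sketch of that branch (the helper describe_index is made up for illustration and is not part of NumPy's API): a torch tensor is not an ndarray, so NumPy falls back to __index__ and treats it as an integer, whereas an ndarray index skips this branch entirely:

    import operator

    import numpy as np
    import torch as th

    def describe_index(obj):
        # Mirrors the C branch above: integer conversion is only attempted
        # for exact Python ints or for objects that are NOT ndarrays.
        if isinstance(obj, int) or not isinstance(obj, np.ndarray):
            try:
                # operator.index() is the Python-level counterpart of PyNumber_Index()
                ind = operator.index(obj)
            except TypeError:
                pass  # fall through; other index types are tried next
            else:
                return f"integer index {ind}"
        return "array-like index (advanced indexing)"

    print(describe_index(th.tensor([1])))  # integer index 1
    print(describe_index(np.array([1])))   # array-like index (advanced indexing)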

    Note that an ndarray index such as np.array([1]) never enters this branch (it fails the !PyArray_Check(obj) test), so it is handled by advanced (fancy) indexing and yields an array. A single-element tensor, by contrast, is converted to a plain integer, so the OP's z = x[th.tensor([1])] is equivalent to the following code:

    >>> np.arange(10)[1]
    1
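
    As a side note (not part of the original answer), if the array-shaped result is what you want, one option is to convert the tensor to an ndarray first so that NumPy takes the advanced-indexing path:

    >>> x = np.arange(10)
    >>> idx = th.tensor([1])
    >>> x[idx.numpy()]      # ndarray index -> advanced indexing -> array result
    array([1])
    >>> x[np.asarray(idx)]  # np.asarray() also converts the CPU tensor to an ndarray
    array([1])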