lua torch tensor

Torch: delete tensor columns by indices


I would like to delete tensor columns by an array or tensor of indices. For example:

th> X = torch.rand(2,4)

th> X
 0.7475  0.2512  0.6085  0.6414
 0.7143  0.8299  0.2929  0.6945
[torch.DoubleTensor of size 2x4]

th> indices = torch.zeros(2)

th> indices[1] = 1

th> indices[2] = 3

th> indices
 1
 3
[torch.DoubleTensor of size 2]

th> X:delete(indices)
 0.2512  0.6414
 0.8299  0.6945

Solution

  • Strangely, there is no built-in function for that. It is not a trivial operation, however. Torch tensors do not necessarily store their elements contiguously, but they must store them periodically: the stride must be constant along each dimension. A tensor with arbitrary columns removed generally breaks that regularity, so it cannot be expressed as a view of the original storage.
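
    To make the stride constraint concrete, here is a minimal sketch, assuming Torch7 is installed and loadable with require 'torch'. It only uses stride, narrow and isContiguous; the variable names are illustrative.

    ```lua
    require 'torch'

    local x = torch.Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}

    -- A fresh 2x4 tensor is stored row by row: moving one row down skips
    -- 4 elements in the storage, moving one column right skips 1.
    print(x:stride(1), x:stride(2))

    -- narrow(dim, firstIndex, size) selects columns 2..3 without copying.
    -- The result keeps the parent's strides, so it is no longer contiguous.
    local cols = x:narrow(2, 2, 2)
    print(cols:isContiguous())
    print(cols:stride(1), cols:stride(2))

    -- The view shares storage with x, so writing through it changes x too.
    cols[1][1] = -1
    print(x[1][2])
    ```

    A range of adjacent columns fits this scheme because the step between kept elements stays constant. A selection such as columns 1, 3 and 4 does not, which is why it needs a copy.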

    If you need a tensor without a few columns or rows, the simplest way is to use index with the indices you want to keep:

     x = torch.Tensor{{1,2,3,4},{5,6,7,8}}
     -- keep columns 1, 3 and 4, i.e. drop column 2
     y = x:index(2, torch.LongTensor{1,3,4})
     -- y is now:
     --  1  3  4
     --  5  7  8
    

    This returns a copy of the selected columns rather than a view, since there is no efficient way to keep track of all the elements that should be skipped.
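
    The question asks for deletion by a list of indices, while index expects the indices to keep. One way to bridge the two is sketched below. deleteColumns is a hypothetical helper name, not part of Torch, and the sketch assumes a 2D tensor, 1-based column indices, and that at least one column remains.

    ```lua
    require 'torch'

    -- Hypothetical helper (not a Torch function): returns a copy of x
    -- without the columns listed in indices (a 1D tensor of column numbers).
    local function deleteColumns(x, indices)
      -- Mark the columns that should be dropped.
      local drop = {}
      for i = 1, indices:size(1) do
        drop[indices[i]] = true
      end

      -- Collect the remaining column numbers in their original order.
      local keep = {}
      for j = 1, x:size(2) do
        if not drop[j] then
          keep[#keep + 1] = j
        end
      end

      -- index copies the kept columns into a new tensor.
      -- Assumption: keep is not empty.
      return x:index(2, torch.LongTensor(keep))
    end

    local X = torch.Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}
    local indices = torch.Tensor{1, 3}
    print(deleteColumns(X, indices))
    ```

    Because index copies, X itself is left unchanged.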

    If you do not want to use additional memory, you can get rid of a column in place using slices and views:

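    x[{{},{2,3}}] = x[{{},{3,4}}]  -- shift columns 3 and 4 one place left, overwriting column 2
    x = x:narrow(2, 1, 3)          -- keep the first 3 columns as a 2x3 window on the same storage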
    

    This involves moving all the data that follows the removed column. If you want to delete several columns, the shifts can be combined so that each element is moved only once. This does not shrink the memory used by the tensor, however. As far as I know, it is impossible to reduce the memory usage without moving the needed data to a new storage.
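
    For several columns, one possible form of that optimization is a single compaction pass, sketched below. deleteColumnsInPlace is a hypothetical helper name, not a Torch function. The sketch assumes a 2D tensor, 1-based indices, and that at least one column is kept.

    ```lua
    require 'torch'

    -- Hypothetical helper (not a Torch function): compacts the kept columns
    -- to the left inside x's own storage and returns a narrowed view of x.
    local function deleteColumnsInPlace(x, indices)
      local drop = {}
      for i = 1, indices:size(1) do
        drop[indices[i]] = true
      end

      local write = 1  -- next free column position
      for read = 1, x:size(2) do
        if not drop[read] then
          if write ~= read then
            -- Copy one whole column; source and destination never overlap.
            x:select(2, write):copy(x:select(2, read))
          end
          write = write + 1
        end
      end

      -- Assumption: at least one column was kept, so write - 1 >= 1.
      return x:narrow(2, 1, write - 1)
    end

    local x = torch.Tensor{{1, 2, 3, 4}, {5, 6, 7, 8}}
    local y = deleteColumnsInPlace(x, torch.LongTensor{1, 3})
    print(y)

    -- y still sits on the original storage; only a clone gets a smaller one.
    local z = y:clone()
    print(y:storage():size(), z:storage():size())
    ```

    The original tensor x is overwritten in the process, and the result still occupies the full original storage until it is cloned.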