Tags: python, slice, sparse-matrix, pytorch

Column/row slicing a torch sparse tensor


I have a PyTorch sparse tensor that I need to slice row- and column-wise with the expression [idx][:,idx], where idx is a list of indices. This slice yields my desired result on an ordinary float tensor. Is it possible to apply the same slicing to a sparse tensor? Example here:

#constructing sparse matrix
import numpy as np
import torch

i = np.array([[0,1,2,2],[0,1,2,1]])
v = np.ones(4)
i = torch.from_numpy(i.astype("int64"))
v = torch.from_numpy(v.astype("float32"))
test1 = torch.sparse.FloatTensor(i, v)

#constructing float tensor
from torch import autograd

test2 = np.array([[1,0,0],[0,1,0],[0,1,1]])
test2 = autograd.Variable(torch.cuda.FloatTensor(test2), requires_grad=False)

#slicing
idx = [1,2]
print(test2[idx][:,idx])

output:

Variable containing:
 1  0
 1  1
[torch.cuda.FloatTensor of size 2x2 (GPU 0)]

I am holding a 250,000 x 250,000 adjacency matrix, from which I need to slice n rows and n columns by sampling n random indices. Since the dataset is so large, it is not realistic to convert it to a more convenient datatype.

Can I achieve the same slicing result on test1? Is it even possible? If not, are there any workarounds?

Right now I am running my model with the following "hack" of a solution:

import random

idx = sorted(random.sample(range(0, np.shape(test1)[0]), 9000))
test1 = test1AsCsr[idx][:,idx].todense().astype("int32")
test1 = autograd.Variable(torch.cuda.FloatTensor(test1), requires_grad=False)

Where test1AsCsr is my test1 converted to a numpy CSR matrix. This solution works, but it is very slow and keeps my GPU utilization very low, since it constantly needs to read/write from CPU memory.
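For reference, the conversion to test1AsCsr goes roughly along these lines (a sketch using scipy.sparse):

#rough sketch of the test1 -> test1AsCsr conversion via scipy
from scipy.sparse import coo_matrix

i_np = test1._indices().numpy()
v_np = test1._values().numpy()
test1AsCsr = coo_matrix((v_np, (i_np[0], i_np[1])), shape=tuple(test1.shape)).tocsr()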

Edit: It's fine if the result is a non-sparse tensor.


Solution

  • Well it's been a couple of years since there was activity on this question, but better late than never.

    This is the function I use for slicing sparse tensors. (Helper functions are below)

    def slice_torch_sparse_coo_tensor(t, slices):
        """
        params:
        -------
        t: tensor to slice
        slices: slice for each dimension

        returns:
        --------
        t[slices[0], slices[1], ..., slices[n]]
        """

        t = t.coalesce()
        assert len(slices) == len(t.size())
        for i in range(len(slices)):
            if type(slices[i]) is not torch.Tensor:
                slices[i] = torch.tensor(slices[i], dtype=torch.long, device=t.device)

        indices = t.indices()
        values = t.values()
        for dim, sl in enumerate(slices):
            invert = False
            # if the slice keeps more than ~60% of this dimension, matching
            # against the complement and inverting the mask is cheaper
            if t.size(dim) * 0.6 < len(sl):
                invert = True
                all_nodes = torch.arange(t.size(dim), device=t.device)
                unique, counts = torch.cat([all_nodes, sl]).unique(return_counts=True)
                sl = unique[counts == 1]
            # for large slices, the chunked wrapper avoids the quadratic
            # memory blow-up of a single ainb call
            if sl.size(0) > 400:
                mask = ainb_wrapper(indices[dim], sl)
            else:
                mask = ainb(indices[dim], sl)
            if invert:
                mask = ~mask
            indices = indices[:, mask]
            values = values[mask]

        return torch.sparse_coo_tensor(indices, values, t.size()).coalesce()
    

    Usage (took 2.4s on my machine):

    indices = torch.randint(low= 0, high= 200000, size= (2, 1000000))
    values = torch.rand(size=(1000000,))
    t = torch.sparse_coo_tensor(indices, values, size=(200000, 200000))
    idx = torch.arange(1000)
    slice_torch_sparse_coo_tensor(t, [idx, idx])
    

    out:

    tensor(indices=tensor([[ 13,  62,  66,  78, 134, 226, 233, 266, 299, 344, 349,
                            349, 369, 396, 421, 531, 614, 619, 658, 687, 769, 792,
                            810, 840, 926, 979],
                           [255, 479, 305, 687, 672, 867, 444, 559, 772,  96, 788,
                            980, 423, 699, 911, 156, 267, 721, 381, 781,  97, 271,
                            840, 292, 487, 185]]),
           values=tensor([0.4260, 0.4816, 0.8001, 0.8815, 0.3971, 0.4914, 0.7068,
                          0.2329, 0.4038, 0.1757, 0.7758, 0.3210, 0.2593, 0.8290,
                          0.1320, 0.4322, 0.7529, 0.8341, 0.8128, 0.4457, 0.4100,
                          0.1618, 0.4097, 0.3088, 0.6942, 0.5620]),
           size=(200000, 200000), nnz=26, layout=torch.sparse_coo)
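    Note that the result keeps the full 200000 x 200000 shape and only drops the non-selected entries. Since a dense result is fine for the question, a rough, untested sketch of compacting the sliced tensor to a dense len(idx) x len(idx) block could look like this (it assumes both dimensions were sliced with the same idx, as above):

    def compact_to_dense(t_sliced, idx):
        """Sketch: remap the kept indices to 0..len(idx)-1, then densify"""
        t_sliced = t_sliced.coalesce()
        # lookup[old_id] = position of old_id within idx
        lookup = torch.full((t_sliced.size(0),), -1, dtype=torch.long)
        lookup[idx] = torch.arange(len(idx))
        new_indices = lookup[t_sliced.indices()]
        out = torch.zeros(len(idx), len(idx))
        out[new_indices[0], new_indices[1]] = t_sliced.values()
        return out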
    

    Timings for slice_torch_sparse_coo_tensor:

    %timeit slice_torch_sparse_coo_tensor(t, [torch.randperm(200000)[:500], torch.arange(200000)])
    
    output:
        1.08 s ± 447 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    for the built-in torch.index_select:

    %timeit t.index_select(0, torch.arange(100))
    
    output:
        56.7 s ± 4.87 s per loop (mean ± std. dev. of 7 runs, 1 loop each)
    

    These are the helper functions I use for this purpose. The function ainb finds the elements of a that are in b; I found it on the internet a while ago, but I can't find the original post to link to.

    import math
    import torch

    def ainb(a, b):
        """gets mask for elements of a in b"""

        size = (b.size(0), a.size(0))

        if size[0] == 0: # prevents error in torch.Tensor.max(dim=0) on an empty b
            return torch.tensor([False] * a.size(0), dtype=torch.bool)

        # broadcast a against b so every pair (b_j, a_i) is compared at once
        a = a.expand((size[0], size[1]))
        b = b.expand((size[1], size[0])).T

        # a_i is in b iff it equals at least one element in its column of b
        mask = a.eq(b).max(dim=0).values

        return mask
    
    def ainb_wrapper(a, b, splits = .72):
        # chunk size grows sublinearly with len(a) to cap the memory of each ainb call
        chunk = int(len(a)**splits)

        tmp = [ainb(a[i*chunk:(i+1)*chunk], b) for i in range(math.ceil(len(a) / chunk))]

        return torch.cat(tmp)
    

    Since ainb scales quadratically with the number of elements, I added a wrapper that splits the input into chunks and then concatenates the output. It's more efficient using only the CPU, but I am not sure whether this holds when using a GPU, and would appreciate it if someone could test it :)
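    If anyone wants to try, a rough, untested sketch of such a GPU test could look like this (it assumes a CUDA device is available and reuses the example tensor from the usage section):

    if torch.cuda.is_available():
        # move the example tensor and index sets onto the GPU
        t_gpu = t.cuda().coalesce()
        idx_gpu = torch.randperm(200000, device="cuda")[:500]
        full_gpu = torch.arange(200000, device="cuda")
        # in IPython: %timeit slice_torch_sparse_coo_tensor(t_gpu, [idx_gpu, full_gpu])
        out = slice_torch_sparse_coo_tensor(t_gpu, [idx_gpu, full_gpu])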

    It's my first time posting, so feedback on the quality of the post is also appreciated.