r dataframe optimization statistics memory-efficient

How to run a Wilcoxon test on all combinations of groups across a large number of features in R?

I have a large sparse matrix (let's call it matrix) in which the rows are features and the columns are samples. Each column/sample belongs to 1 of 6 groups. I randomly sample a number of columns from each group and store their indices in the original matrix.

sampled <- Map(sample, row_index, num_sample)  # sample once; [[ ]] extracts the index vector
astro_index <- sampled[[1]]
endo_index <- sampled[[2]]
micro_index <- sampled[[3]]
neuron_index <- sampled[[4]]
oligo_index <- sampled[[5]]
opc_index <- sampled[[6]]

The goal is to perform a Wilcoxon test and get the p-value for every pairwise combination of the 6 groups, for each feature. The big issue is that I have over 30,000 features to test on all combinations of the 6 groups (that is 15 comparisons for each of the 30,000+ features).
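The count of 15 follows from choosing 2 groups out of 6. A minimal sketch in base R, using the group labels that appear in the data frame further down (the variable names here are only for illustration):

```r
# Group labels as used in the question's data frame
groups <- c("Astro", "Endo", "Micro", "Neuron", "Oligo", "OPC")

# Number of unordered pairs: 6 choose 2
n_pairs <- choose(length(groups), 2)

# Enumerate the pairs themselves, one label per comparison
pair_labels <- combn(groups, 2, FUN = paste, collapse = "-")

n_pairs
pair_labels
```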

I currently have two methods. The first uses the apply function and handles only one comparison (here the astro and neuron groups). The disadvantage of this method is that I run into memory issues and it only does 1 comparison at a time; I would have to write this 14 more times to get all possible comparisons.

store_p <- apply(matrix,1,function(x) {wilcox.test(x[astro_index],x[neuron_index])$p.value })
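The single comparison above could be generalized to all 15 pairs without writing it out 14 more times. The following is only a sketch on a small dense toy matrix; mat, idx and grp_pairs are hypothetical names, not objects from the question. It removes the copy-paste but not the cost: it still runs one wilcox.test per feature and pair, and apply converts a sparse Matrix to a dense one, which is likely where the memory pressure comes from.

```r
set.seed(1)

# Toy stand-in for the real data: 5 features x 60 samples (hypothetical sizes)
mat <- matrix(rnorm(5 * 60), nrow = 5,
              dimnames = list(paste0("feature", 1:5), NULL))

# Hypothetical sampled column indices, one integer vector per group
idx <- split(seq_len(ncol(mat)),
             rep(c("Astro", "Endo", "Micro", "Neuron", "Oligo", "OPC"), each = 10))

# All unordered pairs of group names, as a 2 x 15 character matrix
grp_pairs <- combn(names(idx), 2)

# One row of p-values per feature, one column per comparison
p_mat <- t(apply(mat, 1, function(x) {
  apply(grp_pairs, 2, function(p) {
    wilcox.test(x[idx[[p[1]]]], x[idx[[p[2]]]])$p.value
  })
}))
colnames(p_mat) <- apply(grp_pairs, 2, paste, collapse = "-")

dim(p_mat)
```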

The second method uses a for loop to go through all the features, using combn and a data frame to calculate the p-values for all combinations, one feature at a time. This method is really slow but does not crash.

hold_list <- list()  # initialize the container before the loop
for (i in features) {
  df <- data.frame('Astro' = matrix[i, astro_index], 'Endo' = matrix[i, endo_index],
                   'Micro' = matrix[i, micro_index], 'Neuron' = matrix[i, neuron_index],
                   'Oligo' = matrix[i, oligo_index], 'OPC' = matrix[i, opc_index])
  # label each pair and attach its p-value, e.g. "Astro-Endo : 0.115"
  result <- combn(names(df), 2, FUN = function(x)
    paste(paste(x, collapse = '-'),
          wilcox.test(df[, x[1]], df[, x[2]], paired = TRUE)$p.value, sep = " : "))
  hold_list <- append(hold_list, list(result))
}

To give a sense of what the result looks like, here is a sample output of result:

> result
 [1] "Astro-Endo : 0.115331575924872"      "Astro-Micro : 0.935664046257304"     "Astro-Neuron : 0.0271849565394441"  
 [4] "Astro-Oligo : 0.00147694402781699"   "Astro-OPC : 0.0476580762532988"      "Endo-Micro : 0.297672151508384"     
 [7] "Endo-Neuron : 2.38134038927696e-06"  "Endo-Oligo : 0.0323129112432441"     "Endo-OPC : 0.451258974150342"       
[10] "Micro-Neuron : 0.000143621746738224" "Micro-Oligo : 0.0178171887595787"    "Micro-OPC : 0.0692129715131915"     
[13] "Neuron-Oligo : 6.68255453156116e-10" "Neuron-OPC : 6.201108273594e-07"     "Oligo-OPC : 0.142213241936393"

Ideally I would like the best of both methods and a more efficient way to compute these tests. If there is a suggestion for designing a different data frame altogether to tackle this task in one go, I would appreciate that too.

EDIT: I realized I did not make this clear, but the result shown covers all combinations for only one feature. I have a for loop so that it goes through all the features. In essence, there should be a p-value calculated for every feature and every combination.

Solution

I would use pairwiseWilcox from scran for that; it seems ideally suited to your problem. It performs pairwise Wilcoxon rank sum tests for each row between groups of columns, where groups is a vector of column assignments.
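As a rough illustration of what a groups vector means for a single row, here is a base R sketch. The data and variable names are made up, and this is not a description of how scran computes things internally. It also contrasts the unpaired rank sum test with the paired signed rank test that the question's loop requests via paired = TRUE; that is a different test, which pairs samples by position.

```r
set.seed(42)

# One hypothetical feature (row) measured across 12 samples (columns)
x <- c(rnorm(6, mean = 0), rnorm(6, mean = 3))

# Group label for each column, in column order
groups <- rep(c("Astro", "Neuron"), each = 6)

# Column indices belonging to each group
idx <- split(seq_along(groups), groups)

# Unpaired Wilcoxon rank sum test (the default, paired = FALSE)
p_unpaired <- wilcox.test(x[idx$Astro], x[idx$Neuron])$p.value

# Paired signed rank test, as in the question's loop; requires equal group sizes
p_paired <- wilcox.test(x[idx$Astro], x[idx$Neuron], paired = TRUE)$p.value

c(unpaired = p_unpaired, paired = p_paired)
```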

Edit:

Sampled groups to have equal number of elements (columns), since the OP seems to want that.
Made the matrix less sparse to make it clearer that it does not compare individual values, but groups of values for each row.

Example:

library(Matrix)
types <- c("Astro", "Neuron", "Endo", "Oligo", "OPC", "Micro")

# generate sparse matrix
set.seed(123)
mat <- Matrix(0, nrow = 10000, ncol = 1000, sparse = TRUE)
mat[sample(seq_along(mat), 1E5)] <- runif(n = 1e5, min = 0, max=100)
groups <- c(rep(types, each = floor(ncol(mat)/6)), rep("Micro", ncol(mat) %% 6))
colnames(mat) <- make.unique(groups)

# sample n=100 samples of each group
idx <- setNames(lapply(types, function(x) grep(x, colnames(mat))), types)
smp <- Map(sample, idx, size = 100)
groups <- gsub("[0-9]+", "", names(unlist(smp)))

# subset mat to sampled columns
mat <- mat[, unlist(smp, use.names = FALSE)]

library(scran)

pwt <- pairwiseWilcox(mat, groups = groups)
pwt
#> $statistics
#> $statistics[[1]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49995  1.000000  1.000000
#> 2       0.51000  0.158341  0.616668
#> 3       0.49000  0.158341  0.616668
#> 4       0.50490  0.573540  0.856541
#> 5       0.48985  0.308851  0.616668
#> ...         ...       ...       ...
#> 9996     0.4950  0.565662  0.856541
#> 9997     0.5050  0.322174  0.616668
#> 9998     0.4951  0.573540  0.856541
#> 9999     0.4950  0.322174  0.616668
#> 10000    0.5050  0.322174  0.616668
#> 
#> $statistics[[2]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5050 0.3221741  0.613045
#> 2        0.5049 0.5735395  0.858464
#> 3        0.4800 0.0444225  0.613045
#> 4        0.4947 0.6352736  0.948311
#> 5        0.4949 0.5578376  0.858464
#> ...         ...       ...       ...
#> 9996    0.49500  0.565662  0.858464
#> 9997    0.50005  1.000000  1.000000
#> 9998    0.50500  0.322174  0.613045
#> 9999    0.50000  1.000000  1.000000
#> 10000   0.50500  0.322174  0.613045
#> 
#> $statistics[[3]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5050  0.322174  0.605697
#> 2        0.5001  0.995980  1.000000
#> 3        0.5000  1.000000  1.000000
#> 4        0.5100  0.158341  0.605697
#> 5        0.4949  0.557838  0.854499
#> ...         ...       ...       ...
#> 9996    0.50005  1.000000  1.000000
#> 9997    0.49995  1.000000  1.000000
#> 9998    0.50005  1.000000  1.000000
#> 9999    0.49500  0.322174  0.605697
#> 10000   0.49995  1.000000  1.000000
#> 
#> $statistics[[4]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49995  1.000000  1.000000
#> 2       0.50010  0.995980  1.000000
#> 3       0.50000  1.000000  1.000000
#> 4       0.49490  0.648212  0.959177
#> 5       0.50500  0.322174  0.615026
#> ...         ...       ...       ...
#> 9996     0.4949  0.557838  0.859750
#> 9997     0.4951  0.573540  0.859750
#> 9998     0.4852  0.182661  0.615026
#> 9999     0.5000  1.000000  1.000000
#> 10000    0.4949  0.557838  0.859750
#> 
#> $statistics[[5]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50500  0.322174  0.620334
#> 2       0.49480  0.641729  0.964426
#> 3       0.50000  1.000000  1.000000
#> 4       0.51000  0.158341  0.620334
#> 5       0.49995  1.000000  1.000000
#> ...         ...       ...       ...
#> 9996    0.50005  1.000000  1.000000
#> 9997    0.49015  0.323442  0.620334
#> 9998    0.50005  1.000000  1.000000
#> 9999    0.49500  0.322174  0.620334
#> 10000   0.50500  0.322174  0.620334
#> 
#> $statistics[[6]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50005  1.000000  1.000000
#> 2       0.49000  0.158341  0.616668
#> 3       0.51000  0.158341  0.616668
#> 4       0.49510  0.573540  0.856541
#> 5       0.51015  0.308851  0.616668
#> ...         ...       ...       ...
#> 9996     0.5050  0.565662  0.856541
#> 9997     0.4950  0.322174  0.616668
#> 9998     0.5049  0.573540  0.856541
#> 9999     0.5050  0.322174  0.616668
#> 10000    0.4950  0.322174  0.616668
#> 
#> $statistics[[7]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50500  0.322174  0.616668
#> 2       0.49500  0.322174  0.616668
#> 3       0.48960  0.392127  0.746909
#> 4       0.49005  0.318530  0.616668
#> 5       0.50500  0.654721  0.960283
#> ...         ...       ...       ...
#> 9996     0.5001  0.995980  1.000000
#> 9997     0.4950  0.322174  0.616668
#> 9998     0.5100  0.158341  0.616668
#> 9999     0.5050  0.322174  0.616668
#> 10000    0.5000  1.000000  1.000000
#> 
#> $statistics[[8]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1         0.505  0.322174  0.604226
#> 2         0.490  0.158341  0.604226
#> 3         0.510  0.158341  0.604226
#> 4         0.505  0.322174  0.604226
#> 5         0.505  0.654721  0.952044
#> ...         ...       ...       ...
#> 9996    0.50510  0.557838  0.849437
#> 9997    0.49500  0.322174  0.604226
#> 9998    0.50500  0.565662  0.849437
#> 9999    0.49995  1.000000  1.000000
#> 10000   0.49500  0.322174  0.604226
#> 
#> $statistics[[9]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49995  1.000000  1.000000
#> 2       0.49000  0.158341  0.611076
#> 3       0.51000  0.158341  0.611076
#> 4       0.49005  0.318530  0.611076
#> 5       0.51500  0.082748  0.611076
#> ...         ...       ...       ...
#> 9996     0.5000  1.000000  1.000000
#> 9997     0.4900  0.158341  0.611076
#> 9998     0.4899  0.405995  0.762863
#> 9999     0.5050  0.322174  0.611076
#> 10000    0.4900  0.158341  0.611076
#> 
#> $statistics[[10]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50500  0.322174  0.619147
#> 2       0.48500  0.082748  0.619147
#> 3       0.51000  0.158341  0.619147
#> 4       0.50500  0.322174  0.619147
#> 5       0.50985  0.323442  0.619147
#> ...         ...       ...       ...
#> 9996    0.50500  0.565662  0.863244
#> 9997    0.48500  0.082748  0.619147
#> 9998    0.50510  0.557838  0.863244
#> 9999    0.50005  1.000000  1.000000
#> 10000   0.50000  1.000000  1.000000
#> 
#> $statistics[[11]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.4950 0.3221741  0.613045
#> 2        0.4951 0.5735395  0.858464
#> 3        0.5200 0.0444225  0.613045
#> 4        0.5053 0.6352736  0.948311
#> 5        0.5051 0.5578376  0.858464
#> ...         ...       ...       ...
#> 9996    0.50500  0.565662  0.858464
#> 9997    0.49995  1.000000  1.000000
#> 9998    0.49500  0.322174  0.613045
#> 9999    0.50000  1.000000  1.000000
#> 10000   0.49500  0.322174  0.613045
#> 
#> $statistics[[12]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49500  0.322174  0.616668
#> 2       0.50500  0.322174  0.616668
#> 3       0.51040  0.392127  0.746909
#> 4       0.50995  0.318530  0.616668
#> 5       0.49500  0.654721  0.960283
#> ...         ...       ...       ...
#> 9996     0.4999  0.995980  1.000000
#> 9997     0.5050  0.322174  0.616668
#> 9998     0.4900  0.158341  0.616668
#> 9999     0.4950  0.322174  0.616668
#> 10000    0.5000  1.000000  1.000000
#> 
#> $statistics[[13]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5000 1.0000000  1.000000
#> 2        0.4951 0.5735395  0.868341
#> 3        0.5200 0.0444225  0.625009
#> 4        0.5150 0.0827480  0.625009
#> 5        0.5000 1.0000000  1.000000
#> ...         ...       ...       ...
#> 9996    0.50500  0.565662  0.868341
#> 9997    0.49995  1.000000  1.000000
#> 9998    0.49500  0.322174  0.625009
#> 9999    0.49500  0.322174  0.625009
#> 10000   0.49500  0.322174  0.625009
#> 
#> $statistics[[14]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49500 0.3221741  0.606038
#> 2       0.49510 0.5735395  0.852213
#> 3       0.52000 0.0444225  0.606038
#> 4       0.50005 1.0000000  1.000000
#> 5       0.51000 0.1583409  0.606038
#> ...         ...       ...       ...
#> 9996     0.4998 0.9879417  1.000000
#> 9997     0.4951 0.5735395  0.852213
#> 9998     0.4800 0.0444225  0.606038
#> 9999     0.5000 1.0000000  1.000000
#> 10000    0.4900 0.1583409  0.606038
#> 
#> $statistics[[15]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50000 1.0000000  1.000000
#> 2       0.49005 0.3185296  0.619978
#> 3       0.52000 0.0444225  0.619978
#> 4       0.51500 0.0827480  0.619978
#> 5       0.50500 0.5656624  0.863114
#> ...         ...       ...       ...
#> 9996    0.50500  0.565662  0.863114
#> 9997    0.49015  0.323442  0.619978
#> 9998    0.49500  0.322174  0.619978
#> 9999    0.49500  0.322174  0.619978
#> 10000   0.50000  1.000000  1.000000
#> 
#> $statistics[[16]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.4950  0.322174  0.605697
#> 2        0.4999  0.995980  1.000000
#> 3        0.5000  1.000000  1.000000
#> 4        0.4900  0.158341  0.605697
#> 5        0.5051  0.557838  0.854499
#> ...         ...       ...       ...
#> 9996    0.49995  1.000000  1.000000
#> 9997    0.50005  1.000000  1.000000
#> 9998    0.49995  1.000000  1.000000
#> 9999    0.50500  0.322174  0.605697
#> 10000   0.50005  1.000000  1.000000
#> 
#> $statistics[[17]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1         0.495  0.322174  0.604226
#> 2         0.510  0.158341  0.604226
#> 3         0.490  0.158341  0.604226
#> 4         0.495  0.322174  0.604226
#> 5         0.495  0.654721  0.952044
#> ...         ...       ...       ...
#> 9996    0.49490  0.557838  0.849437
#> 9997    0.50500  0.322174  0.604226
#> 9998    0.49500  0.565662  0.849437
#> 9999    0.50005  1.000000  1.000000
#> 10000   0.50500  0.322174  0.604226
#> 
#> $statistics[[18]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5000 1.0000000  1.000000
#> 2        0.5049 0.5735395  0.868341
#> 3        0.4800 0.0444225  0.625009
#> 4        0.4850 0.0827480  0.625009
#> 5        0.5000 1.0000000  1.000000
#> ...         ...       ...       ...
#> 9996    0.49500  0.565662  0.868341
#> 9997    0.50005  1.000000  1.000000
#> 9998    0.50500  0.322174  0.625009
#> 9999    0.50500  0.322174  0.625009
#> 10000   0.50500  0.322174  0.625009
#> 
#> $statistics[[19]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.4950  0.322174  0.619978
#> 2        0.4999  0.995980  1.000000
#> 3        0.5000  1.000000  1.000000
#> 4        0.4850  0.082748  0.619978
#> 5        0.5100  0.158341  0.619978
#> ...         ...       ...       ...
#> 9996     0.4949  0.557838  0.863504
#> 9997     0.4951  0.573540  0.863504
#> 9998     0.4850  0.176800  0.619978
#> 9999     0.5050  0.322174  0.619978
#> 10000    0.4949  0.557838  0.863504
#> 
#> $statistics[[20]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5000  1.000000  1.000000
#> 2        0.4947  0.635274  0.958759
#> 3        0.5000  1.000000  1.000000
#> 4        0.5000  1.000000  1.000000
#> 5        0.5050  0.565662  0.869131
#> ...         ...       ...       ...
#> 9996    0.49995  1.000000  1.000000
#> 9997    0.49015  0.323442  0.625372
#> 9998    0.50005  1.000000  1.000000
#> 9999    0.50005  1.000000  1.000000
#> 10000   0.50500  0.322174  0.625372
#> 
#> $statistics[[21]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50005  1.000000  1.000000
#> 2       0.49990  0.995980  1.000000
#> 3       0.50000  1.000000  1.000000
#> 4       0.50510  0.648212  0.959177
#> 5       0.49500  0.322174  0.615026
#> ...         ...       ...       ...
#> 9996     0.5051  0.557838  0.859750
#> 9997     0.5049  0.573540  0.859750
#> 9998     0.5148  0.182661  0.615026
#> 9999     0.5000  1.000000  1.000000
#> 10000    0.5051  0.557838  0.859750
#> 
#> $statistics[[22]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50005  1.000000  1.000000
#> 2       0.51000  0.158341  0.611076
#> 3       0.49000  0.158341  0.611076
#> 4       0.50995  0.318530  0.611076
#> 5       0.48500  0.082748  0.611076
#> ...         ...       ...       ...
#> 9996     0.5000  1.000000  1.000000
#> 9997     0.5100  0.158341  0.611076
#> 9998     0.5101  0.405995  0.762863
#> 9999     0.4950  0.322174  0.611076
#> 10000    0.5100  0.158341  0.611076
#> 
#> $statistics[[23]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50500 0.3221741  0.606038
#> 2       0.50490 0.5735395  0.852213
#> 3       0.48000 0.0444225  0.606038
#> 4       0.49995 1.0000000  1.000000
#> 5       0.49000 0.1583409  0.606038
#> ...         ...       ...       ...
#> 9996     0.5002 0.9879417  1.000000
#> 9997     0.5049 0.5735395  0.852213
#> 9998     0.5200 0.0444225  0.606038
#> 9999     0.5000 1.0000000  1.000000
#> 10000    0.5100 0.1583409  0.606038
#> 
#> $statistics[[24]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5050  0.322174  0.619978
#> 2        0.5001  0.995980  1.000000
#> 3        0.5000  1.000000  1.000000
#> 4        0.5150  0.082748  0.619978
#> 5        0.4900  0.158341  0.619978
#> ...         ...       ...       ...
#> 9996     0.5051  0.557838  0.863504
#> 9997     0.5049  0.573540  0.863504
#> 9998     0.5150  0.176800  0.619978
#> 9999     0.4950  0.322174  0.619978
#> 10000    0.5051  0.557838  0.863504
#> 
#> $statistics[[25]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5050  0.322174  0.618555
#> 2        0.4947  0.635274  0.954724
#> 3        0.5000  1.000000  1.000000
#> 4        0.5150  0.082748  0.618555
#> 5        0.4950  0.322174  0.618555
#> ...         ...       ...       ...
#> 9996     0.5051  0.557838  0.864937
#> 9997     0.4948  0.641729  0.961248
#> 9998     0.5152  0.171079  0.618555
#> 9999     0.4950  0.322174  0.618555
#> 10000    0.5100  0.158341  0.618555
#> 
#> $statistics[[26]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49500  0.322174  0.620334
#> 2       0.50520  0.641729  0.964426
#> 3       0.50000  1.000000  1.000000
#> 4       0.49000  0.158341  0.620334
#> 5       0.50005  1.000000  1.000000
#> ...         ...       ...       ...
#> 9996    0.49995  1.000000  1.000000
#> 9997    0.50985  0.323442  0.620334
#> 9998    0.49995  1.000000  1.000000
#> 9999    0.50500  0.322174  0.620334
#> 10000   0.49500  0.322174  0.620334
#> 
#> $statistics[[27]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.49500  0.322174  0.619147
#> 2       0.51500  0.082748  0.619147
#> 3       0.49000  0.158341  0.619147
#> 4       0.49500  0.322174  0.619147
#> 5       0.49015  0.323442  0.619147
#> ...         ...       ...       ...
#> 9996    0.49500  0.565662  0.863244
#> 9997    0.51500  0.082748  0.619147
#> 9998    0.49490  0.557838  0.863244
#> 9999    0.49995  1.000000  1.000000
#> 10000   0.50000  1.000000  1.000000
#> 
#> $statistics[[28]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1       0.50000 1.0000000  1.000000
#> 2       0.50995 0.3185296  0.619978
#> 3       0.48000 0.0444225  0.619978
#> 4       0.48500 0.0827480  0.619978
#> 5       0.49500 0.5656624  0.863114
#> ...         ...       ...       ...
#> 9996    0.49500  0.565662  0.863114
#> 9997    0.50985  0.323442  0.619978
#> 9998    0.50500  0.322174  0.619978
#> 9999    0.50500  0.322174  0.619978
#> 10000   0.50000  1.000000  1.000000
#> 
#> $statistics[[29]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.5000  1.000000  1.000000
#> 2        0.5053  0.635274  0.958759
#> 3        0.5000  1.000000  1.000000
#> 4        0.5000  1.000000  1.000000
#> 5        0.4950  0.565662  0.869131
#> ...         ...       ...       ...
#> 9996    0.50005  1.000000  1.000000
#> 9997    0.50985  0.323442  0.625372
#> 9998    0.49995  1.000000  1.000000
#> 9999    0.49995  1.000000  1.000000
#> 10000   0.49500  0.322174  0.625372
#> 
#> $statistics[[30]]
#> DataFrame with 10000 rows and 3 columns
#>             AUC   p.value       FDR
#>       <numeric> <numeric> <numeric>
#> 1        0.4950  0.322174  0.618555
#> 2        0.5053  0.635274  0.954724
#> 3        0.5000  1.000000  1.000000
#> 4        0.4850  0.082748  0.618555
#> 5        0.5050  0.322174  0.618555
#> ...         ...       ...       ...
#> 9996     0.4949  0.557838  0.864937
#> 9997     0.5052  0.641729  0.961248
#> 9998     0.4848  0.171079  0.618555
#> 9999     0.5050  0.322174  0.618555
#> 10000    0.4900  0.158341  0.618555
#> 
#> 
#> $pairs
#> DataFrame with 30 rows and 2 columns
#>           first      second
#>     <character> <character>
#> 1         Astro        Endo
#> 2         Astro       Micro
#> 3         Astro      Neuron
#> 4         Astro       Oligo
#> 5         Astro         OPC
#> ...         ...         ...
#> 26          OPC       Astro
#> 27          OPC        Endo
#> 28          OPC       Micro
#> 29          OPC      Neuron
#> 30          OPC       Oligo

Created on 2020-06-18 by the reprex package (v0.3.0)

Edit #2:

To see what my approach gives you in about 1/20,000th of the time (on my machine, at least) of StupidWolf's approach, try it with his example mat and group:

set.seed(111)
celltypes = c("astro","endo","micro","neuron","oligo","opc")
mat = matrix(rnorm(10000*120),ncol=120)
colnames(mat) = paste0("cell",1:120)
rownames(mat) = paste0("gene",1:10000)
metadata = data.frame(celltype=rep(celltypes,each=20))
num_sample = 10
use_cols = tapply(1:nrow(metadata),metadata$celltype,sample,num_sample)
use_cols = unlist(use_cols)
group = metadata$celltype[use_cols]

library(scran)
library(data.table)
pwt <- pairwiseWilcox(mat[,use_cols], groups=group)
unique_comps <- !duplicated(t(apply(pwt$pairs, 1, sort)))
res <- rbindlist(setNames(lapply(pwt$statistics[unique_comps], 
                                 function(x) as.data.table(x, keep.rownames=TRUE)), 
                          apply(pwt$pairs[unique_comps,], 1, paste, collapse = '_')),
                 idcol = "comparison")[, .(comparison, rn, p.value)]

setnames(res, "rn", "gene")
res[gene=="gene999"]
#>       comparison    gene   p.value
#>  1:   astro_endo gene999 0.1858767
#>  2:  astro_micro gene999 0.5707504
#>  3: astro_neuron gene999 0.3846731
#>  4:  astro_oligo gene999 0.4273553
#>  5:    astro_opc gene999 0.3846731
#>  6:   endo_micro gene999 0.4726756
#>  7:  endo_neuron gene999 0.9097219
#>  8:   endo_oligo gene999 0.6231762
#>  9:     endo_opc gene999 0.6775850
#> 10: micro_neuron gene999 0.9097219
#> 11:  micro_oligo gene999 0.9097219
#> 12:    micro_opc gene999 0.9097219
#> 13: neuron_oligo gene999 0.8501067
#> 14:   neuron_opc gene999 0.6775850
#> 15:    oligo_opc gene999 0.9698500

Created on 2020-06-19 by the reprex package (v0.3.0)
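pairwiseWilcox reports each comparison in both directions (30 rows in pairs for 6 groups), which is why the code above filters with unique_comps. The de-duplication step can be seen in isolation in the following sketch, where pairs_df is a toy stand-in for pwt$pairs built from three hypothetical groups:

```r
# Toy stand-in for pwt$pairs: every ordered pair of three hypothetical groups
grp <- c("astro", "endo", "micro")
pairs_df <- subset(expand.grid(first = grp, second = grp, stringsAsFactors = FALSE),
                   first != second)

# Sort each row so that A-B and B-A become identical, then flag the repeats
sorted_pairs <- t(apply(pairs_df, 1, sort))
unique_comps <- !duplicated(sorted_pairs)

# Keep only the first occurrence of each unordered pair
pairs_df[unique_comps, ]
sum(unique_comps)
```

Keeping only one direction loses nothing for the p-values, since the two directions of a two-sided test on the same pair give the same p-value, as the output above shows.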