
How can I normalize a sparse matrix in R by both rows and columns?


I have created a sparse matrix using the R package "Matrix". The matrix is not square, and its dimensions are 4561 by 68825.

I'd like to normalize this matrix so that each value x becomes x / row sum + column sum. I found a solution to a similar problem on Stack Overflow that I could adapt, but the matrix in that question is square, so Diagonal() can be used. My matrix is not square, so I can't make that solution work.

How can I normalize a sparse matrix in R by both rows and columns?


Solution

  • Hope this helps!

    # divide each row by its row sum, then add each column's sum to that column
    m_final <- t(t(m / rowSums(m)) + colSums(m))
    m_final
    

    Output is:

               [,1]     [,2]       [,3]
     [1,] 0.9748283 3.326324 -0.8274075
     [2,] 1.4574957 2.776025 -0.7597753
     [3,] 1.9265464 2.937874 -1.3906749
     [4,] 0.7105211 3.337394 -0.5741696
     [5,] 1.4808831 3.030777 -1.0379153
     [6,] 2.2123599 2.537209 -1.2758243
     [7,] 2.8672471 2.437124 -1.8306263
     [8,] 4.8144351 6.952963 -8.2936531
     [9,] 1.9486587 3.382196 -1.8571098
    [10,] 0.8897446 3.329129 -0.7451281
    


    #sample data:
    set.seed(1)
    m <- replicate(3,rnorm(10))
    > m
                [,1]        [,2]        [,3]
     [1,] -0.6264538  1.51178117  0.91897737
     [2,]  0.1836433  0.38984324  0.78213630
     [3,] -0.8356286 -0.62124058  0.07456498
     [4,]  1.5952808 -2.21469989 -1.98935170
     [5,]  0.3295078  1.12493092  0.61982575
     [6,] -0.8204684 -0.04493361 -0.05612874
     [7,]  0.4874291 -0.01619026 -0.15579551
     [8,]  0.7383247  0.94383621 -1.47075238
     [9,]  0.5757814  0.82122120 -0.47815006
    [10,] -0.3053884  0.59390132  0.41794156
    
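The m / rowSums(m) step above can also be done without leaving sparse storage. This is a minimal sketch, assuming the Matrix package is loaded; the small matrix and its values are made up for illustration. It relies on Diagonal(x = ...) building a diagonal matrix whose size is taken from the length of x, so the input does not need to be square: an nrow x nrow diagonal on the left scales rows, and an ncol x ncol diagonal on the right scales columns.

```r
library(Matrix)

# Small non-square sparse matrix as a stand-in (illustrative values only)
m <- sparseMatrix(i = c(1, 1, 2, 3, 3),
                  j = c(1, 3, 2, 1, 4),
                  x = c(2, 4, 3, 1, 5),
                  dims = c(3, 4))

# Left-multiplying by an nrow x nrow diagonal matrix divides each row by its sum
row_scaled <- Diagonal(x = 1 / rowSums(m)) %*% m

# Right-multiplying by an ncol x ncol diagonal matrix divides each column by its sum
col_scaled <- m %*% Diagonal(x = 1 / colSums(m))
```

Rows or columns that sum to zero would produce Inf in the scaling vector, so they need separate handling on real data.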

    Edit:
    If you want the calculation below instead, you can try:

    m/(row_sum + col_sum)

    m/outer(rowSums(m), colSums(m), FUN = "+")
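For a large sparse matrix, outer() allocates a full dense nrow x ncol matrix (roughly 314 million entries for 4561 by 68825), which may not fit comfortably in memory. Below is a hedged sketch of one way around that, assuming the Matrix package and assuming that zero entries should stay zero: only the stored non-zero entries are divided, using the (i, j, x) triplets returned by summary(). The example matrix is hypothetical.

```r
library(Matrix)

# Small non-square sparse matrix as a stand-in (illustrative values only)
m <- sparseMatrix(i = c(1, 1, 2, 3, 3),
                  j = c(1, 3, 2, 1, 4),
                  x = c(2, 4, 3, 1, 5),
                  dims = c(3, 4))

rs <- rowSums(m)  # one sum per row
cs <- colSums(m)  # one sum per column

# summary() lists the non-zero entries as row index i, column index j, value x
trip <- summary(m)

# Divide each non-zero value by (its row sum + its column sum) and rebuild
m_norm <- sparseMatrix(i = trip$i,
                       j = trip$j,
                       x = trip$x / (rs[trip$i] + cs[trip$j]),
                       dims = dim(m))
```

The result has the same sparsity pattern as m, so memory use scales with the number of non-zero entries rather than with nrow x ncol.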