I have a very simple question, is using sum
or matrix multiplication faster to sum a large vector? More precisely, here an example of problem I am trying to speed up:
d <- 1000
X <- matrix(rnorm(d^2), nrow = d)
y <- rnorm(d)
## Solution 1
sum(X%*%y)
## Solution 2
rep(1, d)%*%(X%*%y)
I have tried testing the two with system.time()
, but the times jump around each other and I can't get a fix on it. The times are very similar so this question has passed from practical to just inquisitive. Perhaps they are exactly the same time (seems unlikely).
Here's the function I've written to test it:
testSum <- function(d, its){
X <- matrix(rnorm(d^2), nrow=d)
y <- rnorm(d)
store <- matrix(NA, nrow = its, ncol = 3)
store2 <- matrix(NA, nrow = its, ncol = 3)
for(i in 1:its) store[i, ] <- system.time(sum(X%*%y))[1:3]
for(i in 1:its) store2[i, ] <- system.time(rep(1, d)%*%(X%*%y))[1:3]
return(list(sumF = mean(store[, 1]),
MM = mean(store2[, 1])))
}
testSum(1000, 100)
And the output always looks something like this:
$sumF
[1] 0.01021
$MM
[1] 0.01028
Where the top is using sum and the bottom is using matrix multiplication. Any hints, suggestions are welcome! Thanks!
One simple thing to try is using a larger vector.
library(microbenchmark)
A <- rnorm(1000000)
B <- rep(1, 1000000)
system.time(sum(A))
user system elapsed
0.012 0.000 0.01
system.time(B %*% A)
user system elapsed
0.044 0.000 0.04
microbenchmark(sum(A), B%*%A)
Unit: microseconds
expr min lq mean median uq max neval
sum(A) 899.735 953.361 1021.005 1021.415 1081.348 1885.857 100
B %*% A 2846.589 2946.579 3063.001 2993.341 3132.779 4498.773 100
library(microbenchmark)
A <- rnorm(10000000)
B <- rep(1, 10000000)
system.time(sum(A))
user system elapsed
0.012 0.000 0.011
system.time(B %*% A)
user system elapsed
0.044 0.000 0.044
microbenchmark(sum(A), B%*%A)
Unit: milliseconds
expr min lq mean median uq max neval
sum(A) 8.993579 9.294605 9.975156 9.729226 10.22477 14.29411 100
B %*% A 32.716323 33.818031 35.586381 35.966695 36.86165 41.13194 100
On my machine sum is ~4 times faster.
NOTE: I precomputed the vectors instead of multiplying vector and matrix to get a vector. Also precomputed the vector of ones to make the comparison more fair.