Reset the dimensions of all the matrices in a list

Data: assume that I have a list of matrices called S, each currently stored as a plain vector, which can be generated by:

S<-list(c(1:25),c(1:25),c(1:25),c(1:25))

Here is a working approach that I want to optimize:

for (i in 1:length(S))
{
  dim(S[[i]])<-c(5,5)
}

After searching online, I tried to use lapply to apply a function over the list. This is the code I tried:

mat<-lapply(S, function(x) dim(x)<-c(5,5))

which only returns:

> mat
[[1]]
[1] 5 5

[[2]]
[1] 5 5

[[3]]
[1] 5 5

[[4]]
[1] 5 5

Question: Is there a built-in function that applies a function over a list without requiring a return value, or is there an error in my code?

Thanks in advance.

Solution

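Your anonymous function returns the value of its last expression, which is the assignment and therefore c(5, 5), not the modified x. Extending your code attempt, you need to return x explicitly or implicitly: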

lapply(S, function(x) { dim(x) <- c(5, 5); return(x) })
lapply(S, function(x) { dim(x) <- c(5, 5); x })

or, more concisely, by recasting every list entry as a matrix:

lapply(S, function(x) matrix(x, 5, 5))

or using purrr::map (after library(purrr)):

map(S, ~ matrix(., 5, 5))
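
To see why the original attempt returned the dimensions rather than the matrices, and that the base R alternatives above agree, here is a minimal sketch using the 5 x 5 example from the question. The names bad, by_dim and by_matrix are only illustrative.

```r
S <- list(1:25, 1:25, 1:25, 1:25)

# An assignment evaluates to the assigned value, so this returns c(5, 5).
bad <- lapply(S, function(x) dim(x) <- c(5, 5))

# Ending the body with x returns the reshaped copy instead.
by_dim <- lapply(S, function(x) { dim(x) <- c(5, 5); x })

# matrix() fills column by column by default, matching dim<-.
by_matrix <- lapply(S, function(x) matrix(x, 5, 5))

print(bad[[1]])                      # the dimension vector, not a matrix
print(dim(by_dim[[1]]))              # dimensions of the reshaped element
print(identical(by_dim, by_matrix))  # do the two approaches agree?
print(is.null(dim(S[[1]])))          # lapply works on copies; S is unchanged
```

Note that lapply never modifies S itself, so the result has to be assigned, for example S <- lapply(...). Also, matrix(x, 5, 5, byrow = TRUE) would fill row by row and so would not match the dim<- result.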

Benchmark comparison

[Edited by @HunterJiang]

library(microbenchmark)
library(purrr)
library(ggplot2)
N<-30
M<-30
S<-list(c(1:(N*M)),c(1:(N*M)),c(1:(N*M)),c(1:(N*M)))
mb <- microbenchmark(
  for_loop = { for (i in 1:length(S)) dim(S[[i]])<-c(N,M) },
  dim_plus_return = { S1<-lapply(S, function(x) { dim(x) <- c(N,M); return(x) }) },
  cast_matrix = { S1<-lapply(S, function(x) matrix(x, N,M)) },
  purrr_map = { S1<-map(S, ~ matrix(.,N,M)) },
  set_dim_directly = { S1<-lapply(S, `dim<-`, c(N,M)) }
)
mb
ggplot(mb, aes(expr, log10(time))) + 
  geom_boxplot() + 
  labs(y = "Time in log10 nanosec", x = "Method")

When N and M are small, say N = M = 30, the timings are:

Unit: microseconds
             expr      min       lq       mean    median        uq      max neval
         for_loop 2111.950 2236.298 2537.42270 2328.4735 2484.2055 4581.549   100
  dim_plus_return   10.264   12.633   32.91945   16.1855   19.3440 1641.794   100
      cast_matrix   11.054   13.423   27.40873   16.3830   18.9490 1068.213   100
        purrr_map   70.662   77.768   99.41636   93.1640  112.9015  199.748   100
 set_dim_directly    5.527    6.909    8.47230    7.8960    9.6720   22.502   100

But when N and M become larger, say N = M = 3000, the lapply-based methods lose their advantage: the for loop has the lowest mean time, while its median is about the same as that of the dim-based lapply variants.

Unit: milliseconds
             expr       min       lq     mean   median        uq      max neval
         for_loop  2.224456 20.83191 52.76189 41.72521  69.91993 180.9775   100
  dim_plus_return 35.930768 37.57671 68.63905 39.31620  74.14185 193.8300   100
      cast_matrix 48.220338 51.16917 79.73308 52.37871  87.31804 199.2859   100
        purrr_map 49.534089 51.21635 89.11881 61.12987 101.98780 195.1374   100
 set_dim_directly 35.151124 37.71112 67.72032 39.91919  74.97617 184.4943   100

Conclusion: S1<-lapply(S, `dim<-`, c(N,M)) is the fastest option for small matrices, and a for loop can be competitive or faster when the matrices are very large.
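
The set_dim_directly variant may look unusual, so here is a small sketch of how it works on the 5 x 5 example from the question. It relies on the replacement function `dim<-` being callable like any other function; S1 and S2 are illustrative names.

```r
S <- list(1:25, 1:25, 1:25, 1:25)

# `dim<-`(x, value) returns a copy of x with its dim attribute set to value.
# lapply passes each list element as x and forwards c(5, 5) as value.
S1 <- lapply(S, `dim<-`, c(5, 5))

# The anonymous-function version, for comparison.
S2 <- lapply(S, function(x) { dim(x) <- c(5, 5); x })

print(identical(S1, S2))  # compare the two results
print(dim(S1[[1]]))       # dimensions of the first reshaped element
```

Passing `dim<-` directly avoids the extra anonymous function call per element, which is plausibly why it is the quickest variant in the small-matrix timings above.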