Automatically compute and store a lot of variables in a dataframe

I have a list called res.Sigma containing 100 covariance matrices of order 4*4, where each matrix corresponds to a fixed time. The values of the time variable are stored in a variable called X.

Now, we can compute the time-varying sample variances and store them in a dataframe as follows:

sample.var <- data.frame(X = X,
                         v11 = res.Sigma %>% lapply(function(val) val[1,1]) %>% unlist(),
                         v22 = res.Sigma %>% lapply(function(val) val[2,2]) %>% unlist(),
                         v33 = res.Sigma %>% lapply(function(val) val[3,3]) %>% unlist(),
                         v44 = res.Sigma %>% lapply(function(val) val[4,4]) %>% unlist())

Similarly, the time-varying sample correlations can be computed and stored as follows:

sample.corr <- data.frame(X = X,
                          r12 = res.Sigma %>% lapply(function(val) 
                                      val[1,2]/sqrt(val[1,1]*val[2,2])) %>% unlist(),
                          r13 = res.Sigma %>% lapply(function(val) 
                                      val[1,3]/sqrt(val[1,1]*val[3,3])) %>% unlist(),
                          r14 = res.Sigma %>% lapply(function(val) 
                                      val[1,4]/sqrt(val[1,1]*val[4,4])) %>% unlist(),
                          r23 = res.Sigma %>% lapply(function(val) 
                                      val[2,3]/sqrt(val[2,2]*val[3,3])) %>% unlist(),
                          .
                          .
                          .
                          )

There will be 4C2 = 6 columns for sample correlations between each pair of variables. However, if there were 20 variables instead of 4, there would be 20C2 = 190 such columns which will be impossible to write manually as I have written above.

I'm looking for a way to do this automatically, and name the columns appropriately (r_{i,j} for the column containing correlations between the i-th and the j-th variables).

Any help will be appreciated. Let me know if my question is unclear.

Solution

Here is a tidyverse example where the columns are generated dynamically. First, we define two functions which will generate the columns and then we apply them to the data frame.

library(tidyverse)

# create the mutate functions
create_var_col <- function(df, n, mat) {
    mutate(df, "v{n}{n}" := map(mat, function(x)
        x[n, n]))
}

create_corr_col <- function(df, j, k, mat) {
    mutate(df, "r{j}{k}" := map(mat, function(x)
        x[j, k] / (x[j, j] * x[k, k])))
}

Application:

# initalize res.Sigma and X
nrow <- 4

res.Sigma <- lapply(seq_len(100), function(x)
    matrix(runif(16), nrow = nrow))

X <- 1:100

# initialize output data frames
sample.var <- data.frame(
    X = X
)

sample.corr <- data.frame(
    X = X
)

# populate the data frames with new columns
for (n in 1:nrow) {
    sample.var <- create_var_col(sample.var, n, res.Sigma)
}

combs <- combn(nrow, 2)

for (i in (seq_len(ncol(combs)))) {

    j <- combs[1, i]
    k <- combs[2, i]
    
    sample.corr <- create_corr_col(sample.corr, j, k, res.Sigma)
}

Results:

head(sample.var)

> head(sample.var)
  X       v11       v22          v33        v44
1 1 0.3770457 0.1360691 0.8865960506 0.22080969
2 2 0.7450246 0.6379828 0.0007324344 0.15418729
3 3 0.4506430 0.5700204 0.7727197134 0.76241635
4 4 0.3577171 0.7811212 0.2732931704 0.12555681
5 5 0.9532550 0.9781340 0.2275580675 0.03497771
6 6 0.0226325 0.3912557 0.4199170584 0.26492276

> head(sample.corr)
  X         r12        r13        r14         r23        r24          r34
1 1  17.6002993   1.923644  0.8184414   2.3721132 21.9337539    0.6569090
2 2   0.6498641 817.358339  4.9511748 299.1091543  0.6651190 4276.9960394
3 3   0.6825716   2.117533  0.8900186   1.9256916  1.9253819    0.6588873
4 4   0.8280786   8.324110  1.6634117   1.7292685  2.0207334    9.9122315
5 5   0.9520840   3.642721  3.5012283   2.2722650  0.2582217  116.3901019
6 6 100.8714017  99.157062 97.8675581   0.5384571  5.6053657    2.5930488