Tags: r, for-loop, apply, lapply, sapply

Nested user defined functions using apply in R


How would the following be written using apply?

# Variables
age <- 1:100
Y   <- age+5
d   <- 0.25
dx  <- 5
a_x <- 1:dx
Yd  <- matrix( 0, nrow=max(age), ncol=dx )

# Nested loop is computationally inefficient?
for (a in age){
  for (ax in a_x){
    Yd[a,ax] <- (Y[[a]] * (1 - d) ** (ax-1))
  }
}

My model has a lot of these nested for-loop structures, because I am incompetent. I am hoping to reduce the computation time using apply, but I find the apply functions rather confusing to get into. I am looking for a solution that illustrates how to reproduce such nested structures using apply. Hopefully, from there on I can apply (pun intended) the solution to even more complicated nested for loops (4-5 loops nested within each other).

For example, with a third level of nesting:

Ydi <- vector("list", 6)  # preallocate 6 slots; rep(list(), 6) returns an empty list

for (i in 1:6){
  Ydi[[i]] <- matrix( 0, nrow=max(age), ncol=dx )
}

# Nested loop is computationally inefficient?
for (i in 1:6){
  for (a in age){
    for (ax in a_x){
      Ydi[[i]][a,ax] <- (Y[[a]] * (1 - d) ** (ax-1)) + i
    }
  }
}

Solution

  • I would use expand.grid instead:

    df <- expand.grid(a = age, ax = a_x)  # expand.grid already returns a data.frame
    df[['Yd']] <- (df[['a']] + 5) * (1 - d) ** (df[['ax']] - 1)
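
    As a quick check that the long-format result reproduces the loop from the question, the column can be reshaped back into a matrix and compared. This is only a sketch: the names Yd_loop and Yd_grid are made up for the comparison, and the reshape relies on expand.grid varying its first argument fastest.

    ```r
    # Variables from the question
    age <- 1:100
    Y   <- age + 5
    d   <- 0.25
    a_x <- 1:5

    # Reference result: the nested loop from the question
    Yd_loop <- matrix(0, nrow = length(age), ncol = length(a_x))
    for (a in age) {
      for (ax in a_x) {
        Yd_loop[a, ax] <- Y[a] * (1 - d)^(ax - 1)
      }
    }

    # Long-format result from expand.grid
    df <- expand.grid(a = age, ax = a_x)
    df$Yd <- (df$a + 5) * (1 - d)^(df$ax - 1)

    # expand.grid varies its first argument fastest, which matches R's
    # column-major matrix fill, so the column can be reshaped directly
    Yd_grid <- matrix(df$Yd, nrow = length(age), ncol = length(a_x))

    # Compare the two results (hypothetical names, used only for this check)
    all.equal(Yd_loop, Yd_grid)
    ```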
    

    The expand.grid approach extends to any number of nesting levels (subject to memory constraints, since the grid holds one row per combination): each additional nested loop just becomes an additional variable in your expand.grid call. For example:

    new_col <- 1:2
    df_2 <- expand.grid(a = age, ax = a_x, nc = new_col)
    df_2[['Yd']] <- (df_2[['a']] + 5) * (1 - d) ** (df_2[['ax']] - 1) + df_2[['nc']]
    

    This essentially switches to a tidy (long) data format, with one row per combination of index values, which is often an easier way of storing multi-dimensional data.
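
    If later code still expects one matrix per value of the outermost index, as with the Ydi list in the question, the long format can probably be converted back along these lines. This is a rough sketch in base R; Yd_list is a hypothetical name, and the reshape assumes the rows are still in the order expand.grid produced them.

    ```r
    # Variables as in the question and the example above
    age     <- 1:100
    d       <- 0.25
    a_x     <- 1:5
    new_col <- 1:2

    df_2 <- expand.grid(a = age, ax = a_x, nc = new_col)
    df_2$Yd <- (df_2$a + 5) * (1 - d)^(df_2$ax - 1) + df_2$nc

    # Split the value column by the outermost variable, then reshape each piece.
    # Within each nc group the rows keep the expand.grid order (a fastest, then ax),
    # so rows of each matrix correspond to a and columns to ax.
    Yd_list <- lapply(split(df_2$Yd, df_2$nc), matrix,
                      nrow = length(age), ncol = length(a_x))

    # Spot check one cell against the formula: nc = 2, a = 10, ax = 3
    Yd_list[["2"]][10, 3]
    (10 + 5) * (1 - d)^(3 - 1) + 2
    ```

    The list elements are named after the values of nc ("1" and "2" here). A single 3-D array, via array(df_2$Yd, dim = c(length(age), length(a_x), length(new_col))), would be another option under the same ordering assumption.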

    For more concise syntax and better speed, you can use the data.table package:

    library(data.table)
    dt_3 <- data.table(expand.grid(a = age, ax = a_x, nc = new_col))
    dt_3[ , Yd := (a + 5) * (1 - d) ** (ax - 1) + nc]
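
    Since the question asks specifically about the apply family, here is a minimal sketch of how the same results could be built with sapply, outer and lapply instead of a grid. This is not the approach described above, just an alternative for comparison; the names Yd_sapply and Yd_outer are made up for the illustration.

    ```r
    # Variables from the question
    age <- 1:100
    Y   <- age + 5
    d   <- 0.25
    a_x <- 1:5

    # sapply over the inner index only; Y is handled as a whole vector,
    # so each call returns one full column of the result
    Yd_sapply <- sapply(a_x, function(ax) Y * (1 - d)^(ax - 1))

    # outer() builds the same matrix as the product of every Y value
    # with every decay factor
    Yd_outer <- outer(Y, (1 - d)^(a_x - 1))

    # Third loop level: one matrix per i, collected in a list
    Ydi <- lapply(1:6, function(i) Yd_outer + i)

    # Both two-level versions should agree
    all.equal(Yd_sapply, Yd_outer)
    ```

    In both versions the speed-up, if any, comes from letting R's vectorised arithmetic handle the age dimension in one go; sapply and lapply are themselves still loops over the remaining index.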