Tags: r, list, dataframe, function, mapply

How to apply a function to every element in a dataframe in a list and return a dataframe in R?


I have a dataframe which looks like this example (just much larger):

Name <- c('Peter','Peter','Peter', 'Ben','Ben','Ben','Mary', 'Mary', 'Mary')
var1 <- c(0.4, 0.6, 0.7, 0.3, 0.9, 0.2, 0.4, 0.6, 0.7)
var2 <- c(0.5, 0.4, 0.2, 0.5, 0.4, 0.2, 0.1, 0.4, 0.2)
var3 <- c(0.2, 0.6, 0.9, 0.5, 0.5, 0.2, 0.5, 0.5, 0.2)
df <- data.frame(Name, var1, var2, var3)
df

I split my dataframe by 'Name' in order to apply a function to every group, and stored the summary() of every column of every group in 'my_list':

list_split <- split(df[, 2:4], df$Name)

# one entry per group, each wrapping a list of per-column summaries
my_list <- vector("list", 3)
for (i in seq_along(list_split)) {
  my_list[[i]] <- list(
    lapply(list_split[[i]], function(x) summary(x)))
}

After that I wrote a function so that, if the mean of the values in 'my_list' is larger than 0.9, the difference of the values in 'list_split' is taken, and otherwise the values are returned unchanged. (Please ignore that the operation does not make any sense; my original function is very different.):

l <- list()
fun <- function(x, y) {ifelse(mean(x) > 0.9, diff(y), y)}
for (j in seq_along(list_split)) {
  for (i in seq_along(my_list)) {
    u <- mapply(fun, my_list[[i]][[1]], list_split[[j]], SIMPLIFY = FALSE)
    l[[j]] <- u
  }
}

I want the function to be applied to all values of the 'var' columns in the dataframes in 'list_split'. For example, for list_split[["Ben"]] the values are:

  var1 var2 var3
4  0.3  0.5  0.5
5  0.9  0.4  0.5
6  0.2  0.2  0.2

But it is just applied to the first value of every 'var', so that the resulting list for the first element looks like this:

l[[1]]
$var1
[1] 0.3

$var2
[1] 0.5

$var3
[1] 0.5

So how can I apply the function to all values in every 'list_split' element and end up with a list that exactly preserves the structure of 'list_split', that is a list of dataframes?

Thank you!


Solution

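  • We could try a nested Map(), using if/else rather than ifelse() so that the whole column is returned instead of only its first value (the \(x) lambda shorthand needs R 4.1.0 or later):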

    Map(\(x, y) {
      x[] <- Map(\(u, v) if (mean(v) > 0.9) c(NA, diff(u)) else u, x, y)
      x
    }, list_split, lapply(my_list, \(x) do.call("c", x)))
    

    Output:

    $Ben
      var1 var2 var3
    4  0.3  0.5  0.5
    5  0.9  0.4  0.5
    6  0.2  0.2  0.2
    
    $Mary
      var1 var2 var3
    7  0.4  0.1  0.5
    8  0.6  0.4  0.5
    9  0.7  0.2  0.2
    
    $Peter
      var1 var2 var3
    1  0.4  0.5  0.2
    2  0.6  0.4  0.6
    3  0.7  0.2  0.9
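
The output above is identical to 'list_split' because no mean exceeds 0.9, so the diff branch never runs. The following is a minimal sketch of why the original ifelse() version kept only the first value, and of what the if/else version does when the condition is met. The data frame `grp`, the helper `pick` and the lowered threshold of 0.45 are hypothetical and used only for illustration; they are not part of the original question or answer.

```r
# Hypothetical stand-in for one group, e.g. list_split[["Ben"]]
grp <- data.frame(var1 = c(0.3, 0.9, 0.2),
                  var2 = c(0.5, 0.4, 0.2),
                  var3 = c(0.5, 0.5, 0.2))
y <- grp$var1

# mean(y) > 0.9 is a single logical, and ifelse() returns a result
# with the length of its test, so only the first element of y survives
res_ifelse <- ifelse(mean(y) > 0.9, diff(y), y)

# if/else returns the chosen branch in full; padding diff() with NA
# keeps the length equal to the number of rows
pick <- function(u, threshold) if (mean(u) > threshold) c(NA, diff(u)) else u
res_if <- pick(y, 0.9)

# Assigning into out[] replaces the columns but keeps the data.frame
# class; the threshold is lowered to 0.45 (illustrative only) so that
# the diff branch actually runs for var1
out <- grp
out[] <- lapply(grp, pick, threshold = 0.45)

print(res_ifelse)
print(res_if)
print(out)
```

Here `res_ifelse` has length one while `res_if` has the full three values, and `out` is still a data frame in which only 'var1' has been replaced by its NA-padded differences. The same idea is what the `x[] <- Map(...)` line in the answer relies on to preserve the structure of each element of 'list_split'.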