Using sapply within lapply only on certain variables/nested loop


I would like to ask my question in two steps so that I can better understand how the code works.

Part 1:

Suppose I have a list like this:

x <- data.frame(replicate(5,sample(0:100,10,rep=TRUE)))

y <- data.frame(replicate(5,sample(0:100,10,rep=TRUE)))

z <- list(x, y)

I would like to get the range of each column of each data frame in the list. I do not really understand how to combine lapply with sapply to get results for every column of the data frames within my list. Any ideas on how I could do this?

All I could get to run properly is the range of a single column, say the second column of the first list element:

range(z[[1]][[2]])

Part 2:

The second part of my question goes a bit further. This time I will add a column with characters to my dataframes.

a <- data.frame(replicate(5,sample(0:100,10,rep=TRUE)))
a$x6 <- letters[1:10]

b <- data.frame(replicate(5,sample(0:100,10,rep=TRUE)))
b$x6 <- letters[1:10] 

c <- list(a, b)

I would like to get the range of each column of each data frame in the list, except for column 6, which is not numeric. I do not want to delete this column; I would rather query only the numeric columns.

Any ideas on how I could do this efficiently? I presume a combination of lapply and sapply would be the best.

If you have an idea on how to do this with a nested loop, that would also be interesting to know. Maybe the second part also only works with a nested loop...


Solution

  • The first can be done with

    lapply(z, function(a) sapply(a,range))
    
    [[1]]
         X1 X2 X3 X4 X5
    [1,]  2 13 28  2  3
    [2,] 95 97 98 99 85
    
    [[2]]
         X1 X2 X3 X4  X5
    [1,]  7  4 16  5  19
    [2,] 90 90 82 84 100
    

    The second can be done with

    lapply(c, function(a) sapply(a[sapply(a,is.numeric)],range))
    
    [[1]]
         X1 X2 X3 X4 X5
    [1,]  7  5  0  3  8
    [2,] 97 81 96 93 94
    
    [[2]]
         X1 X2  X3 X4 X5
    [1,]  8  4   0  9  7
    [2,] 72 90 100 99 94
    

    The inner sapply(a, is.numeric) returns a logical vector with one entry per column of each element of c: TRUE where the column is numeric, FALSE otherwise. Subsetting a with that vector drops the character column before range is applied, so it never enters the loop.

    By the way, it is a bad idea to use c as a variable name in R, as c() is also a very commonly used base function and reusing the name makes code harder to read!
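
The question also asks how this would look as a nested loop. Below is a minimal sketch of one way to write it, not the only one. It rebuilds data of the same shape as in the question; the names make_df, dfs, keep and ranges are made up for illustration, and the fixed seed is an assumption added only to make the run reproducible.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Hypothetical helper: five numeric columns plus one character column,
# mirroring the data frames a and b from the question.
make_df <- function() {
  df <- data.frame(replicate(5, sample(0:100, 10, replace = TRUE)))
  df$x6 <- letters[1:10]
  df
}

# Named dfs rather than c, to avoid reusing the name of base::c.
dfs <- list(make_df(), make_df())

ranges <- vector("list", length(dfs))

# Outer loop: one iteration per data frame in the list.
for (i in seq_along(dfs)) {
  df <- dfs[[i]]

  # Logical mask: TRUE for numeric columns, FALSE for the character column.
  keep <- vapply(df, is.numeric, logical(1))
  num_cols <- names(df)[keep]

  # Result matrix: row 1 holds the minimum, row 2 the maximum.
  out <- matrix(NA_real_, nrow = 2, ncol = length(num_cols),
                dimnames = list(NULL, num_cols))

  # Inner loop: one iteration per numeric column.
  for (j in num_cols) {
    out[, j] <- range(df[[j]])
  }

  ranges[[i]] <- out
}

ranges
```

The result has the same shape as the lapply/sapply version: a list with one 2-row matrix per data frame. The loop is longer, but it makes the two levels of iteration (list elements, then columns) explicit.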