Tags: r, for-loop, vector, series

How to find a series of numbers in a vector using a for loop in R


I am trying to find a way to check whether a series of numbers is present within a vector.

I am studying correlations within a data frame, and I am using a 'for' loop to plot all pairs of variables. However, I want to avoid plotting each pair twice. My idea was to build a sequence from each pair of column indexes and store these sequences in a vector, so that before each plot I can check whether the sequence is already in the vector and, if so, have the loop skip it.

An example of what I would like to have:

Let's say I have the following variables in the dataframe 'DATA':

Var1 Var2 Var3 Var4
0 1 0 4
2 3 2 5
5 4 5 0
2 5 1 1

I want to plot each variable against every other variable, but with a nested loop I end up with each pair twice, as in the following example:


for (i in 1:4){
  for (j in 1:4){
    if (j == i){
      next
    }
    plot(x = DATA[,i],
         y = DATA[,j],
         xlab = colnames(DATA)[i],
         ylab = colnames(DATA)[j])
  }
}

This example gives me each pair of variables twice: once with Var1 as x and Var2 as y, and once with Var1 as y and Var2 as x, and so on for every pair.

I want to avoid this because my original data frame has several dozen variables. So I want to build a series of numbers from both indexes and store it in a vector that is searched at the start of each iteration; if the series is found, the loop skips to the next iteration.

I tried the following, which did not work:

vector_test <- c(0)

for (i in 1:4){
  for (j in 1:4){
    test1 <- c(0,i,j,0)
    test2 <- c(0,j,i,0) #to have both orders possible
    if (j == i){
      next
    }
    if (test1 %in% vector_test){
      next
    }
    if (test2 %in% vector_test){
      next
    }
    vector_test <- c(vector_test, test1, test2) #adding to the test vector to check in the next iteration
    plot(x = Data_total_VF[,i],
         y = Data_total_VF[,j],
         xlab = colnames(Data_total_VF)[i],
         ylab = colnames(Data_total_VF)[j])
  }
}

I added the 0 at the beginning and end of each "test" vector to avoid a false match caused by two numbers that happen to sit next to each other in the vector.

I also tried:

if ((test1 %in% vector_test) == TRUE) {
  next
}

The errors I get are, respectively:

Error in if (test1 %in% vector_test) { : the condition has length > 1

Error in if ((test1 %in% vector_test) == TRUE) { : 
  the condition has length > 1

I have been unable to find another operator, or another example on this website, that does this.

Does anyone have an idea?

Thank you very much.


Solution

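  • Using combn, which generates each unordered pair of column indices exactly once:

    # List of index pairs: (1,2), (1,3), (1,4), (2,3), (2,4), (3,4)
    combns <- combn(1:4, 2, simplify = FALSE)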
    
    for (ij in combns){
      i <- ij[[1]]
      j <- ij[[2]]
      plot(x = DATA[,i],
           y = DATA[,j],
           xlab = colnames(DATA)[i],
           ylab = colnames(DATA)[j])
    }
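
A side note on the error in the question: `%in%` returns one logical value per element of its left-hand side, so `test1 %in% vector_test` has length 4, while `if` expects a single TRUE or FALSE. A bookkeeping vector is probably not needed at all, since starting the inner loop at `i + 1` also visits each unordered pair only once. The following is a minimal sketch of that alternative, not a definitive implementation. It assumes the example data frame `DATA` from the question, rebuilt here so the snippet runs on its own. `pairs_seen` is a hypothetical name, used only to record which pairs were plotted.

```r
# Example data from the question, rebuilt so the snippet is self-contained
DATA <- data.frame(Var1 = c(0, 2, 5, 2),
                   Var2 = c(1, 3, 4, 5),
                   Var3 = c(0, 2, 5, 1),
                   Var4 = c(4, 5, 0, 1))

# Why the original `if` failed: %in% tests each element of the left-hand side
test1 <- c(0, 1, 2, 0)
length(test1 %in% c(0))   # 4, not 1

n <- ncol(DATA)             # works for any number of columns
pairs_seen <- character(0)  # hypothetical helper: records the pairs visited

# Each unordered pair (i, j) with i < j is visited exactly once
for (i in seq_len(n - 1)) {
  for (j in (i + 1):n) {
    pairs_seen <- c(pairs_seen, paste(i, j, sep = "-"))
    plot(x = DATA[, i],
         y = DATA[, j],
         xlab = colnames(DATA)[i],
         ylab = colnames(DATA)[j])
  }
}
```

Either approach produces `choose(n, 2)` plots, which is 6 for the four example columns and 435 for 30 columns. Replacing `1:4` with `ncol(DATA)` in the `combn` call generalizes that version in the same way.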