Determining order of item in randomized set from Qualtrics using a for loop in R


In one Qualtrics survey of mine, each participant received a set of questions presented in a random order.

I now want to determine what position a question (see "Question" variable in table) had in the participant's randomized order of questions. Questions in the shortened example are numbered I1, I2, or I3.

The data are organized right now such that there are columns that correspond with order (in the shortened example below, "B1", "B2", and "B3"). That is, the question in column B1 appeared first for that participant.

Here is a file of the data (https://drive.google.com/open?id=1h18SlQ-gmRUZh93M5Y5T3TuE22yxSJbU), and here's what it looks like printed out in R:

> head(testd)
  Question B1 B2 B3
1       I1 I1 I2 I3
2       I1 I3 I2 I1
3       I2 I2 I3 I1
4       I2 I3 I1 I2
5       I3 I2 I1 I3
6       I3 I1 I3 I2

I now want to write a for loop to make a new variable "RandomizedOrder" in the dataframe testd that will tell me whether a question in the column "Question" (e.g., I1) for a participant was presented first (B1), second (B2), or third (B3). For example, in the example above, RandomizedOrder for row 1 should come out to be B1 because the value in column "Question" is I1, and the value in column "B1" is I1.

To do this, I first concatenated the values "B1", "B2", and "B3" together in "BSet".

testd <- read.csv("TestData.csv")
BSet <- c("B1", "B2", "B3")
testd[BSet]

I then wrote the following for loop. My goal: for each row i, find the BSet column whose value equals the value in the Question column, and set RandomizedOrder for that row to the name of that column.

For example, if testd$B1 = I1 in row 1, and testd$Question = I1 in row 1, then this for loop should make testd$RandomizedOrder equal to B1.

for (i in nrow(testd)) {
  for (j in 1:3) {
    if (testd[i,BSet][[j]] == testd$Question[i]) {
      testd$RandomizedOrder[i] <- colnames(testd[i,BSet][j])
    }
  }
}

This is what the R output looks like.

> head(testd$RandomizedOrder)
[1] NA   NA   NA   NA   NA   "B2"

I'm not sure why it produces NA values for everything except for the 6th item.

Here's what I wanted the for loop to do: Make a new variable named "RandomizedOrder" that indicated, for each row, which column contained the value found in the "Question" column.

      Question B1 B2 B3 RandomizedOrder
    1       I1 I1 I2 I3 B1
    2       I1 I3 I2 I1 B3
    3       I2 I2 I3 I1 B1
    4       I2 I3 I1 I2 B3
    5       I3 I2 I1 I3 B3
    6       I3 I1 I3 I2 B2

I looked through the code to make sure the individual parts would work out.

The comparison below evaluates to TRUE (both sides of the equality sign produce the value I1):

> testd[1,BSet][[1]] == testd$Question[1]
[1] TRUE

I can also manually tell R to replace a value in testd$RandomizedOrder with a column name.

> testd$RandomizedOrder[1] <- colnames(testd[1,BSet][1])
> head(testd$RandomizedOrder)
[1] "B1" NA   NA   NA   NA   "B2"

Could someone please help me determine why the for loop isn't working?

Thank you in advance.

(Please note that this might seem like it could be done easily manually for this dataset with 6 observations, but this is a simplified example of my real dataset. My actual dataset has 48 questions (i.e., I1 through I48), and hundreds of observations. I've therefore indexed the number of columns represented by BSet using the letter j.)


Solution

  • Consider using lapply over the order columns to flag, for each row, whether that column matches Question (returning the column name on a match and NA otherwise), then using Reduce to coalesce the resulting vectors into a single RandomizedOrder column.

    txt = "Question B1 B2 B3
    1       I1 I1 I2 I3
    2       I1 I3 I2 I1
    3       I2 I2 I3 I1
    4       I2 I3 I1 I2
    5       I3 I2 I1 I3
    6       I3 I1 I3 I2"
    
    testd <- read.table(text=txt, header=TRUE)
    
    colList <-  lapply(names(testd)[-1], function(i)
      ifelse(testd$Question == testd[[i]], i, NA))
    
    testd$RandomizedOrder <- Reduce(function(x, y) {
      x[which(is.na(x))] <- y[which(is.na(x))]
      x}, colList)
    
    testd    
    #   Question B1 B2 B3 RandomizedOrder
    # 1       I1 I1 I2 I3              B1
    # 2       I1 I3 I2 I1              B3
    # 3       I2 I2 I3 I1              B1
    # 4       I2 I3 I1 I2              B3
    # 5       I3 I2 I1 I3              B3
    # 6       I3 I1 I3 I2              B2
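
  • As for why the original loop fills only the 6th row: for (i in nrow(testd)) iterates over the single value nrow(testd), which is 6 here, rather than over rows 1 through 6. Below is a minimal sketch of a corrected loop, assuming the same six-row example data and the BSet vector from the question. The use of seq_len() and seq_along() in place of the hard-coded ranges, the BSet[j] shortcut for the column name, and the up-front NA_character_ initialization are my own substitutions, not part of the original code.

```r
# Rebuild the example data from the question
testd <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
Question B1 B2 B3
I1 I1 I2 I3
I1 I3 I2 I1
I2 I2 I3 I1
I2 I3 I1 I2
I3 I2 I1 I3
I3 I1 I3 I2")
BSet <- c("B1", "B2", "B3")

# Create the column up front so rows without a match stay NA
testd$RandomizedOrder <- NA_character_

# seq_len(nrow(testd)) yields 1, 2, ..., 6; nrow(testd) alone yields only 6
for (i in seq_len(nrow(testd))) {
  for (j in seq_along(BSet)) {
    # Compare the value in order column j with this row's Question
    if (testd[i, BSet[j]] == testd$Question[i]) {
      testd$RandomizedOrder[i] <- BSet[j]
    }
  }
}

testd$RandomizedOrder
# [1] "B1" "B3" "B1" "B3" "B3" "B2"
```

    Because the inner loop runs over seq_along(BSet), the same code should carry over to the full dataset once BSet lists all 48 order columns, assuming each question appears exactly once per row.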