R - copy values from cells to empty cells within same columns


I have a dataset with answers to a Likert scale and reaction times, both of which result from an experimental manipulation. Ideally, I would like to copy each Likert_Answer value to every row of the experimental manipulation associated with that value.

The dataset looks like this:

x <- rep(c(NA, round(runif(5, min=0, max=100), 2)), times=3)

myDF <- data.frame(ID = rep(c(1,2,3), each=6),
               Condition = rep(c("A","B"), each=3, times=3),
               Type_of_Task = rep(c("Test", rep(c("Experiment"), times=2)), times=6),
               Likert_Answer = c(5, NA, NA, 6, NA, NA, 1, NA, NA, 5, NA, NA, 5, NA, NA, 1, NA, NA),
               Reaction_Times = x)

I find it very hard to formulate the problem, so this is what my expected output should look like:

myDF_Output <- data.frame(ID = rep(c(1,2,3), each=6),
               Condition = rep(c("A","B"), each=3, times=3),
               Type_of_Task = rep(c("Test", rep(c("Experiment"), times=2)), times=6),
               Likert_Answer = rep(c(5, 6, 1, 5, 5, 1), each = 3),
               Reaction_Times = x)

I have seen a feasible solution in this post:

library(dplyr)
library(tidyr)

myDF2 <- myDF %>% 
  group_by(ID) %>% 
  fill(Likert_Answer) %>% 
  fill(Likert_Answer, .direction = "up")

The problem is that this solution only works as long as the person answered the Likert scale. If they did not, I am afraid it would "drag" the Likert result from the previous experimental condition into the next one. For example:

myDF_missing <- myDF
myDF_missing[4,4] = NA

myDF3 <- myDF_missing %>% 
  group_by(ID) %>% 
  fill(Likert_Answer) %>% 
  fill(Likert_Answer, .direction = "up")

In this case, what should have been NA in Likert_Answer for all rows in condition B for ID 1 has become 5. Any idea how I could avoid this?

(Excuse me if the code is dirty: I am quite new to R and I am learning the hard way... But I got pretty stuck with this problem at this stage.)


Solution

  • If I understood your problem correctly, you are very close to a solution. I modified the demo data frame to show how the grouping works:

    library(dplyr)
    library(tidyr)
    
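    # x is the reaction-time vector defined in the question
    myDF <- data.frame(ID = rep(c(1,2,3), each=6),
                       Condition = rep(c("A","B"), each=3, times=3),
                       Type_of_Task = rep(c("Test", rep(c("Experiment"), times=5)), times=3),
                       Likert_Answer = c(5, NA, NA, 6, NA, NA, 1, NA, NA, 5, NA, NA, NA, NA, NA, 1, NA, NA),
                       Reaction_Times = x)
    
    # Group by ID and Condition so that fill() never carries a value
    # across a condition boundary; a group with no answer stays NA
    myDF %>% 
      dplyr::group_by(ID, Condition) %>% 
      tidyr::fill(Likert_Answer)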
    
          ID Condition Type_of_Task Likert_Answer Reaction_Times
       <dbl> <chr>     <chr>                <dbl>          <dbl>
     1     1 A         Test                     5           NA  
     2     1 A         Experiment               5           18.4
     3     1 A         Experiment               5           41.1
     4     1 B         Experiment               6           59.8
     5     1 B         Experiment               6           93.4
     6     1 B         Experiment               6           38.5
     7     2 A         Test                     1           NA  
     8     2 A         Experiment               1           18.4
     9     2 A         Experiment               1           41.1
    10     2 B         Experiment               5           59.8
    11     2 B         Experiment               5           93.4
    12     2 B         Experiment               5           38.5
    13     3 A         Test                    NA           NA  
    14     3 A         Experiment              NA           18.4
    15     3 A         Experiment              NA           41.1
    16     3 B         Experiment               1           59.8
    17     3 B         Experiment               1           93.4
    18     3 B         Experiment               1           38.5
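
To check this against the failure case from the question (myDF_missing, where the Condition B answer for ID 1 is missing), here is a minimal sketch that groups by both ID and Condition before filling. The set.seed() call and the name myDF_filled are my own additions for illustration, and .direction = "downup" assumes a reasonably recent version of tidyr; two separate fill() calls as in the question should behave the same way.

```r
library(dplyr)
library(tidyr)

set.seed(1)  # assumption: fixed seed, only so the reaction times are reproducible
x <- rep(c(NA, round(runif(5, min = 0, max = 100), 2)), times = 3)

# Data frame from the question
myDF <- data.frame(ID = rep(c(1, 2, 3), each = 6),
                   Condition = rep(c("A", "B"), each = 3, times = 3),
                   Type_of_Task = rep(c("Test", "Experiment", "Experiment"), times = 6),
                   Likert_Answer = c(5, NA, NA, 6, NA, NA, 1, NA, NA,
                                     5, NA, NA, 5, NA, NA, 1, NA, NA),
                   Reaction_Times = x)

# Failure case from the question: ID 1 gave no answer in Condition B
myDF_missing <- myDF
myDF_missing[4, 4] <- NA

# fill() works within each group, so nothing crosses from Condition A into B
myDF_filled <- myDF_missing %>%
  group_by(ID, Condition) %>%
  fill(Likert_Answer, .direction = "downup") %>%
  ungroup()

myDF_filled$Likert_Answer[1:6]
# 5 5 5 NA NA NA
```

The rows for ID 1 in Condition B stay NA because that group contains no value to fill from, while every other group is filled from its own answer.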