
How to reshape to long format, removing NA cells?


I have a table of social researchers and revisit teams for places that did not get service on the first try. I need to reshape my data frame to long format without blank or NA cells. That seemed easy to me, but my attempts have not succeeded.

  zone situation Researcher1 Researcher2 Researcher3 Researcher4 revisit1 revisit2 revisit3
1    3  Answered          NA          NA           4          NA       NA       10       NA
2    3   Refusal          NA          NA          15          NA       NA        5       NA
3    1  Answered          10          NA          NA          NA        2       NA       NA
4    1   Refusal           7          NA          NA          NA        5       NA       NA
5    2  Answered          NA          15          NA          NA        3       NA       NA
6    2   Refusal          NA           3          NA          NA        0       NA       NA
7    4  Answered          NA          NA          NA          13       NA       NA        4
8    4   Refusal          NA          NA          NA           8       NA       NA        4


long_rsch <- reshape(survey, direction="long", timevar="Situation", 
                     idvar="zone", 
                     varying=list(c("Researcher 1", "Researcher 2", 
                                    "Researcher 3", "Researcher 4"), 
                                  c("revisit 1", "revisit 2", "revisit3")), 
                     v.names=c("first team", "Revisit"))

Gives me:

Error in reshapeLong(data, idvar = idvar, timevar = timevar, varying = varying,  : 
  'varying' arguments must be the same length

I expected a long-format table without the NA cells (the original post showed it as an image).


Solution

    library(tidyverse)
    
    df <- structure(list(zone = c(3, 3, 1, 1, 2, 2, 4, 4), situation = c("Answer", 
    "Refusal", "Answer", "Refusal", "Answer", "Refusal", "Answer", 
    "Refusal"), `Researcher 1` = c(NA, NA, 10, 7, NA, NA, NA, NA), 
        `Researcher 2` = c(NA, NA, NA, NA, 15, 3, NA, NA), `Researcher 3` = c(4, 
        15, NA, NA, NA, NA, NA, NA), `Researcher 4` = c(NA, NA, NA, 
        NA, NA, NA, 13, 8), `revisit 1` = c(NA, NA, 2, 5, 3, 0, NA, 
        NA), `revisit 2` = c(10, 5, NA, NA, NA, NA, NA, NA), `revisit 3` = c(NA, 
        NA, NA, NA, NA, NA, 4, 4)), class = "data.frame", row.names = c(NA, 
    -8L))
    
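    # Pivot each group of columns separately, dropping NA cells at both steps
    # so that one row per zone and situation remains.
    df <- df %>% 
      pivot_longer(cols = contains("Researcher"), names_to = "Researchers", values_to = "Result 1", values_drop_na = TRUE) %>% 
      pivot_longer(cols = contains("revisit"), names_to = "revisit", values_to = "Result 2", values_drop_na = TRUE) %>%
      arrange(zone)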
    
    # A tibble: 8 × 6
       zone situation Researchers  `Result 1` revisit   `Result 2`
      <dbl> <chr>     <chr>             <dbl> <chr>          <dbl>
    1     1 Answer    Researcher 1         10 revisit 1          2
    2     1 Refusal   Researcher 1          7 revisit 1          5
    3     2 Answer    Researcher 2         15 revisit 1          3
    4     2 Refusal   Researcher 2          3 revisit 1          0
    5     3 Answer    Researcher 3          4 revisit 2         10
    6     3 Refusal   Researcher 3         15 revisit 2          5
    7     4 Answer    Researcher 4         13 revisit 3          4
    8     4 Refusal   Researcher 4          8 revisit 3          4
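
The `values_drop_na = TRUE` argument is what removes the NA cells: without it, `pivot_longer()` keeps one output row for every pivoted cell, including the empty ones. A minimal sketch on a made-up two-row table (the `toy` data frame and its columns `id`, `a`, `b` are assumptions for illustration, not part of the question's data):

```r
library(tidyr)

# Hypothetical toy data: each row has a value in exactly one of `a` or `b`.
toy <- data.frame(id = 1:2, a = c(1, NA), b = c(NA, 2))

# Default behaviour: every cell becomes a row, NA cells included.
kept <- pivot_longer(toy, cols = c(a, b),
                     names_to = "name", values_to = "value")

# With values_drop_na = TRUE, rows whose value is NA are discarded.
dropped <- pivot_longer(toy, cols = c(a, b),
                        names_to = "name", values_to = "value",
                        values_drop_na = TRUE)

nrow(kept)     # 4 rows: 2 ids x 2 columns
nrow(dropped)  # 2 rows: only the non-NA cells
```

The same reasoning applies to each pivot in the answer: a pivot that keeps its NA cells multiplies the row count by the number of columns pivoted.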
    
    

    You could also then do a bit more cleaning to get to this:

    mutate(df,
           situation = as.factor(situation),
           Researchers = as.numeric(str_remove(Researchers, "Researcher ")),
           revisit = as.numeric(str_remove(revisit, "revisit ")))
    
    # A tibble: 8 × 6
       zone situation Researchers `Result 1` revisit `Result 2`
      <dbl> <fct>           <dbl>      <dbl>   <dbl>      <dbl>
    1     1 Answer              1         10       1          2
    2     1 Refusal             1          7       1          5
    3     2 Answer              2         15       1          3
    4     2 Refusal             2          3       1          0
    5     3 Answer              3          4       2         10
    6     3 Refusal             3         15       2          5
    7     4 Answer              4         13       3          4
    8     4 Refusal             4          8       3          4
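
If you would rather avoid the tidyverse, something similar can be sketched in base R. This is only a sketch, and it assumes that every row has exactly one non-NA `Researcher` column and exactly one non-NA `revisit` column, which holds for the sample data but may not hold for yours. The helper `pick_non_na()` and the name `survey` are made up for this example.

```r
# Same sample data as above, rebuilt so the block runs on its own.
survey <- data.frame(
  zone = c(3, 3, 1, 1, 2, 2, 4, 4),
  situation = rep(c("Answer", "Refusal"), times = 4),
  `Researcher 1` = c(NA, NA, 10, 7, NA, NA, NA, NA),
  `Researcher 2` = c(NA, NA, NA, NA, 15, 3, NA, NA),
  `Researcher 3` = c(4, 15, NA, NA, NA, NA, NA, NA),
  `Researcher 4` = c(NA, NA, NA, NA, NA, NA, 13, 8),
  `revisit 1` = c(NA, NA, 2, 5, 3, 0, NA, NA),
  `revisit 2` = c(10, 5, NA, NA, NA, NA, NA, NA),
  `revisit 3` = c(NA, NA, NA, NA, NA, NA, 4, 4),
  check.names = FALSE
)

# Hypothetical helper: for each row, return the name and the value of the
# first non-NA column among `cols` (assumed to be the only one).
pick_non_na <- function(data, cols) {
  m <- as.matrix(data[cols])
  # 1 where a value is present, 0 where it is NA; max.col() then gives the
  # column position of the first present value in each row.
  idx <- max.col((!is.na(m)) * 1, ties.method = "first")
  list(name = cols[idx], value = m[cbind(seq_len(nrow(m)), idx)])
}

rs <- pick_non_na(survey, grep("^Researcher", names(survey), value = TRUE))
rv <- pick_non_na(survey, grep("^revisit", names(survey), value = TRUE))

long <- data.frame(
  zone = survey$zone, situation = survey$situation,
  Researchers = rs$name, `Result 1` = rs$value,
  revisit = rv$name, `Result 2` = rv$value,
  check.names = FALSE
)
long <- long[order(long$zone), ]
```

Note that a row with no value at all in one of the groups would come back with an NA result instead of being dropped, so check that assumption before relying on this.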