Search code examples
rdataframetext

Coalescing cols where col's names vary


I have a df looking somewhat like the following:

Var1_E1_C1 <- c(1, NA, NA, 1, 1)
Var1_E1_C2 <- c(NA, 3, 2, NA, NA)
Var23_E2_C1 <- c(NA, 2, NA, 1, 1)
Var23_E2_C2 <- c(3, NA, 1, NA, NA)

data.frame(Var1_E1_C1, Var1_E1_C2, Var23_E2_C1, Var23_E2_C2)

Now I want to coalesce Var1_E1* cols to create one single Var1_E1 col and the same for Var23. Preferably with the least amount of coding as possible since my df actually has 40 pairs like this.

I am hoping to get something like this.

Var1_E1 <- c(1, 3, 2, 1, 1)
Var23_E2<- c(3, 2, 1, 1, 1)

data.frame(Var_E1, Var23_E2) 

Solution

  • Not a beautiful solution, but it would generalize

    Var1_E1_C1 <- c(1, NA, NA, 1, 1)
    Var1_E1_C2 <- c(NA, 3, 2, NA, NA)
    Var23_E2_C1 <- c(NA,2 , NA, 1, 1)
    Var23_E2_C2 <- c(3,NA , 1, NA, NA)
    
    library(dplyr)
    library(tidyr)
    
    df <- data.frame(Var1_E1_C1, Var1_E1_C2, Var23_E2_C1, Var23_E2_C2)
    
    df %>% 
      mutate(id =  row_number()) %>% 
      pivot_longer(cols = -id) %>% 
      mutate(name = sub(paste0("_C",".*"),"",name)) %>% 
      filter(!is.na(value)) %>% 
      pivot_wider(names_from = name,values_from = value)