Search code examples
rloopsconcatenation

How to make R loop that concatenates multiple times across columns based on a condition


I have a dataframe similar to the following:

    df <- data.frame(name = c("john", "sara", "bill"),
                 join_0_year = c(1990,1991,1992),
                 join_0_month = c(1,2,3),
                 join_0_day = c(20,21,22),
                 join_0_team = c("tigers", "lions", "bears"),
                 join_1_year = c(2000, 2001, 2002),
                 join_1_month = c(8,9,10),
                 join_1_day = c(14,15,16),
                 join_1_team = c("pirates", "colts", "panthers"))
df

  name join_0_year join_0_month join_0_day join_0_team join_1_year join_1_month join_1_day join_1_team
1 john        1990            1         20      tigers        2000            8         14     pirates
2 sara        1991            2         21       lions        2001            9         15       colts
3 bill        1992            3         22       bears        2002           10         16    panthers

My goal is to concatenate the date columns into a single column. My df has hundreds of columns ("join_0" through "join_400") so manually calling columns by name is out of the question. I imagine I need to make some kind of for-loop, which I struggle with. My desired outcome is:

  name join_0_date join_0_team join_1_date join_1_team
1 john   1990-1-20      tigers   2000-8-14     pirates
2 sara   1991-2-21       lions   2001-9-15       colts
3 bill   1992-3-22       bears  2002-10-16    panthers

Any help is greatly appreciated.


Solution

  • Here is a tidyverse option:

    library(tidyverse)
    df %>%
        pivot_longer(matches("year|month|day"),  names_to = c("var", "quant"), names_pattern = "(join_\\d)_(.+)") %>%
        group_by(across(-c(quant, value))) %>%
        summarise(value = str_c(value, collapse = "-"), .groups = "drop") %>%
        pivot_wider(names_from = var, names_glue = "{var}_date")
    ## A tibble: 3 × 5
    #  name  join_0_team join_1_team join_0_date join_1_date
    #  <chr> <chr>       <chr>       <chr>       <chr>      
    #1 bill  bears       panthers    1992-3-22   2002-10-16 
    #2 john  tigers      pirates     1990-1-20   2000-8-14  
    #3 sara  lions       colts       1991-2-21   2001-9-15  
    

    Explanation: The idea is to reshape date-related data from wide to long, then combine year/month/day into a date, and then reshape again from long to wide.