Search code examples
rvectortranspose

Transpose elements in a column into multiple columns by group in R


Given the data

test_id <- c(1, 1, 1, 2, 2, 2)
test_values <- c(1, 2, 3, 4, 5, 6)
test_df <- data.frame(test_id, test_values)
test_df

  test_id       test_values
        1           1
        1           2
        1           3
        2           4
        2           5
        2           6

I would like to transpose the value column into a row

test_df$x1 <- c(1, 1, 1, 4, 4, 4)
test_df$x2 <- c(2, 2, 2, 5, 5, 5)
test_df$x3 <- c(3, 3, 3, 6, 6, 6)
test_df

 test_id        test_values   x1           x2           x3
        1           1          1            2            3
        1           2          1            2            3
        1           3          1            2            3
        2           4          4            5            6
        2           5          4            5            6
        2           6          4            5            6

I originally had in mind something like

test_df <- test_df %>%
  group_by(test_id) %>%
  mutate(X = t(test_values))%>%
  ungroup()

but this does not allow splitting the column into multiple entries - x1,x2,x3. Any suggestion about how to tackle the issue would be much appreciated!


Solution

  • You could do it with pivots and joins:

    test_id <- c(1, 1, 1, 2, 2, 2)
    test_values <- c(1, 2, 3, 4, 5, 6)
    test_df <- data.frame(test_id, test_values)
    
    test_df %>% 
      group_by(test_id) %>% 
      mutate(var = seq_along(test_values)) %>% 
      pivot_wider(values_from="test_values", names_from="var", names_prefix="x") %>% 
      left_join(test_df %>% dplyr::select(test_id, test_values), .)
    # Joining, by = "test_id"
    #   test_id test_values x1 x2 x3
    # 1       1           1  1  2  3
    # 2       1           2  1  2  3
    # 3       1           3  1  2  3
    # 4       2           4  4  5  6
    # 5       2           5  4  5  6
    # 6       2           6  4  5  6
    

    Or, if you only wanted one row per test_id you could just remove the last line:

    test_df %>% 
      group_by(test_id) %>% 
      mutate(var = seq_along(test_values)) %>% 
      pivot_wider(values_from="test_values", names_from="var", names_prefix="x") 
    # # A tibble: 2 x 4
    # # Groups:   test_id [2]
    #   test_id    x1    x2    x3
    #      <dbl> <dbl> <dbl> <dbl>
    # 1       1     1     2     3
    # 2       2     4     5     6