Search code examples
rdataframedplyrtidyversetidyr

Assign Column Value to Column Names and Pivot it & Group Them


I am sorry for the messy title, I don't know how to describe it. Anyways, this is my current dataframe (1. Dataframe):

# A tibble: 6 x 14
# Groups:   Full.Name [3]
  Full.Name             year   `1`   `2`   `3`   `4`   `5`   `6`   `7`   `8`   `9`  `10`  `11`  `12`
  <chr>                <dbl> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int> <int>
1 A. Patrick Beharelle  2019   556   577   628   608   547   429   371   338   409   330   448   300
2 A. Patrick Beharelle  2020   220   213   102    58   321   233   370   334   406   331   559   122
3 Aaron P. Graft        2020    48     0     0    45     0     0    16     0    91     0     0     0
4 Aaron P. Graft        2019     0     0     0    14     0    83     0     0     0    35    33     0
5 Aaron P. Jagdfeld     2020     0     0     0     0     0     0     0     0     5     0     0     0
6 Aaron P. Jagdfeld     2019     0     0     0     0     0     0     0     0     0     0     0     4

I pivoted it from this (2. Dataframe):

# A tibble: 6 x 4
# Groups:   Full.Name, month [3]
  Full.Name            month  year counter
  <chr>                <dbl> <dbl>   <int>
1 A. Patrick Beharelle     1  2019     556
2 A. Patrick Beharelle     1  2020     220
3 A. Patrick Beharelle     2  2019     577
4 A. Patrick Beharelle     2  2020     213
5 A. Patrick Beharelle     3  2019     628
6 A. Patrick Beharelle     3  2020     102

by using:

data <- data|> pivot_wider(names_from = month, values_from = counter)

As you can see there are always duplicate entries for each name due to my timeframe being two years. Now I want to format it, so that the columns are "1_2019", "2_2019", [...], "1_2020", "2_2020", [...]. I tried to pivot the current dataframe, but I cannot do it. I want to have 24 columns for each Name.

I would appreciate every help, due to it being really specific, I did not know what to search for and therefore I'll thank in advance for helping me.

This is the dput() output: 1st DataFrame

 structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
    "Aaron P. Graft", "Aaron P. Graft", "Aaron P. Jagdfeld"), year = c(2019, 
    2020, 2020, 2019, 2020), `1` = c(556L, 220L, 48L, 0L, 0L), `2` = c(577L, 
    213L, 0L, 0L, 0L), `3` = c(628L, 102L, 0L, 0L, 0L), `4` = c(608L, 
    58L, 45L, 14L, 0L), `5` = c(547L, 321L, 0L, 0L, 0L), `6` = c(429L, 
    233L, 0L, 83L, 0L), `7` = c(371L, 370L, 16L, 0L, 0L), `8` = c(338L, 
    334L, 0L, 0L, 0L), `9` = c(409L, 406L, 91L, 0L, 5L), `10` = c(330L, 
    331L, 0L, 35L, 0L), `11` = c(448L, 559L, 0L, 33L, 0L), `12` = c(300L, 
    122L, 0L, 0L, 0L)), class = c("grouped_df", "tbl_df", "tbl", 
    "data.frame"), row.names = c(NA, -5L), groups = structure(list(
        Full.Name = c("A. Patrick Beharelle", "Aaron P. Graft", "Aaron P. Jagdfeld"
        ), .rows = structure(list(1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
        "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
    ), row.names = c(NA, -3L), .drop = TRUE))
  1. Dataframe:

     structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
     "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle"
     ), month = c(1, 1, 2, 2, 3), year = c(2019, 2020, 2019, 2020, 
     2019), counter = c(556L, 220L, 577L, 213L, 628L)), class = c("grouped_df", 
     "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), groups = structure(list(
         Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", 
         "A. Patrick Beharelle"), month = c(1, 2, 3), .rows = structure(list(
             1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", 
         "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
     ), row.names = c(NA, -3L), .drop = TRUE)) 
    

Solution

  • xx <- structure(list(Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle" ), month = c(1, 1, 2, 2, 3), year = c(2019, 2020, 2019, 2020, 2019), counter = c(556L, 220L, 577L, 213L, 628L)), class = c("grouped_df", "tbl_df", "tbl", "data.frame"), row.names = c(NA, -5L), groups = structure(list( Full.Name = c("A. Patrick Beharelle", "A. Patrick Beharelle", "A. Patrick Beharelle"), month = c(1, 2, 3), .rows = structure(list( 1:2, 3:4, 5L), ptype = integer(0), class = c("vctrs_list_of", "vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame" ), row.names = c(NA, -3L), .drop = TRUE))
    
    tidyr::pivot_wider(xx, names_from = c('month', 'year'), values_from = 'counter')
    

    Output:

    # A tibble: 1 x 6
    # Groups:   Full.Name [1]
      Full.Name            `1_2019` `1_2020` `2_2019` `2_2020` `3_2019`
      <chr>                   <int>    <int>    <int>    <int>    <int>
    1 A. Patrick Beharelle      556      220      577      213      628
    

    Something like that?