Search code examples
rdataframedata-manipulationtranspose

Add column with row values based on a category to R dataframe


so I have a dataframe of the following structure, let's call this one df0:

year category a b c d
1989 1 0.3 0.7 0.43 321
1989 1 0.3 0.7 0.43 321
1989 2 0.2 0.4 0.5 174
1989 2 0.2 0.4 0.5 174
1989 2 0.2 0.4 0.5 174
1989 3 0.6 0.2 3.0 224
1990 1 0.6 0.2 3.0 93
1990 1 0.6 0.2 3.0 93
1990 2 0.3 0.7 4.0 293
1990 3 0.9 0.6 2.0 13

What I need to turn this into is the following. Basically, I want to add for each year a column with the a c value for each category. Like this:

year category a b c d c1 c2 c3
1989 1 0.3 0.7 0.43 321 0.43 0.5 3.0
1989 1 0.3 0.7 0.43 321 0.43 0.5 3.0
1989 2 0.2 0.4 0.5 174 0.43 0.5 3.0
1989 2 0.2 0.4 0.5 174 0.43 0.5 3.0
1989 2 0.2 0.4 0.5 174 0.43 0.5 3.0
1989 3 0.6 0.2 3.0 224 0.43 0.5 3.0
1990 1 0.6 0.2 3.0 93 3.0 4.0 2.0
1990 1 0.6 0.2 3.0 93 3.0 4.0 2.0
1990 2 0.3 0.7 4.0 293 3.0 4.0 2.0
1990 3 0.9 0.6 2.0 13 3.0 4.0 2.0

I cannot figure out how to compute this. My first Idea would be to create sub dataframes for each year and then create a vector of each c value from this, but this seems very tedious and I cannot get it to work.

Does anyone have input or a solution on this?

KR


Solution

  • Using tidyr and dplyr, pivot then join:

    library(tidyr)
    library(dplyr)
    
    cvals <- df0 %>%
      distinct(year, category, c) %>%
      pivot_wider(
        names_from = category,
        names_prefix = "c",
        values_from = c
      )
    
    left_join(df0, cvals, join_by(year))
    

    Result:

    # A tibble: 10 × 9
        year category     a     b     c     d    c1    c2    c3
       <dbl>    <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
     1  1989        1   0.3   0.7  0.43   321  0.43   0.5     3
     2  1989        1   0.3   0.7  0.43   321  0.43   0.5     3
     3  1989        2   0.2   0.4  0.5    174  0.43   0.5     3
     4  1989        2   0.2   0.4  0.5    174  0.43   0.5     3
     5  1989        2   0.2   0.4  0.5    174  0.43   0.5     3
     6  1989        3   0.6   0.2  3      224  0.43   0.5     3
     7  1990        1   0.6   0.2  3       93  3      4       2
     8  1990        1   0.6   0.2  3       93  3      4       2
     9  1990        2   0.3   0.7  4      293  3      4       2
    10  1990        3   0.9   0.6  2       13  3      4       2