Search code examples
rdataframeaggregate

R - How to put aggregated row results as columns


Suppose there is dataframe df

age category
12  A
15  B
12  A
13  C
14  B
14  D

I want to use aggregate to find the number of occurrences in each category {A, B, C, D} for each age. The number of A, B, C and D respectively are established as columns so the output data frame should look like

age A B C D
12  2 0 0 0
13  0 0 1 0
14  0 1 0 1
15  0 1 0 0

Attempt

agdf <- aggregate(df, by=list(df$age, df$category), FUN=length)

But doing this only gives me

age category x
12  A        2
15  B        1
14  B        1
13  C        1
14  D        1

Another problem is that the original df has other columns but they have been omitted for simplicity. Yet with this aggregate approach using FUN=length, it turns all those other columns into the same count value as x. How can I keep those values?

E.g.

age category x  y  z
12  A        2  2  2
15  B        1  1  1
14  B        1  1  1
13  C        1  1  1
14  D        1  1  1

but I want y and z to keep their original values, only need 1 count column x

How to massage it to the desired structure?


Solution

  • xtabs(~., df1)
    
        category
    age  A B C D
      12 2 0 0 0
      13 0 0 1 0
      14 0 1 0 1
      15 0 1 0 0
    
    table(df1)
        category
    age  A B C D
      12 2 0 0 0
      13 0 0 1 0
      14 0 1 0 1
      15 0 1 0 0
    
    reshape2::dcast(df1, age~category)
      age A B C D
    1  12 2 0 0 0
    2  13 0 0 1 0
    3  14 0 1 0 1
    4  15 0 1 0 0
    
    pivot_wider(df1, id_cols = age, names_from = category,
                  values_from = category, values_fn = length, values_fill = 0)
    # A tibble: 4 × 5
        age     A     B     C     D
      <int> <int> <int> <int> <int>
    1    12     2     0     0     0
    2    15     0     1     0     0
    3    13     0     0     1     0
    4    14     0     1     0     1