Search code examples
rdataframetibble

How can i convert a data frame with only factors into table with proportions in R using dplyr functions?


i have the following data frame in R with 3 factor column with the same levels and and a column Year with 4 different years.

A B C Year
1 1-Strongly disagree 5-Strongly agree 1-Strongly disagree 2019
2 3-so-so 5-Strongly agree 2-Disagree 2019
3 4-Agree 2-Disagree 2-Disagree 2022
4 2-Disagree 2-Disagree 5-Strongly agree 2022

and i want to use dplyr functions in order to alter my data frame into like this :

Group Item 1-Strongly disagree 2-Disagree 3-so-so 4-Agree 5-Strongly agree
Group Item 1-Strongly disagree 2-Disagree 3-so-so 4-Agree 5-Strongly agree
2019 A 26.72 21.41 27.45 17.80 6.59
2019 B 25.0 20.2 28.27 18.36 8.0 2019 6.30 28.6 15.17

where the numbers on the cells are atribrary.

A = sample(c('1-Strongly disagree','2-Disagree','3-so-so','4-Agree','5-Strongly agree'),12933,replace = TRUE)
B = sample(c('1-Strongly disagree','2-Disagree','3-so-so','4-Agree','5-Strongly agree'),12933,replace = TRUE)
C = sample(c('1-Strongly disagree','2-Disagree','3-so-so','4-Agree','5-Strongly agree'),12933,replace = TRUE)
Year = sample(c("2019","2020","2022","2023"),12933,replace = TRUE)
df = tibble(A,B,C,Year)%>%
  mutate(across(everything(),as.factor));df

Solution

  • You have not mentioned where does the Item column come from. So, assuming Item means A, B, and C, I believe you want something like this.

    library(tidyverse)
    df %>% 
      gather(item, score, -Year) %>% 
      count(Year, item, score) %>% 
      group_by(Year, item) %>% 
      mutate(pct= round(prop.table(n)*100,1)) %>% 
      select(-n) %>% 
      spread(score, pct)
    
    
       Year  item  `1-Strongly disagree` `2-Disagree` `3-so-so` `4-Agree` `5-Strongly agree`
       <fct> <chr>                 <dbl>        <dbl>     <dbl>     <dbl>              <dbl>
     1 2019  A                      19.2         20.9      19.5      19.4               21  
     2 2019  B                      19.9         20.3      20        20.3               19.5
     3 2019  C                      19.7         20.4      20.5      18.3               21.2
     4 2020  A                      20.5         19.2      19.4      20                 20.9
     5 2020  B                      20.8         19.3      19.6      19.2               21.1
     6 2020  C                      20           20.5      19.2      20.5               19.7
     7 2022  A                      20           19.8      19.6      20.1               20.5
     8 2022  B                      20.2         18.8      19.9      20.6               20.5
     9 2022  C                      20.3         20.1      19.9      19.7               19.9
    10 2023  A                      19.5         20.2      18.9      20.6               20.7
    11 2023  B                      21           19.3      20        18.9               20.7
    12 2023  C                      18.8         20.1      20.9      20.1               20.1