r, dplyr, pie-chart

How to make pie charts with percentages for multiple columns in R


I want to draw a pie chart for each of several columns, split across two groups.

My data looks like this:

mydf <- data.frame(ID, age, gender, diagnosis, A1, A2, A3)
mydf
ID age gender diagnosis A1 A2 A3
a  22 female         1  A  D  B
b  34   male         2  B  E  NA
c  55 female         2  C  B  A
d  55 female         1  NA A  B
e  45   male         1  E  D  B

Here A1, A2, and A3 are the questions in the test I gave to participants, and the letters are the answers each participant chose, so the values are categorical. Diagnosis 1 and 2 indicate whether or not the participant has been diagnosed with the illness.

I want to calculate the percentage frequency of each answer for each question (omitting NA values) and make a pie chart by diagnosis. Is there code that would produce all the pie charts in one go, or do I have to write separate code for each question? (I have a lot of questions, which is why I am asking. :D)

So, in the end, I would like to have pie charts like this next to each other for each question:

[Image: representative pie chart]

Thank you so much!


Solution

  • The data:

    library(tidyverse)
    
    mydf <- tribble(
      ~ID, ~age, ~gender, ~diagnosis, ~A1, ~A2, ~A3,
      "a",  22, "female",         1,  "A",  "D",  "B",
      "b",  34,   "male",         2,  "B",  "E",  NA,
      "c",  55, "female",         2,  "C",  "B",  "A",
      "d",  55, "female",         1,  NA, "A",  "B",
      "e",  45,   "male",         1,  "E",  "D",  "B",
    )
    

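    Pivot the data to long format and calculate the percentages with table() and prop.table(). Converting answer to a factor with explicit levels keeps every option in the result, including options nobody chose (they get 0%). table() drops NA by default, so missing answers are left out of the percentages.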

    
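    mydf_ <- mydf |> 
      # One row per participant and question
      pivot_longer(cols = c(A1, A2, A3), names_to = "question", values_to = "answer") |> 
      group_by(diagnosis, question) |> 
      # Explicit levels keep options nobody chose (they get 0%)
      mutate(answer = factor(answer, levels = c("A", "B", "C", "D", "E"))) |> 
      # table() drops NA; prop.table() turns counts into proportions.
      # Note: returning several rows per group from summarise() warns in dplyr >= 1.1.0
      summarise(percentage = c(prop.table(table(answer)))) |> 
      # Label the rows; the order matches the factor levels above
      mutate(answer = c("A", "B", "C", "D", "E")) 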
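
To see what factor(), table() and prop.table() each contribute to that step, here is a minimal base R sketch on a single made-up answer vector. The vector `answers` is hypothetical and only stands in for one question within one diagnosis group.

```r
# Hypothetical answers to one question; NA marks a skipped question.
answers <- c("A", NA, "C", "A")

# Explicit levels keep the options that nobody chose.
f <- factor(answers, levels = c("A", "B", "C", "D", "E"))

# table() ignores NA by default (useNA = "no"), so only 3 answers are counted.
counts <- table(f)

# prop.table() divides each count by the total, giving proportions that sum to 1.
shares <- prop.table(counts)

print(counts)
print(round(shares, 3))
```

Unchosen options show up with a count and share of 0, which is why the plotting step later turns zeros into NA before labelling.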
    

    Create the pie charts using geom_col(), geom_text() and coord_polar(), with facet_grid() laying out one chart per diagnosis and question:

    mydf_ |> 
      # Turn 0% into NA so empty options get no slice and no "0%" label
      mutate(percentage = ifelse(percentage == 0, NA, percentage)) |> 
      ggplot(aes(x = "", y = percentage, 
                 fill = answer, 
                 label = scales::percent(percentage))) +
        geom_col() +
        geom_text(color = "white", size = 6, 
                  position = position_stack(vjust = 0.5)) +
        # A stacked bar drawn in polar coordinates becomes a pie
        coord_polar("y") +
        # One pie per diagnosis (rows) and question (columns)
        facet_grid(diagnosis ~ question, switch = "both") +
        labs(x = "Diagnosis", y = "Questions") +
        theme(axis.text  = element_blank(),
              axis.ticks = element_blank())
    

    Result:

    [Image: pie chart]
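
Since the question mentions many question columns, here is a hedged sketch of one way to avoid listing them by hand. It assumes every question column is named "A" followed by digits (the regular expression is my assumption, not something stated in the question), and it uses count() with a grouped mutate() instead of table(), so options nobody chose are simply absent rather than stored as 0%.

```r
library(tidyverse)

# Same example data as above
mydf <- tribble(
  ~ID, ~age, ~gender, ~diagnosis, ~A1, ~A2, ~A3,
  "a",  22, "female",         1,  "A",  "D",  "B",
  "b",  34,   "male",         2,  "B",  "E",  NA,
  "c",  55, "female",         2,  "C",  "B",  "A",
  "d",  55, "female",         1,  NA,   "A",  "B",
  "e",  45,   "male",         1,  "E",  "D",  "B"
)

pct <- mydf |>
  # Assumption: question columns are "A" plus digits. The digits matter,
  # because tidyselect helpers ignore case and would otherwise catch "age".
  pivot_longer(cols = matches("^A[0-9]+$"),
               names_to = "question", values_to = "answer") |>
  filter(!is.na(answer)) |>                # omit unanswered questions
  count(diagnosis, question, answer) |>    # answers per diagnosis and question
  group_by(diagnosis, question) |>
  mutate(percentage = n / sum(n)) |>       # share within each pie
  ungroup()

p <- ggplot(pct, aes(x = "", y = percentage, fill = answer,
                     label = scales::percent(percentage))) +
  geom_col() +
  geom_text(color = "white", position = position_stack(vjust = 0.5)) +
  coord_polar("y") +
  facet_grid(diagnosis ~ question) +
  theme(axis.text = element_blank(), axis.ticks = element_blank())

print(p)
```

With a large number of questions the grid gets wide, so it may be more readable to plot the questions in batches.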