
How to draw a pie chart in descending order and label only the top 8 slices with percentages?


I have a data set covering about 30 different jobs, each with a different number of subjects.

Here is the dput(pie_data) of my data:

structure(list(job = structure(1:33, levels = c("08", "09", "17", 
"18", "19", "20", "23", "24", "25", "26", "27", "28", "31", "34", 
"35", "36", "37", "38", "43", "46", "48", "63", "66", "71", "72", 
"76", "78", "83", "85", "86", "88", "94", "96"), class = "factor"), 
    N = c(41L, 11L, 203L, 162L, 224L, 28L, 3L, 56L, 2L, 176L, 
    4L, 13L, 108L, 31L, 3L, 11L, 32L, 30L, 395L, 25L, 2L, 1L, 
    1L, 155L, 79L, 9L, 15L, 71L, 389L, 12L, 21L, 2L, 3L)), class = "data.frame", row.names = c(NA, 
-33L))

I want to draw a pie chart with the slices in descending order that shows only the important information: the job code and percentage share of the top 8 jobs.

Here's what I've tried and the pie chart it generated:

library(ggplot2)

ggplot(pie_data, aes(x = "", y = N, fill = job)) +
  geom_bar(stat = "identity", width = 1, color = "white") +
  coord_polar("y", start = 0) +
  theme_void()

[Image: pie chart generated by the code above]

  1. Although I like how colorful and rainbow-like the pie chart is, the slices are ordered by job code instead of by the number of subjects in each job.

  2. It's hard to tell which color matches which job code because there are too many jobs in my data. I would like the 8 most popular jobs to show their name (job code) and their percentage share on the pie chart, while the less popular jobs stay as they are in the image above.

I used Excel to plot the pie chart I have in mind. Here is how it looks: [Image: pie chart made in Excel]

By "descending order" I mean that the slices of the pie chart are arranged from largest to smallest.

ggplot2 doesn't seem to have an intuitive way to draw a pie chart. If you can recommend a package for this, I would be very happy to hear about it.


Solution

  • You may try this:

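    library(forcats)  # fct_infreq()
    library(ggplot2)
    library(dplyr)    # mutate(), %>%

    pie_data %>%
      mutate(
        # Order job levels by descending N (the w argument needs forcats 1.0.0 or later)
        job = fct_infreq(job, w = N),
        # Label only the 8 largest jobs with their code and percentage share
        label = ifelse(
          xtfrm(job) <= 8,
          paste0(job, "\n", sprintf("%1.0f%%", 100 * N / sum(N))),
          ""
        )
      ) %>%
      ggplot(aes(x = 1, y = N, fill = job)) +
      geom_col(width = 1, color = "white") +
      # x = 1.3 pushes the labels toward the outer edge of the pie
      geom_text(aes(x = 1.3, label = label), position = position_stack(vjust = 0.5)) +
      coord_polar(theta = "y", direction = -1) +
      theme_void() +
      theme(aspect.ratio = 1)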
    

    Here:

    • forcats::fct_infreq(job, w = N) reorders the levels of job by descending N, so the slices are drawn from largest to smallest.
    • ifelse(xtfrm(job) <= 8, ..., "") creates label text for the first 8 jobs only; xtfrm(job) returns each job's integer position in the new level order.
    • ggplot2::geom_col() is equivalent to geom_bar(stat = "identity").
    • position = position_stack(vjust = 0.5) centers each label within its slice.
    • To move the labels toward or away from the center, change the x value in geom_text(aes(x = 1.3)).
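
    To make the reordering and labelling steps easier to inspect, here is a minimal base R sketch of the same logic on a five-row subset of pie_data. The names n_top, ord, job_rank and pct are illustrative and do not appear in the answer above, and n_top is set to 2 only because the subset is small (the answer uses 8).

    ```r
    # Five rows taken from pie_data, for illustration only
    pie_data <- data.frame(
      job = factor(c("08", "09", "17", "18", "19")),
      N   = c(41L, 11L, 203L, 162L, 224L)
    )
    n_top <- 2  # assumption for this small example; the answer uses 8

    # Reorder the factor levels by descending N
    # (the role played by fct_infreq(job, w = N) above)
    ord <- order(pie_data$N, decreasing = TRUE)
    pie_data$job <- factor(pie_data$job, levels = as.character(pie_data$job[ord]))

    # Integer position of each job in the new level order
    # (xtfrm(job) returns the same codes for a factor)
    job_rank <- as.integer(pie_data$job)

    # Percentage share of each job, and labels for the n_top largest only
    pct <- 100 * pie_data$N / sum(pie_data$N)
    pie_data$label <- ifelse(
      job_rank <= n_top,
      paste0(pie_data$job, "\n", sprintf("%1.0f%%", pct)),
      ""
    )

    print(levels(pie_data$job))
    print(pie_data)
    ```

    Rows outside the top n_top get an empty string as their label, which is why geom_text() draws nothing for the smaller slices.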

    [Image: the resulting pie chart]