Tags: r, ggplot2, bar-chart, plotly, geom-bar

Sorting bars in a bar chart with ggplot2


First time asking here so forgive me if I'm not clear enough.

I have seen many answers to similar questions that explain how to sort bars by some field of a data frame, but I have not been able to find how to sort them by the default "count" stat of geom_bar (which is obviously NOT a field of the data frame). For example, I run this code:

library(ggplot2)
library(plotly)  # provides ggplotly()

Name <- c('Juan', 'Michael', 'Andrea', 'Charles', 'Jonás', 'Juan', 'Donata', 'Flavia')
City <- c('Madrid', 'New York', 'Madrid', 'Liverpool', 'Madrid', 'Buenos Aires', 'Rome', 'Liverpool')
City.Id <- c(1, 2, 1, 3, 1, 4, 5, 3)
df <- data.frame(Name, City, City.Id)

# Bar chart of row counts per city; City.Id is carried along in the "text" aesthetic
a <- ggplot(df, aes(x = City, text = paste("City.Id=", City.Id))) +
  geom_bar()

ggplotly(a)

And then I would like to visualize the resulting bars ordered by their height (=count.) Note that I must keep the "City.Id" info to show in the final plot. How can this be done?


Solution

  • Given that you're already using ggplot2, I'd suggest looking into what else the tidyverse can offer, in particular the forcats package for working with factors.

    forcats has a handy function, fct_infreq(), which reorders the levels of a factor by frequency, most frequent first. If the input is a character vector rather than a factor (as City is in your data), it is first converted to a factor and its levels are then put in frequency order.

    Try this code:

    # Load packages
    library(ggplot2)
    library(forcats)
    
    # Create data
    Name <- c( 'Juan','Michael','Andrea','Charles','Jonás','Juan','Donata','Flavia' )
    City <- c('Madrid','New York','Madrid','Liverpool','Madrid','Buenos Aires','Rome','Liverpool')
    City.Id <- c(1,2,1,3,1,4,5,3)
    df = data.frame( Name,City,City.Id )
    
    # Create plot
    a <- ggplot(df, aes(x = fct_infreq(City), text=paste("City.Id=",City.Id)) ) +
      geom_bar()
    
    a
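
To see why this reorders the bars, it can help to inspect the factor levels directly, since ggplot2 places discrete x values in level order. The following is a minimal sketch using the City vector from the question. The variable names city_by_count, city_base and city_ascending are made up for illustration, and the base R line is only one possible alternative if forcats is not available, not part of the original answer.

```r
library(forcats)

City <- c('Madrid', 'New York', 'Madrid', 'Liverpool',
          'Madrid', 'Buenos Aires', 'Rome', 'Liverpool')

# fct_infreq() converts the character vector to a factor and orders
# its levels by how often each value occurs, most frequent first
city_by_count <- fct_infreq(City)

# Madrid (3 rows) comes first, then Liverpool (2 rows);
# the three single-row cities follow
levels(city_by_count)
table(city_by_count)

# Assumed base R alternative: build the levels from a sorted frequency table
city_base <- factor(City, levels = names(sort(table(City), decreasing = TRUE)))

# fct_rev() reverses the level order, giving the least frequent city first
city_ascending <- fct_rev(city_by_count)
levels(city_ascending)
```

Because the ordering lives in the factor levels and not in the plot object, passing the plot a from the answer to ggplotly(a), as in the question, should keep the same bar order, with the City.Id text still attached to each bar.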