Tags: r, ggplot2, data-visualization, scale, ordinal

Scale/Position R ggplot2 visualization: don't know what package to use


I have an idea for a visualization that involves generating one plot for each row in my dataset (58 rows), showing the relative position of a selected value on a scale (e.g. 58 cities, and where the population of one city sits relative to the others).

I have included an example of what I have in mind.

Here is a code sample showing my data structure (nregs holds the names of the regions I'm studying). I want to create a 'rank plot' like the one I've shown for each row: one plot ranking by total_pop and another ranking by urban_pop.

structure(list(nregs = c("1.1 Javari e Interbacias Javari - Juruá", 
"1.2 Transf. da Margem Esquerda do Solimões", "1.3 Juruá e Interbacias Juruá - Jutaí", 
"1.4 Purus e Interbacias Purus - Juruá", "1.5 Negro", "1.6 Madeira e Interbacias Madeira - Purus", 
"1.7 Estaduais Margem Esquerda do Amazonas", "1.8 Tapajós e Interbacias Tapajós - Madeira", 
"1.9 Estaduais PA", "1.10 Xingu e Interbacias Xingu - Tapajós"
), urban_pop = c(63777, 83237, 265725, 717181, 2122424, 1693933, 
837519, 1169865, 171045, 515124), total_pop = c(111120, 141473, 
405955, 910484, 2357696, 2320307, 933181, 1639624, 304181, 831595
)), row.names = c(NA, -10L), class = c("tbl_df", "tbl", "data.frame"
))

As English is not my native language, I'm finding it difficult even to search for a solution online. I usually do my dataviz with R and the tidyverse. Can anybody at least point me in the right direction? Thanks in advance.


Solution

  • It sounds like you're looking for something like this:

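    library(ggplot2)
    library(dplyr)
    
    # df is the data frame from the question (assign the structure(...) output to df)
    n_regs <- nrow(df)
    
    df %>% 
      # replace the raw counts with their ranks (1 = smallest population)
      mutate(urban_pop = rank(urban_pop),
             total_pop = rank(total_pop)) %>%
      # long format: one row per region and variable
      tidyr::pivot_longer(-1) %>%
      ggplot(aes(value, nregs)) +
      # horizontal baseline for each region, spanning all rank positions
      geom_segment(aes(x = 1, y = nregs, xend = n_regs, yend = nregs)) +
      # short vertical tick at every rank position on every baseline
      geom_segment(data = expand.grid(x = seq(n_regs), y = seq(n_regs) - 0.1),
                   aes(x = x, y = y, xend = x, yend = y + 0.2)) +
      # reversed labels, so the largest population is labelled rank 1
      scale_x_continuous(breaks = seq(n_regs), labels = rev(seq(n_regs)),
                         name = "Rank") +
      geom_point(aes(color = name), position = position_dodge(width = 0.5),
                 size = 4) +
      scale_color_manual(values = c("red", "forestgreen")) +
      theme_void() +
      theme(axis.text.y = element_text(hjust = 1),
            axis.text.x = element_text(),
            axis.title.x = element_text(size = 16))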
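
    The code above ranks in ascending order and then reverses the axis labels so that the largest population reads as rank 1. As a possible alternative, the ranks themselves could be computed in descending order by negating the values. This is only a minimal base R sketch using the total_pop figures from the question; the names asc_rank and desc_rank are illustrative and not part of the original answer.

```r
# total_pop values copied from the question's sample data
total_pop <- c(111120, 141473, 405955, 910484, 2357696, 2320307,
               933181, 1639624, 304181, 831595)

# rank() gives 1 to the smallest value ...
asc_rank <- rank(total_pop)

# ... so ranking the negated values gives 1 to the largest value
desc_rank <- rank(-total_pop)

print(data.frame(total_pop, asc_rank, desc_rank))
```

    With no ties in the data, desc_rank equals length(total_pop) + 1 - asc_rank, which is the same relabelling that rev(seq(...)) performs on the axis.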
    


    Note that the ranks of urban and total population appear to be the same for each region in your sample.
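
    Whether the two rankings coincide can be checked directly. The following is a small base R sketch using only the ten sample rows from the question; the names urban_rank, total_rank and same_rank are illustrative, and the result may well differ on the full 58-row dataset.

```r
# Population figures copied from the question's sample data
urban_pop <- c(63777, 83237, 265725, 717181, 2122424, 1693933,
               837519, 1169865, 171045, 515124)
total_pop <- c(111120, 141473, 405955, 910484, 2357696, 2320307,
               933181, 1639624, 304181, 831595)

# Ascending ranks: 1 = smallest population
urban_rank <- rank(urban_pop)
total_rank <- rank(total_pop)

# TRUE only if every region has the same rank on both variables
same_rank <- all(urban_rank == total_rank)
print(same_rank)
```

    If same_rank is FALSE for the full data, which(urban_rank != total_rank) lists the rows where the two points would sit at different positions.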