Connect observations (dots and lines) without using ggpaired


I created a bar chart using geom_bar with "Group" on the x-axis (Female, Male), and "Values" on the y-axis. Group is further subdivided into "Session" such that there is "Session 1" and "Session 2" for both Male and Female (i.e. four bars in total).

Since all participants took part in both Session 1 and Session 2, I overlaid a dot plot (geom_dotplot) on each of the four bars to represent the individual data.

I am now trying to connect each participant's ("PID") observations between Session 1 and Session 2. In other words, there should be one line per participant joining their two points, within the "Male" portion of the x-axis and within the "Female" portion.

I tried this with "geom_line" (below) but to no avail (instead, it created a single vertical line in the middle of "Male" and another in the middle of "Female"). I'm not too sure how to fix this.

See code below:

ggplot(data_foo, aes(x=factor(Group),y=Values, colour = factor(Session), fill = factor(Session))) + 
          geom_bar(stat = "summary", fun.y = "mean", position = "dodge") + 
          geom_dotplot(binaxis = "y", stackdir = "center", dotsize = 1.0, position = "dodge", fill = "black") +
          geom_line(aes(group = PID), colour="dark grey") +
          labs(title='My Data',x='Group',y='Values') +
          theme_light() 

Sample data (.txt)

data_foo <- readr::read_csv("PID,Group,Session,Values
P1,F,1,14
P2,F,1,13
P3,F,1,16
P4,M,1,18
P5,F,1,20
P6,M,1,27
P7,M,1,19
P8,M,1,11
P9,F,1,28
P10,F,1,20
P11,F,1,24
P12,M,1,10
P1,F,2,26
P2,F,2,21
P3,F,2,19
P4,M,2,13
P5,F,2,26
P6,M,2,15
P7,M,2,23
P8,M,2,23
P9,F,2,30
P10,F,2,21
P11,F,2,11
P12,M,2,19")

Solution

  • The problem is that you want to dodge by more than one grouping: your geom_line does not know how to split the Group variable by Session. Here are two ways to address this. Method 1 is probably the most "ggplot-like" approach, and a neat way of adding another grouping without overcrowding the visualisation. For Method 2 you need to change your x variable.

    1) Use facets.

    2) Use interaction() to split Session within each Group, and set the factor levels explicitly to get the right bar order.

    I have also used geom_point instead, because geom_dotplot is really a specific type of histogram. I would generally recommend boxplots for values like these, because bars are more appropriate for specific measures such as counts.
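
    As a rough illustration of the boxplot suggestion, here is a minimal sketch that assumes the sample data from the question (rebuilt inline so the snippet runs on its own) and uses facets to separate the groups. The object name p_box and the styling choices (grey lines, hidden outliers) are illustrative assumptions, not part of the original answer.

    ```r
    library(ggplot2)

    # Sample data from the question, rebuilt inline so the sketch is self-contained
    data_foo <- data.frame(
      PID     = rep(paste0("P", 1:12), times = 2),
      Group   = rep(c("F", "F", "F", "M", "F", "M", "M", "M", "F", "F", "F", "M"), times = 2),
      Session = rep(1:2, each = 12),
      Values  = c(14, 13, 16, 18, 20, 27, 19, 11, 28, 20, 24, 10,
                  26, 21, 19, 13, 26, 15, 23, 23, 30, 21, 11, 19)
    )

    # Boxplots summarise each session; lines join each participant's two points
    p_box <- ggplot(data_foo, aes(x = factor(Session), y = Values)) +
      geom_boxplot(aes(fill = factor(Session)), outlier.shape = NA) +  # hide outliers; raw points are drawn below
      geom_line(aes(group = PID), colour = "darkgrey") +
      geom_point() +
      facet_wrap(~Group) +
      labs(x = "Session", fill = "Session")

    p_box
    ```

    Because Session is converted to a factor on the x-axis, each facet shows exactly two positions, and grouping the lines by PID connects the paired observations.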

    Method 1: Facets

    library(ggplot2)
    ggplot(data_foo, aes(x = Session, y = Values, fill = as.character(Session))) +
      geom_bar(stat = "summary", fun = "mean", position = "dodge") +  # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
      geom_line(aes(group = PID)) +
      geom_point(aes(group = PID), shape = 21, color = 'black') +
      facet_wrap(~Group)
    

    Created on 2020-01-20 by the reprex package (v0.3.0)

    Method 2: Create an interaction term for your x variable. Note that you need to order the factor levels manually.

    library(dplyr)  # provides %>% and mutate()
    data_foo <- data_foo %>%
      mutate(new_x = factor(interaction(Group, Session), levels = c('F.1', 'F.2', 'M.1', 'M.2')))
    
    ggplot(data_foo, aes(x = new_x, y = Values, fill = as.character(Session))) + 
      geom_bar(stat = "summary", fun = "mean", position = "dodge") +  # `fun` replaces the deprecated `fun.y` (ggplot2 >= 3.3.0)
      geom_line(aes(group = PID)) +
      geom_point(aes(group = PID), shape = 21, color = 'black') 
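
    To see why the levels in Method 2 have to be reordered by hand, here is a small base-R sketch. The toy vectors group and session are hypothetical stand-ins for the Group and Session columns; the lex.order argument shown at the end is a base-R alternative, not something the original answer used.

    ```r
    # Hypothetical toy vectors mirroring the Group and Session columns
    group   <- c("F", "M", "F", "M")
    session <- c(1, 1, 2, 2)

    # By default, interaction() varies the first factor fastest
    default_x <- interaction(group, session)
    levels(default_x)
    # "F.1" "M.1" "F.2" "M.2"

    # Reordering the levels keeps both sessions of a group side by side
    ordered_x <- factor(default_x, levels = c("F.1", "F.2", "M.1", "M.2"))
    levels(ordered_x)
    # "F.1" "F.2" "M.1" "M.2"

    # Alternative: lex.order = TRUE yields the same ordering directly
    lex_x <- interaction(group, session, lex.order = TRUE)
    levels(lex_x)
    # "F.1" "F.2" "M.1" "M.2"
    ```

    With the default ordering, the bars would be drawn as F.1, M.1, F.2, M.2, so each participant's line would cross over another group's bar.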
    


    However, the resulting plot is not very visually compelling.