R help - ggplot - can only get position of bars right or colours, not both


I am making a plot where I have an above ground plant measurement and a below ground measurement for a plant's growth. I want to display the above ground measurement going up from 0 on the y axis and the below ground measurement going down from 0 on the y axis (I still want this to be positive in value). I have three treatment groups and two sample groups and I want to have the treatments displayed on the x axis. I am able to get almost all the way there with the code below.

However -

  1. I would like the bars for each sample to meet at the x axis; currently all bars are dodged. When I remove position = position_dodge() from geom_bar(), the sample bars stack. What can I do to get each sample's bars to meet in the middle? When I change the mapping to ggplot(aes(fill = Sample)), I get the structure of graph I want, but I cannot get the bars to be the right colours.

  2. The y-axis labels below 0: is there a way to mask these so that, even though the underlying numbers are negative, the axis shows positive numbers? Instead of labels = scales::label_number(accuracy = 1) I have tried labels = function(y) ifelse(y < 0, abs(y), y), which makes the numbers positive on both sides of 0, but they become very long decimals. I cannot figure out how to combine the two so that the labels are both simply formatted and positive below 0.

Here is the code so far:

# Load libraries
library(ggplot2)
library(dplyr)
library(scales)

# Sample data with 2 samples and 3 treatments
data <- data.frame(
  Sample = rep(c("Sample1", "Sample2"), each = 3),
  Treatment = rep(c("Treatment1", "Treatment2", "Treatment3"), times = 2),
  ShootLength = c(10, 12, 11, 9, 13, 14),
  RootLength = c(7, 6, 8, 6, 7, 9)
)

# Add random variation to simulate repeated measurements
set.seed(123)
data_long <- data.frame(
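  # Each Sample/Treatment combination gets 3 replicates per Type, in the same
  # order as the Length vector below (18 shoot values first, then 18 root values)
  Sample = rep(rep(data$Sample, each = 3), times = 2),
  Treatment = rep(rep(data$Treatment, each = 3), times = 2),
  Type = rep(c("ShootLength", "RootLength"), each = 18),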
  Length = c(
    rnorm(18, mean = rep(data$ShootLength, each = 3), sd = 1),
    rnorm(18, mean = rep(data$RootLength, each = 3), sd = 1)
  )
)


# Summarize data for mean and standard deviation
data_summary <- data_long %>%
  group_by(Sample, Treatment, Type) %>%
  summarise(
    Mean = mean(Length),
    SD = sd(Length),
    .groups = "drop"
  )

# Mutate so RootLength is negative to appear below x axis
data_summary <- data_summary %>%
  mutate(
    StackedOffset = ifelse(Type == "RootLength", -Mean, Mean), # Adjust for below-ground
    SD_Pos = SD,                                                 # Standard deviation remains positive
    FillGroup = paste(Sample, Type, sep = "_")                   # New variable for combined colors
  )

# Plot
ggplot(data_summary, aes(x = Treatment, y = StackedOffset, fill = FillGroup)) +
  geom_bar(stat = "identity", position = position_dodge(), width = 0.5, color = "black") +
  geom_errorbar(
    aes(
      ymin = StackedOffset - SD_Pos,
      ymax = StackedOffset + SD_Pos
    ),
    position = position_dodge(0.5), width = 0.2, color = "black"
  ) +
  scale_y_continuous(
    name = "Shoot Length (Above Ground)",
    breaks = seq(-max(data_summary$Mean + data_summary$SD), max(data_summary$Mean + data_summary$SD), by = 1),
    labels = scales::label_number(accuracy = 1),
    sec.axis = sec_axis(~ ., name = "Root Length (Below Ground)", labels = function(y) abs(y))
  ) +
  scale_fill_manual(
    values = c(
      "Sample1_ShootLength" = "red",         # Sample1 above ground
      "Sample1_RootLength" = "blue",        # Sample1 below ground
      "Sample2_ShootLength" = "darkred", # Sample2 above ground
      "Sample2_RootLength" = "darkblue" # Sample2 below ground
    )
  ) +
  theme_minimal() +
  theme(
    axis.text.x = element_text(angle = 45, hjust = 1),
    panel.grid.major.y = element_line(color = "gray", size = 0.5),
    panel.grid.major.x = element_blank()
  ) +
  labs(
    x = "Treatment",
    y = "Length",
    fill = "Sample & Type",
    title = "Shoot and Root Lengths by Treatment and Sample"
  )


Solution

  • To achieve your desired result you have to explicitly map Sample on the group aesthetic. The group aesthetic determines how bars get dodged or stacked. By default it is set from all discrete (categorical) variables mapped on an aesthetic, so in your case the variable mapped on fill (FillGroup) is also used under the hood for group, which gives four dodged bars per treatment instead of two.

    To get positive numbers, take the absolute value of the breaks before passing them to the formatter returned by scales::label_number().

    Finally, I use the limits= argument to get an identical range on either side of the zero line.

    library(ggplot2)
    
    ggplot(data_summary, aes(
      x = Treatment, y = StackedOffset,
      fill = FillGroup, group = Sample
    )) +
      geom_bar(
        stat = "identity", position = position_dodge(),
        width = 0.5, color = "black"
      ) +
      geom_errorbar(
        aes(
          ymin = StackedOffset - SD_Pos,
          ymax = StackedOffset + SD_Pos
        ),
        position = position_dodge(0.5), width = 0.2, color = "black"
      ) +
      scale_y_continuous(
        name = "Shoot Length (Above Ground)",
        breaks = scales::breaks_width(1),
        limits = max(data_summary$Mean + data_summary$SD) * c(-1, 1),
        labels = ~ scales::label_number(accuracy = 1)(abs(.x)),
        sec.axis = sec_axis(~.,
          name = "Root Length (Below Ground)",
          labels = function(y) abs(y)
        )
      ) +
      scale_fill_manual(
        values = c(
          "Sample1_ShootLength" = "red", # Sample1 above ground
          "Sample1_RootLength" = "blue", # Sample1 below ground
          "Sample2_ShootLength" = "darkred", # Sample2 above ground
          "Sample2_RootLength" = "darkblue" # Sample2 below ground
        )
      ) +
      theme_minimal() +
      theme(
        axis.text.x = element_text(angle = 45, hjust = 1),
        # `linewidth` replaces the deprecated `size` argument (ggplot2 >= 3.4.0)
        panel.grid.major.y = element_line(color = "gray", linewidth = 0.5),
        panel.grid.major.x = element_blank()
      ) +
      labs(
        x = "Treatment",
        y = "Length",
        fill = "Sample & Type",
        title = "Shoot and Root Lengths by Treatment and Sample"
      )
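
The effect of the explicit group mapping can be checked on a small example. The sketch below uses a hypothetical toy data frame (`toy`, one treatment only, made-up values) and `geom_col()` as a stand-in for `geom_bar(stat = "identity")`; it compares the horizontal positions of the drawn bars with and without `group = Sample`.

```r
library(ggplot2)

# Hypothetical toy data: one treatment, two samples,
# shoot values positive and root values negative
toy <- data.frame(
  Treatment = "Treatment1",
  Sample    = rep(c("Sample1", "Sample2"), each = 2),
  Type      = rep(c("ShootLength", "RootLength"), times = 2),
  Value     = c(10, -7, 9, -6)
)
toy$FillGroup <- paste(toy$Sample, toy$Type, sep = "_")

# Default grouping: group is derived from the discrete variables (here FillGroup)
p_default <- ggplot(toy, aes(Treatment, Value, fill = FillGroup)) +
  geom_col(position = position_dodge(), width = 0.5)

# Explicit grouping: bars are dodged by Sample only, fill still uses FillGroup
p_grouped <- ggplot(toy, aes(Treatment, Value, fill = FillGroup, group = Sample)) +
  geom_col(position = position_dodge(), width = 0.5)

# Distinct x positions of the bars after dodging
x_default <- sort(unique(as.numeric(layer_data(p_default)$x)))
x_grouped <- sort(unique(as.numeric(layer_data(p_grouped)$x)))

length(x_default)
length(x_grouped)
```

With the default grouping each of the four FillGroup levels should get its own dodge position, whereas with `group = Sample` the shoot and root bars of a sample share one position and therefore meet at zero.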
    

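The label trick can also be tried outside of a plot. This is a minimal sketch, assuming some made-up break positions (`breaks`) and a helper name (`abs_labels`) that is not part of the original code: `scales::label_number()` returns a function, so the absolute value is taken first and the result is then handed to that function.

```r
library(scales)

# label_number() returns a formatter function; accuracy = 1 rounds to whole numbers
fmt <- label_number(accuracy = 1)

# Hypothetical break positions, including negative and non-integer values
breaks <- c(-10.4, -5.2, 0, 5.2, 10.4)

# Take the absolute value first, then format (helper name is illustrative only)
abs_labels <- function(x) fmt(abs(x))

abs_labels(breaks)
# "10" "5" "0" "5" "10"
```

A function like this could be passed directly as `labels = abs_labels` in `scale_y_continuous()`, which is equivalent to the lambda form used in the solution above.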