r, ggplot2, errorbar

Making a depth profile with multiple variables on the x axis, including error bars


I have a copy of a figure that needs error bars added to it. My data comes from one CSV file, which I've tried to split into four workable sections for four depth profiles (ligand and logK; station 5 and station 9). The code below is for one of these profiles, in the hope that I can reuse the solution three more times. So far I've read in the CSV file, transformed it from wide to long, and plotted it with ggplot(). However, I can't work out how to add horizontal error bars without ggplot treating the error-bar columns as points to plot on the graph. I suspect the problem is in my data wrangling at the start, but I'm not sure where. (Note: this is my first post here, so if this isn't a proper reprex, please let me know and I'll try to amend it.)

What I have so far...without error bars

Note: the actual plot I have has many of the aesthetics adjusted as well, but to try and cut down on code, I left those lines out.

Figure with error bar data plotted as points on the graph

This figure uses an adjusted gather() call in which the error columns are included in the measurement column of the long data; the call is listed below for reference:

station5_L1_long <- gather(station5_L1, col_names, measurement, dFe:L1_diff_from_mean, factor_key=TRUE)

Figure with adjusted aesthetics

library(ggplot2) #using the ggplot package to plot
library(magrittr) #using the magrittr package to pipe
library(dplyr) #using the dplyr package for select() and filter()
library(tidyr) #using the tidyr package to convert between wide and long data forms

dput(ligand_data[1:10, ]) #ligand data frame including both station 5 and 9

#dput() result
structure(list(Station = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 
1L, 1L), .Label = c("5", "9"), class = "factor"), Depth = c(2L, 
2700L, 3000L, 30L, 3300L, 3600L, 3900L, 4200L), dFe = c(0.31, 
0.65, 0.66, 0.3, 0.65, 0.62, 0.61, 0.61), L1ship_nM = c(1.265, 
1.46, NA, 1.365, NA, NA, 1.33, NA), L1lab_nM = c(1.32, 1.93, 
1.92, 1.35, 2.23, 1.99, 1.8, 2.4), L1A_nM = c(1.18, 1.37, NA, 
1.39, NA, NA, 1.36, NA), L1B_nM = c(1.35, 1.55, NA, 1.34, NA, 
NA, 1.3, NA), L1all_nM = c(1.283333333, 1.616666667, NA, 1.36, 
NA, NA, 1.486666667, NA), L1freeze2013_nM = c(1.52, NA, NA, NA, 
NA, NA, NA, NA), L1_allfreeze_nM = c(1.42, 1.93, NA, 1.35, NA, 
NA, 1.8, NA), L1_ALL_nM = c(1.3425, 1.616666667, NA, 1.36, NA, 
NA, 1.486666667, NA), L1shipSD_nM = c("0.120208153", "0.127279221", 
"", "0.035355339", "", "", "0.042426407", ""), L1allSD_nM = c(0.090737717, 
0.285890422, NA, 0.026457513, NA, NA, 0.273007936, NA), L1_allfreezeSD_nM = c(0.141421356, 
NA, NA, NA, NA, NA, NA, NA), L1_ALL_SD_nM = c(0.139612559, 0.285890422, 
NA, 0.026457513, NA, NA, 0.273007936, NA)), row.names = c(NA, 
8L), class = "data.frame")

##################### CLEANING DATA ###########################
ligand_data <- merge(base3_cols, ligand, by.x = 0, by.y = 0, all.x = TRUE) %>%
  select(-Row.names) 

#filtering for depth profiles
station5_L1 <- ligand_data %>% filter(Station == 5) #filtering ligand df to include just station 5
#changing from wide to long
station5_L1_long <- gather(station5_L1, col_names, measurement, dFe:L1_diff_from_mean, factor_key=TRUE) 

################################ PLOTS w/ggplot ##################################
station5L1_depth_profile <- ggplot(data = station5_L1_long,  
                                        aes(color = col_names, 
                                            shape = col_names,
                                            fill = col_names,
                                            size = 0.25)
) + 
  geom_point(mapping = aes(
    x = as.numeric(measurement),
    y = as.numeric(Depth),
    size = 0.25
  )) +
  scale_y_reverse() + 
  guides(size = "none") + #hides the size legend (FALSE is deprecated in recent ggplot2)
  scale_x_continuous(position = "top", breaks = scales::breaks_width(0.5)) + #moves x-axis to top 
  expand_limits(x = c(0, 2.5)) + 
  scale_shape_manual(values=c(4, 21, 21, 23, 22, 25, 24, 13, 5))+
  scale_fill_manual(#labels = c("dFe", "A", "B", "ship", "lab", "2013", "all pre-2013", "all"),
                    values=c("#000000", "#74ADD1", "#D73027", "#4575B4", "#ABD9E9", 
                             "#E0F3F8", "#F46D43", "#313695", "#000001")) +
  scale_color_manual(values=c("#000000", "#000000", "#000000", "#000000", "#000000", 
                            "#000000", "#000000", "#000000", "#000000"))
station5L1_depth_profile 

Solution

  • I hope I understand your problem correctly. Here is my solution. First I split the data into two data frames, as that makes the pivoting easier to handle. Next I pivot both into long format and prepare them for joining.

    After joining, I can plot the result and draw a line range from the value plus or minus its standard deviation.

    Please also explain your dataset next time, as it's hard to understand without any further information.

    library(tidyverse)
    
    # make the measurement columns numeric (L1shipSD_nM is stored as character)
    # Station is a factor, so leave it alone: as.numeric() would return its level codes
    df <- df %>%
      mutate(
        across(-Station, as.numeric)
      )
    # For easier manipulating we split the df
    main_df <- df %>% select(Station:L1_ALL_nM)
    sd_df   <- df %>% select(Station:Depth, L1shipSD_nM:L1_ALL_SD_nM)
    # now we pivot longer
    main_df <- main_df %>% 
      pivot_longer(cols = dFe:L1_ALL_nM, names_to = "col_names", values_to = "val")
    sd_df <- sd_df %>%
      pivot_longer(cols = L1shipSD_nM:L1_ALL_SD_nM, names_to = "col_names", values_to = "sd") %>% 
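      mutate(
        # remove "SD" (and a preceding underscore, as in L1_ALL_SD_nM)
        # so the names match the measurement columns
        col_names = str_replace_all(col_names, "_?SD", "")
      )
    # join the tables on the shared key columns
    plot_df <- main_df %>% full_join(sd_df, by = c("Station", "Depth", "col_names"))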
    
    # Plot our result
    plot_df %>% 
      ggplot(
        aes(y = Depth, x = val, color = col_names, shape = col_names, fill = col_names)
      ) +
      geom_pointrange(
        aes(xmin = val - sd, xmax = val + sd)
        )
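If it helps to see the whole split, pivot, and join pattern run end to end, here is a minimal self-contained sketch. The `toy` data frame and its values are made up for illustration (only the column naming mirrors the question), and I've swapped `geom_pointrange()` for `geom_linerange()` plus `geom_point()` so the points can be styled separately from the bars. It assumes ggplot2 3.3.0 or later, where `xmin`/`xmax` are accepted for horizontal ranges.

```r
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)

# Hypothetical toy data: two measurement columns plus their SD columns
toy <- data.frame(
  Depth       = c(2, 30, 2700, 3900),
  L1ship_nM   = c(1.27, 1.37, 1.46, 1.33),
  L1all_nM    = c(1.28, 1.36, 1.62, 1.49),
  L1shipSD_nM = c(0.12, 0.04, 0.13, 0.04),
  L1allSD_nM  = c(0.09, 0.03, 0.29, 0.27)
)

# Measurements in long form: one row per Depth and variable
vals <- toy %>%
  select(Depth, L1ship_nM, L1all_nM) %>%
  pivot_longer(-Depth, names_to = "col_names", values_to = "val")

# SDs in long form, renamed so they line up with the measurement names
sds <- toy %>%
  select(Depth, L1shipSD_nM, L1allSD_nM) %>%
  pivot_longer(-Depth, names_to = "col_names", values_to = "sd") %>%
  mutate(col_names = str_remove(col_names, "SD"))

# Each measurement row now carries its own SD in a separate column,
# so the SDs are never plotted as points
toy_long <- left_join(vals, sds, by = c("Depth", "col_names"))

p <- ggplot(toy_long, aes(x = val, y = Depth)) +
  geom_linerange(aes(xmin = val - sd, xmax = val + sd)) + # horizontal error bars
  geom_point(aes(shape = col_names), size = 2) +
  scale_y_reverse() +                    # depth increases downwards
  scale_x_continuous(position = "top")   # x axis on top, as in the question
p
```

The key point is that the SD values end up in their own `sd` column next to `val` rather than stacked into the measurement column, which is what caused them to show up as points.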
    
    
