I'm trying to show how changes in the proportions of different age classes have changed through time. Presently, for all the data, I have the proportion of individuals at a given age (pAge), and cumulative sums of the proportions that provide the min/max values for each age (cpAgeMin and cpAgeMax) that I'm using in geom_ribbon.
Unfortunately, even though all the proportions sum to 1, my plot doesn't seem to properly reflect the data as there are open spots without data in several years. These don't seem to be issues with the data, but rather something ggplot is doing. How do I fix this so that the entire area from 0-1 in every year is assigned to one age class.
# subset of data
df <- structure(list(site = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L),
levels = "G", class = "factor"),
year = c(1988, 1988, 1989, 1989, 1989, 1990, 1990, 1991, 1991, 1991,
1992, 1992, 1992, 1993, 1993, 1993, 1998, 1998, 1998, 1999,
1999, 1999, 2000, 2000, 2000, 2001, 2001, 2001, 2002, 2002,
2003, 2003, 2003),
age = c(`9104` = 1, `9341` = 2, `9292` = 1, `9632` = 2, `9960` = 4,
`9543` = 1, `9857` = 3, `9685` = 1, `9968` = 2, `10169` = 3,
`9858` = 1, `10127` = 2, `10212` = 3, `10009` = 1, `10284` = 2,
`10522` = 6, `10464` = 1, `10605` = 2, `10830` = 3, `10556` = 1,
`10744` = 2, `11056` = 3, `10687` = 1, `11061` = 2, `11279` = 5,
`10912` = 1, `11109` = 2, `11197` = 3, `11065` = 1, `11194` = 2,
`11154` = 1, `11255` = 2, `11324` = 3),
pAGE = c(0.972, 0.028, 0.964, 0.005, 0.031, 0.823, 0.177, 0.921, 0.074,
0.004, 0.846, 0.045, 0.109, 0.833, 0.155, 0.012, 0.927, 0.054,
0.019, 0.784, 0.21, 0.005, 0.958, 0.017, 0.025, 0.852, 0.124,
0.024, 0.913, 0.087, 0.909, 0.073, 0.019),
cpAgeMin = c(0, 0.972, 0, 0.964, 0.969, 0, 0.823, 0, 0.921, 0.996, 0,
0.846, 0.891, 0, 0.833, 0.988, 0, 0.927, 0.981, 0, 0.784,
0.995, 0, 0.958, 0.975, 0, 0.852, 0.976, 0, 0.913, 0, 0.909,
0.981),
cpAgeMax = c(0.972, 1, 0.964, 0.969, 1, 0.823, 1, 0.921, 0.996, 1, 0.846,
0.891, 1, 0.833, 0.988, 1, 0.927, 0.981, 1, 0.784, 0.995, 1,
0.958, 0.975, 1, 0.852, 0.976, 1, 0.913, 1, 0.909, 0.981, 1),
seg = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2,
2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2)),
row.names = c(NA, -33L), class = c("tbl_df", "tbl", "data.frame"))
# plot
ggplot(data = df) +
geom_ribbon(aes(x = year, ymin = cpAgeMin, ymax = cpAgeMax, group = interaction(seg, factor(age)), fill = factor(age), col = factor(age))) +
theme_bw()
UPDATE:
After implementing @Andy Baxter's solution to a second site, there are extra "bumps" outside of the range of the data that appear in the plot. How do I get rid of these?
Stacking it with geom_area
and position = "fill"
might be smoothest:
library(tidyverse)
df |>
ggplot() +
geom_area(
aes(
x = year,
y = pAGE,
group = interaction(seg, factor(age)),
fill = factor(age)
),
position = position_fill(reverse = TRUE) # to get right order
)