I need help with a small problem. For a master's project, I need to plot the frequency of a behavior in a bird species as a function of age in days. Unfortunately I can't provide the full data because it is confidential, but I can give you the example I'm trying to make:
What I need to do is see whether a behavior is more frequent at a given age, and plot it.
I've tried several different methods, including one I found on this site. First I calculated the frequency of each behavior at every age:
df %>%
  group_by(agesincetaggingdays, behaviors) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
This gave me:
  agesincetaggingdays behaviors     n     freq
                <dbl> <chr>     <int>    <dbl>
1                   0 Active        5 0.000410
2                   0 Feeding      49 0.0724
Basically, the output gives me the frequency of each behavior at each age, across all individuals.
Now I want to know how to extract these frequencies and plot them for each behavior. Going back to my example:
If I want to see how active the birds are according to their age, I would have to extract all the frequencies of the Active behavior across all ages, then plot the frequency on the y axis and the age on the x axis.
Is there a way to do that? Let me know if you need more detail.
Thank you!
There is no need to extract anything, if I understand you correctly: the code you used already gives you a data frame that can be passed to ggplot() to make a plot.
I created some dummy data to illustrate the workflow. Of course you might want to choose a plot type that suits your needs (I used a stacked area chart below).
library(tidyverse)

# create sample data
df <- tibble(
  behavior = sample(letters[1:5], size = 1e6, replace = TRUE),
  age = sample(0:1500, size = 1e6, replace = TRUE)
)

# compute shares
df <- df |>
  count(age, behavior) |>
  mutate(share = n / sum(n), .by = age)

# plot
ggplot(df) +
  geom_area(aes(age, share, fill = behavior))
Created on 2023-10-21 with reprex v2.0.2
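If you only care about one behavior, such as Active in the question, you can filter the summarised data frame before plotting. Here is a minimal sketch of that, assuming the column names from the question (agesincetaggingdays, behaviors). The toy data, the object names obs, freqs and active, and the behavior labels other than Active and Feeding are made up for illustration. It also assumes dplyr 1.1.0 or later for the .by argument and R 4.1 or later for the |> pipe.

```r
library(tidyverse)

# Hypothetical toy data standing in for the confidential dataset;
# only the column names are taken from the question.
set.seed(1)
obs <- tibble(
  agesincetaggingdays = sample(0:30, size = 5000, replace = TRUE),
  behaviors = sample(c("Active", "Feeding", "Resting"), size = 5000, replace = TRUE)
)

# Frequency of each behavior within each age
freqs <- obs |>
  count(agesincetaggingdays, behaviors) |>
  mutate(freq = n / sum(n), .by = agesincetaggingdays)

# Keep a single behavior and plot its frequency against age
active <- filter(freqs, behaviors == "Active")

p <- ggplot(active, aes(agesincetaggingdays, freq)) +
  geom_line() +
  geom_point() +
  labs(x = "Age since tagging (days)", y = "Frequency of Active")
print(p)
```

To see every behavior at once in separate panels, you could instead skip the filter, pass freqs to ggplot(), and add facet_wrap(~ behaviors).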