I want to test whether one variable has an effect on the catch rates of different species that are grouped, but I am having trouble understanding how to do this in a tidy and succinct manner. I have a dataset with about 400 catch rates, but there is a LOT of variability among species' catch rates. It looks like this:
n <- 100
df<- data.frame(organization=rep(LETTERS[1:4], n/2),
species=rep(c("shark", "whale", "fish", "ray", "turtle"), each=20) ,
gear=rep(c("l", "p", "l", "p", "l", "p", "l", "p", "l", "p"), each =10),
What I have tried so far is:
library(dplyr)
library(broom)

df %>%
  group_by(species, gear) %>%
  do(tidy(lm(rate ~ organization, data = .))) %>%
  mutate(p.value = round(p.value, 3)) %>%
  filter(p.value < 0.05)  # keep only significant p-values
What I want to know is whether there is a simple and more elegant way to test the effect of ONLY organization, but while still grouping species and gear. Essentially species and gear have a big effect, and different species can't really be compared against one another. So I want to know if WITHIN the same species and gear, organization makes a difference.
Any help will be so appreciated!!
This is a start, not the complete solution, because here we group only by species. You can first group by species, then by gear, and then combine both with group_by(species, gear).
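library(tidyverse)  # dplyr, purrr, forcats (as_factor), tibble (add_column)
library(broom)      # glance()

df %>%
  mutate(species = as_factor(species)) %>%
  group_by(species) %>%
  group_split() %>%
  # fit one model per species and keep its one-row model summary
  map_dfr(.f = function(df) {
    lm(rate ~ organization, data = df) %>%
      glance() %>%
      add_column(species = unique(df$species), .before = 1)
  })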
species r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
<fct> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <int> <int>
1 shark 0.229 0.165 1.18 3.57 0.0234 3 -61.4 133. 141. 50.5 36 40
2 whale 0.192 0.124 1.03 2.84 0.0513 3 -55.6 121. 130. 37.8 36 40
3 fish 0.0980 0.0229 0.999 1.30 0.288 3 -54.6 119. 128. 35.9 36 40
4 ray 0.121 0.0481 0.783 1.66 0.194 3 -44.9 99.7 108. 22.1 36 40
5 turtle 0.0448 -0.0348 0.922 0.563 0.643 3 -51.4 113. 121. 30.6 36 40
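To carry this through to both grouping variables, one option is to fit one model per species and gear combination and keep the overall F-test from glance(). That tests organization as a whole, whereas tidy() reports each organization level against the reference level. This is a minimal sketch, not a definitive solution: the rate values and the seed are made up for illustration because the original rate column is not shown, and the name results is only a placeholder.

```r
library(dplyr)
library(broom)

set.seed(1)  # assumption: arbitrary seed, only for reproducibility
n <- 100
df <- data.frame(
  organization = rep(LETTERS[1:4], n / 2),
  species = rep(c("shark", "whale", "fish", "ray", "turtle"), each = 20),
  gear = rep(c("l", "p"), each = 10, times = 5),
  rate = rnorm(2 * n)  # assumption: placeholder catch rates
)

results <- df %>%
  group_by(species, gear) %>%
  # one lm per species x gear group; glance() returns a one-row summary
  group_modify(~ glance(lm(rate ~ organization, data = .x))) %>%
  ungroup() %>%
  select(species, gear, statistic, p.value, df.residual, nobs)

results
```

Each row is one species and gear combination, and p.value is the F-test p-value for organization within that combination. filter(p.value < 0.05) can be appended to keep only the combinations where organization appears to matter.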