I have a data frame of fish population sampling data. I would like to bin fish by length and count how many fall in each length group, for each species. The code below does this for two species, but repeating it by hand for every species in the data frame is not an elegant way to get there.
I would also like to apply this code to other lakes with different species, so an "automated" way to apply these bins to each species group in the data frame would be ideal.
The data frame looks like:
Species TL WT
BLG 75 6
BLG 118 27
LMB 200 98
LMB 315 369
RBS 112 23
RES 165 73
SPB 376 725
YEP 155 33
library(dplyr)
ss = read.csv("SS_West Point.csv" , na.strings="." , header=TRUE)
blg = ss %>% subset(Species == "BLG")
lmb = ss %>% subset(Species == "LMB")
blgn = blg %>% summarise(n = n())
lmbn = lmb %>% summarise(n = n())
### 20mm Length Groups - BLG ###
blg20 = blg %>% group_by(gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
summarise(n = n()) %>% mutate(freq = n , percent = ((n/blgn$n)*100) ,
cumfreq = cumsum(freq) , cumpercent = cumsum(percent))
### 20mm Length Groups - LMB ###
lmb20 = lmb %>% group_by(gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
summarise(n = n()) %>% mutate(freq = n , percent = ((n/lmbn$n)*100) ,
cumfreq = cumsum(freq) , cumpercent = cumsum(percent))
I've successfully used do() to run linear models on this data frame but can't seem to get it to work on cut(). Here is how I used do() on lm():
ssl = ss %>% mutate(lTL = log10(TL) , lWT = log10(WT)) %>% group_by(Species)
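# broom::augment() is assumed here: do() needs a data frame back, and .fitted is the column augment() adds
m = ssl %>% do(broom::augment(lm(lWT~lTL , data =.))) %>% mutate(wp = 10^(.fitted))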
Does this do what you expect?
ss20 <- ss %>%
add_count(Species) %>%
rename(Species_count = n) %>%
# I added Species_count to the grouping so it goes along for the ride in summarization
group_by(Species, Species_count, gr=cut(TL , breaks = seq(0 , 1000 , by = 20))) %>%
summarise(n = n()) %>%
mutate(freq = n, percent = ((n/Species_count)*100),
cumfreq = cumsum(freq) , cumpercent = cumsum(percent)) %>%
ungroup()
> ss20
# A tibble: 8 x 8
  Species Species_count gr            n  freq percent cumfreq cumpercent
  <chr>           <int> <fct>     <int> <int>   <dbl>   <int>      <dbl>
1 BLG                 2 (60,80]       1     1      50       1         50
2 BLG                 2 (100,120]     1     1      50       2        100
3 LMB                 2 (180,200]     1     1      50       1         50
4 LMB                 2 (300,320]     1     1      50       2        100
5 RBS                 1 (100,120]     1     1     100       1        100
6 RES                 1 (160,180]     1     1     100       1        100
7 SPB                 1 (360,380]     1     1     100       1        100
8 YEP                 1 (140,160]     1     1     100       1        100
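Since you also want to reuse this on other lakes, one option is to wrap the same pipeline in a function. This is only a sketch: `length_freq`, `width` and `max_len` are names I made up, it assumes each lake's data frame has `Species` and `TL` columns like yours, and it assumes dplyr 1.0 or later for the `.groups` argument. The sample rows are the ones from your question.

```r
library(dplyr)

# Hypothetical helper: per-species length-frequency table for one lake.
# Assumes columns named Species and TL (total length in mm).
length_freq <- function(df, width = 20, max_len = 1000) {
  df %>%
    # per-species total, carried along as a grouping column
    add_count(Species, name = "Species_count") %>%
    group_by(Species, Species_count,
             gr = cut(TL, breaks = seq(0, max_len, by = width))) %>%
    # drop only the bin level so the cumsum() calls below run within each species
    summarise(freq = n(), .groups = "drop_last") %>%
    mutate(percent = freq / Species_count * 100,
           cumfreq = cumsum(freq),
           cumpercent = cumsum(percent)) %>%
    ungroup()
}

# Sample rows from the question
ss <- data.frame(
  Species = c("BLG", "BLG", "LMB", "LMB", "RBS", "RES", "SPB", "YEP"),
  TL = c(75, 118, 200, 315, 112, 165, 376, 155),
  WT = c(6, 27, 98, 369, 23, 73, 725, 33)
)

ss20 <- length_freq(ss)              # 20 mm bins, as above
ss50 <- length_freq(ss, width = 50)  # same data, 50 mm bins
```

For another lake you would read its file into a data frame and call `length_freq()` on it; the species present do not need to be known in advance. Note that rows with a missing `TL` would end up in an `NA` bin, so you may want to filter those out first.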