Calculating population estimate on multiple species in same table


I am trying to write code to automate the calculation of a population estimate from summarized capture data, using the two-pass depletion method common in fisheries.

The equation to calculate a population estimate using this method is as follows:

N = C1^2 / (C1 - C2), where N = population estimate, C1 = number captured in the first sampling, and C2 = number captured in the second sampling.
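As a minimal sketch, the equation can be wrapped in a small R function. The name `two_pass_estimate` is hypothetical, and returning `NA` when the catch does not decline between passes is an assumption added here, not part of the question.

```r
# Two-pass depletion estimate: N = C1^2 / (C1 - C2).
# `two_pass_estimate` is a hypothetical helper name, not from any package.
two_pass_estimate <- function(c1, c2) {
  # Assumption: the estimate is only meaningful when C1 > C2; otherwise
  # return NA instead of a negative or infinite value. A missing C2
  # (NA) also yields NA.
  ifelse(c1 > c2, c1^2 / (c1 - c2), NA_real_)
}

two_pass_estimate(18, 5)  # RGN: 324 / 13
two_pass_estimate(10, 3)  # RBT: 100 / 7
```

Because `ifelse()` is vectorized, the same function can be applied to whole columns of first-pass and second-pass counts.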

I have already summarized a large dataset based on species and the number captured in each "pass" or sampling instance.

An example of this summary dataset looks like this:

dataex <- data.frame(Species= c("BRK","RGN","RGN","RBT","RBT"),
                  PassCaptured = c(1,1,2,1,2),
                  NumberCaptured = c(6,18,5,10,3))

I am having trouble working out how to calculate the population estimate for each species in this summary dataset.

For example, the population estimate for the species RGN using the dataset and the equation would be: 18^2 / (18 - 5) = 324 / 13, giving N ≈ 24.92.

I could calculate this manually for each species based on the summarized dataset, but I know there is a more efficient way.

The desired output is a table with population estimate (N) calculated for each species:

data.frame(Species = c("RGN","RBT"),
           Popest = c(24.92,14.29))

Note also that some species are captured in the first sampling instance but not the second, so a population estimate cannot be calculated for them (species BRK in the example dataset).

Any help getting me started would be greatly appreciated.


Solution

  • Here's one way to calculate this population estimate for each species.

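    library(dplyr)  # summarize(.by = ) requires dplyr >= 1.1.0
    
    dataex |>
      # Sort so that pass 1 comes before pass 2 within each species
      arrange(Species, PassCaptured) |>
      summarize(
        # NumberCaptured[2] is NA when a species has no second pass,
        # so Popest is NA for that species (e.g. BRK)
        Popest = (NumberCaptured[1]^2)/(NumberCaptured[1] - NumberCaptured[2]), 
        .by = Species
      ) |>
      # Drop species whose estimate could not be calculated
      tidyr::drop_na()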
    #>   Species   Popest
    #> 1     RBT 14.28571
    #> 2     RGN 24.92308
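The approach above picks C1 and C2 by position within each species. An alternative sketch reshapes the data so each pass becomes its own column, which ties C1 and C2 to the `PassCaptured` value itself. The column names `C1` and `C2` come from the `names_prefix` chosen here, and the `C1 > C2` filter is an added assumption (the formula gives a negative or infinite value otherwise).

```r
library(dplyr)
library(tidyr)

dataex <- data.frame(
  Species = c("BRK", "RGN", "RGN", "RBT", "RBT"),
  PassCaptured = c(1, 1, 2, 1, 2),
  NumberCaptured = c(6, 18, 5, 10, 3)
)

popest <- dataex |>
  # One row per species, one column per pass: C1, C2 (NA if a pass is missing)
  pivot_wider(
    names_from = PassCaptured,
    values_from = NumberCaptured,
    names_prefix = "C"
  ) |>
  # Assumption: keep only species with both passes and a declining catch
  filter(!is.na(C1), !is.na(C2), C1 > C2) |>
  transmute(Species, Popest = C1^2 / (C1 - C2))

popest
```

With the example data this keeps RGN and RBT and drops BRK, matching the estimates shown above. It also avoids a wrong result if a species were recorded only in pass 2, since that count would land in `C2` rather than being treated as the first pass.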