I have a data frame that I am getting from a survey I made in Qualtrics where we identified the species of fish harvested and number harvested for each angler (ID = individual anglers). I am trying to wrangle the data so that I can build a harvest data frame with the survey data but maintain the survey ID.
Here is what my data looks like:
df<-structure(list(ID = 1:3, Q7 = c("Other species (please list species name; i.e. Tarpon),Other species (please list species name; i.e. Amberjack)",
"Red Drum (a.k.a. Redfish or Red),Other species (please list species name; i.e. Tarpon),Other species (please list species name; i.e. Amberjack)",
"Red Drum (a.k.a. Redfish or Red),Other species (please list species name; i.e. Tarpon)"
), Q7_7_TEXT = c("Tink", "Blue", "Blue"), Q7_8_TEXT = c("Chii",
"Red", NA), Q8_1 = c(NA, "2", "7"), Q8_7 = c("1", "4", "9"),
Q8_8 = c("2", "5", NA)), class = "data.frame", row.names = 3:5)
I am using the following code to pull out the species and harvest numbers for each species so that the information is housed in a data frame (harv
) with species and counts:
library(tidyverse)
library(stringi)
harv<-data.frame(spp=unlist(strsplit(df$Q7, ",")))
oth<-na.omit(data.frame(spp=stri_remove_empty(c(unlist(t(df[,3:4]))))))
idx <- grep('Other', harv$spp)
n <- min(length(idx), nrow(oth))
harv[idx, 'spp'] <- oth
harv.num<-na.omit(as.numeric(stri_remove_empty(c(unlist(t(df[,5:7]))))))
harv$num<-harv.num
harv
spp num
1 Tink 1
2 Chii 2
3 Red Drum (a.k.a. Redfish or Red) 2
4 Blue 4
5 Red 5
6 Red Drum (a.k.a. Redfish or Red) 7
7 Blue 9
How can extract this data from df
while also maintaining the survey ID from the original data frame so that harv
looks like this:
spp num ID
1 Tink 1 1
2 Chii 2 1
3 Red Drum (a.k.a. Redfish or Red) 2 2
4 Blue 4 2
5 Red 5 2
6 Red Drum (a.k.a. Redfish or Red) 7 3
7 Blue 9 3
You can achieve your desired result with pivot_longer
and some preliminary renaming of the column names like so:
library(dplyr, warn = FALSE)
library(tidyr)
df |>
rename(Q7_1_TEXT = Q7) |>
rename_with(~gsub("_TEXT$", "", .x), ends_with("TEXT")) |>
pivot_longer(-ID, names_to = c(".value", "name"), names_sep = "_") |>
filter(!is.na(Q8)) |>
select(ID, spp = Q7, num = Q8)
#> # A tibble: 7 × 3
#> ID spp num
#> <int> <chr> <chr>
#> 1 1 Tink 1
#> 2 1 Chii 2
#> 3 2 Red Drum (a.k.a. Redfish or Red),Other species (please list speci… 2
#> 4 2 Blue 4
#> 5 2 Red 5
#> 6 3 Red Drum (a.k.a. Redfish or Red),Other species (please list speci… 7
#> 7 3 Blue 9