I have seen a lot of posts on extracting the first or last digit of a numeric variable using functions like gsub or grep. However, I want to detect a specific digit regardless of whether it is the first, middle, or last digit of a larger number. For example, I want R to check whether each row of a column contains the digit 3 and, if so, make a new variable where 1=yes and 0=no.
Let's say I have this dataframe:
have <- data.frame(Q14 = c(13, 3, 788, 134, 56, 3214, 1036))
This is the second column that I want to generate: a 1 in Q14_3 means that the value of Q14 in that row has a 3 somewhere, and a 0 means it has no 3.
want <- data.frame(Q14 = c(13, 3, 788, 134, 56, 3214, 1036),
                   Q14_3 = c(1, 1, 0, 1, 0, 1, 1))
Thank you!
In base R, use grepl to make a logical vector and + to convert this to a 1/0 variable:
have$Q14_3 <- +grepl("3", have$Q14)
#    Q14 Q14_3
# 1   13     1
# 2    3     1
# 3  788     0
# 4  134     1
# 5   56     0
# 6 3214     1
# 7 1036     1
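If you need the same check for other digits, the idea above can be wrapped in a small function. This is only a sketch: has_digit is a hypothetical helper name, not something from base R, and it assumes the values are whole numbers. It converts with sprintf("%.0f", x) rather than relying on the implicit coercion in grepl, because large doubles may be rendered in scientific notation (for example "1e+05"), in which case the exponent digits would be searched as well.

```r
# Hypothetical helper (not part of base R): returns 1 where a value
# contains the given digit and 0 otherwise.
# Assumption: x holds whole numbers.
has_digit <- function(x, digit) {
  # Fixed-notation strings, so 100000 becomes "100000" and not "1e+05"
  digits <- sprintf("%.0f", x)
  # fixed = TRUE matches the digit literally instead of as a regex;
  # unary + turns the logical result into 1/0
  +grepl(as.character(digit), digits, fixed = TRUE)
}

has_digit(c(13, 3, 788, 134, 56, 3214, 1036), 3)
# [1] 1 1 0 1 0 1 1
```

With the example data this would be used as have$Q14_3 <- has_digit(have$Q14, 3).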
Or, since it was tagged, a tidyverse approach using dplyr::mutate and stringr::str_detect:
library(dplyr)
library(stringr)
have %>%
  mutate(Q14_3 = +str_detect(Q14, "3"))
Test:
all.equal(have, want)
# TRUE