I have a column of strings that look like the string below, where the numbers following the double colons "::"
are ages. In this example, 51, 40, 9, 5, 2, and 15 are the ages. The numbers following the "||"
are just saying this is the first person, second person, etc. I'd like to extract just the ages.
library(tidyverse)
ex_str = "0::51||1::40||2::9||3::5||4::2||5::15"
I've tried things like,
ex_str |>
  str_extract_all("::[0-9]+")
only to get the output below.
[[1]]
[1] "::51" "::40" "::9" "::5" "::2" "::15"
I apologize for the simple question. I've watched a few videos and read some guides online, but I just can't figure it out.
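You can use str_extract_all with a regex that includes a positive look-behind for '::'. The look-behind (?<=::) requires that '::' come immediately before the digits but leaves it out of the match, so only the ages are returned: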
library(tidyverse)
ex_str <- "0::51||1::40||2::9||3::5||4::2||5::15"
ages <- str_extract_all(ex_str, "(?<=::)\\d+") %>% unlist()
ages_numeric <- as.numeric(ages)
print(ages_numeric)
Output:
[1] 51 40 9 5 2 15
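Since the question mentions a whole column of such strings, here is a minimal sketch of the same look-behind applied row by row. The data frame `df`, the column names `raw` and `ages`, and the second sample string are all made up for illustration; adapt them to your data.

```r
library(tidyverse)

# Hypothetical data: `raw` stands in for your real column of strings
df <- tibble(raw = c("0::51||1::40||2::9||3::5||4::2||5::15",
                     "0::33||1::7"))

# str_extract_all() returns one character vector per row;
# map() converts each of them to numeric, giving a list-column
df <- df |>
  mutate(ages = map(str_extract_all(raw, "(?<=::)\\d+"), as.numeric))

df$ages[[1]]
```

The `ages` column is a list-column because each row can hold a different number of ages. If you would rather have one row per age, `unnest(df, ages)` from tidyr should flatten it.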