I have a data frame with a character variable that holds one or more values, each with its unit, like below:
[525] "8 µg/ml"
[526] "16 µg/ml - 32 µg/ml - 200 µg/ml - 500 µg/ml - 1000 µg/ml"
[527] "5 µg/ml - 10 µg/ml - 250 µg/ml"
[528] "20 µg/ml"
[529] "16 µg/ml"
[530] "60 µg/ml"
I would like to extract two values (min and max) from this variable into two other variables. When only one value is available, I would like it to go into min by default. I have tried to use str_extract, but I'm sure you will have more valuable advice or solutions. Thanks to all of you for your help. Best
You can extract all the numbers from each string using str_extract_all and then return the min and max values using range. When a string holds a single value, min and max are equal, so the max is replaced with NA.
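# x is the character vector defined under "data" below
# '\\d+' matches whole numbers only
mat <- t(sapply(stringr::str_extract_all(x, '\\d+'),
                function(v) range(as.numeric(v))))
# a single value gives min == max, so set the max to NA
mat[mat[, 1] == mat[, 2], 2] <- NA
mat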
#     [,1] [,2]
#[1,]    8   NA
#[2,]   16 1000
#[3,]    5  250
#[4,]   20   NA
#[5,]   16   NA
#[6,]   60   NA
data
x <- c("8 µg/ml", "16 µg/ml - 32 µg/ml - 200 µg/ml - 500 µg/ml - 1000 µg/ml",
"5 µg/ml - 10 µg/ml - 250 µg/ml", "20 µg/ml", "16 µg/ml", "60 µg/ml")
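
Since the question asks for the result as two new variables in the data frame, here is a minimal sketch of one way that could look. It uses base R (`regmatches` and `gregexpr`) in place of stringr, and the data frame `df` and its column name `conc` are hypothetical, made up for illustration. It also makes two assumptions that go beyond the answer above: the pattern accepts an optional decimal part, and the max is set to NA based on how many values were found rather than on min being equal to max.

```r
# Hypothetical data frame and column name, for illustration only
df <- data.frame(
  conc = c("8 µg/ml",
           "16 µg/ml - 32 µg/ml - 200 µg/ml - 500 µg/ml - 1000 µg/ml",
           "5 µg/ml - 10 µg/ml - 250 µg/ml",
           "20 µg/ml"),
  stringsAsFactors = FALSE
)

# Pull every number (optionally with a decimal part) out of each string;
# the result is a list with one character vector per row
nums <- regmatches(df$conc, gregexpr("[0-9]+(\\.[0-9]+)?", df$conc))

# One row per string: column 1 is the minimum, column 2 the maximum
rng <- t(sapply(nums, function(v) range(as.numeric(v))))

df$min <- rng[, 1]
# Keep the max only when more than one value was present
df$max <- ifelse(lengths(nums) > 1, rng[, 2], NA)
```

Note that a string containing no number at all would make `range` return `Inf` and `-Inf` with a warning, so such rows would need to be handled separately.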