Dear all, I have a vector of strings like:
LOCAT01PE
WECAT013EJD
AFECAT0155DR
I want to subset each value to obtain only CAT and all the numbers after it:
CAT01
CAT013
CAT0155
I have tried to use the substr command, but it won't work because the number of characters before CAT is not fixed, and neither is the number of digits after CAT.
We can use regexpr/regmatches in base R. The pattern matches the word 'CAT', followed by an optional hyphen (-?) and one or more digits (\\d+).
x <- c('LOCAT01PE', 'WECAT013EJD', 'AFECAT0155DR',
       'LO-CAT-01PE', 'WE-CAT-013-EJD', 'AFE-CAT-0155-DR')
regmatches(x, regexpr("CAT-?\\d+", x))
#[1] "CAT01" "CAT013" "CAT0155" "CAT-01" "CAT-013" "CAT-0155"
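As a possible alternative (not the approach used above), the same extraction can be sketched with sub() and a capture group. This is a minimal example that assumes every string contains exactly one CAT followed by digits and no hyphen, as in the three strings from the question; the names x_question and res are made up for illustration.

```r
# The three strings from the question (no hyphens)
x_question <- c("LOCAT01PE", "WECAT013EJD", "AFECAT0155DR")

# Replace the whole string with the captured group:
# 'CAT' plus the run of digits that follows it
res <- sub(".*(CAT\\d+).*", "\\1", x_question)
res
```

One difference to keep in mind: sub() returns a string unchanged when the pattern does not match, whereas regmatches() with regexpr() drops the elements that have no match, so the two results can differ in length on messier input.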