I'm trying to clean a list of strings by finding the ones that match a particular pattern, but I don't know how to write the regex.
I am using grepl(), but I don't know how to define the pattern.
The pattern is: digits, then a middle part that must include an x (and may also contain a special character or a letter), then digits again.
Here are some examples, each followed by the output I want from grepl():
"kills kld ldks 2087x-2714" TRUE
"sdlsn dklsk 4.75x25" TRUE
"dkks klsdk 3x4x135" TRUE
"djnlsdkl250shd" FALSE
"kdls, skfndkl 24gx.75" TRUE
"ski lsdkcm lskd 12.6" FALSE
"klslc ksldml 3.0 dnjsl 67n030" FALSE
It's a little bit of a complicated pattern. Basically it must include digits on both sides of the x, but can also have special characters and numbers in the mix.
Using str_detect from the stringr package. I've added two additional test strings at the end of x.
The pattern is: a digit, zero or one non-whitespace character, an x, zero or one non-whitespace character, then a digit:
library(stringr)

x <- c("kills kld ldks 2087x-2714",
"sdlsn dklsk 4.75x25",
"dkks klsdk 3x4x135",
"djnlsdkl250shd",
"kdls, skfndkl 24gx.75",
"ski lsdkcm lskd 12.6",
"klslc ksldml 3.0 dnjsl 67n030",
"5x25",
"kdls skfndkl x24g.75")
str_detect(x, "\\d\\S?x\\S?\\d")
#[1] TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE
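
Since the question uses grepl(), here is a minimal base R sketch of the same idea, with no extra package. It assumes the same test vector and passes perl = TRUE so that \d and \S are interpreted as PCRE classes. The name hits is only an illustrative variable.

```r
# Same test vector as in the stringr example
x <- c("kills kld ldks 2087x-2714",
       "sdlsn dklsk 4.75x25",
       "dkks klsdk 3x4x135",
       "djnlsdkl250shd",
       "kdls, skfndkl 24gx.75",
       "ski lsdkcm lskd 12.6",
       "klslc ksldml 3.0 dnjsl 67n030",
       "5x25",
       "kdls skfndkl x24g.75")

# digit, optional non-whitespace character, x,
# optional non-whitespace character, digit
hits <- grepl("\\d\\S?x\\S?\\d", x, perl = TRUE)
print(hits)
# TRUE TRUE TRUE FALSE TRUE FALSE FALSE TRUE FALSE

# Keep only the strings that match
x[hits]
```

Note that \S? allows at most one character between a digit and the x. If your real data can have more than one (for example a hypothetical "24gbx75"), the quantifier would need loosening, which may also let in matches you don't want.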