
Subset rows that contain only letters in R


My vector has around 3,000 observations, like:

clients <- c("Greg Smith", "John Coolman", "Mr. Brown", "John Nightsmith (father)", "2 Nicolas Cage")

How can I subset the rows that contain only names made of letters? For example, only Greg Smith and John Coolman (without symbols like 0-9,.?:[} etc.).


Solution

  • We can use grep to match only upper- or lowercase letters and spaces from the start (^) to the end ($) of the string. With value = TRUE, grep returns the matching elements rather than their indices.

    grep('^[A-Za-z ]+$', clients, value = TRUE)
    #[1] "Greg Smith"   "John Coolman"
    

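    Or use the POSIX character class [[:alpha:]], which matches any alphabetic character in the current locale rather than only A-Z and a-z: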

    grep('^[[:alpha:] ]+$', clients, value = TRUE)
    #[1] "Greg Smith"   "John Coolman"
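
Since the question mentions rows, the same pattern can also be applied to a data frame. The following is a minimal sketch, assuming the names live in a column of a data frame; the data frame `clients_df` and its columns `id` and `name` are hypothetical and not part of the original question. It uses grepl, which returns a logical vector that can index rows directly.

```r
# Hypothetical data frame; `clients_df`, `id`, and `name` are assumed names.
clients_df <- data.frame(
  id = 1:5,
  name = c("Greg Smith", "John Coolman", "Mr. Brown",
           "John Nightsmith (father)", "2 Nicolas Cage"),
  stringsAsFactors = FALSE
)

# grepl() returns one TRUE/FALSE per element: TRUE when the whole
# string consists only of ASCII letters and spaces.
letters_only <- grepl("^[A-Za-z ]+$", clients_df$name)

# Keep only the rows whose name passed the check.
clean_df <- clients_df[letters_only, ]
print(clean_df)
```

With the sample data this should keep only the rows for Greg Smith and John Coolman. Negating the index (`!letters_only`) would instead give the rows that need cleaning.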