I have a column like this:
data.frame(x = c("ABC1","ABD1","ABE1","ABF1","ABG1","ABC2","ABC2","ABF2","ABE2"))
I want to count the distinct values formed by "AB" plus the letter that follows, ignoring the trailing digit. So ABC1 and ABC2 count as the same value, but ABC1 and ABD1 are different.
In this example, there would be 5 unique observations.
You can take the first 3 characters of each string with substr(), then count the unique values with unique() and length().
df <- data.frame(x = c("ABC1","ABD1","ABE1","ABF1","ABG1","ABC2","ABC2","ABF2","ABE2"), stringsAsFactors = FALSE)
length(unique(substr(df$x,1,3)))
[1] 5
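The substr() approach relies on "AB" plus the letter always sitting in positions 1 to 3. If that is not guaranteed, a regular expression may be more robust. The following is only a sketch: the pattern "AB[A-Z]" is my assumption about what "AB and a letter" means (a single uppercase letter right after "AB"), and the variable names pattern, prefixes and n_unique are made up for illustration.

```r
# Sample data from the question
df <- data.frame(
  x = c("ABC1", "ABD1", "ABE1", "ABF1", "ABG1", "ABC2", "ABC2", "ABF2", "ABE2"),
  stringsAsFactors = FALSE
)

# Assumed pattern: "AB" followed by exactly one uppercase letter,
# matched anywhere in the string (not only at the start)
pattern <- "AB[A-Z]"

# regexpr() locates the first match in each string;
# regmatches() extracts the matched text and skips strings with no match
prefixes <- regmatches(df$x, regexpr(pattern, df$x))

# Count the distinct matched prefixes
n_unique <- length(unique(prefixes))
n_unique
```

With the data above this should give the same count as the substr() version. Unlike substr(), strings that do not contain the pattern at all are left out of the count rather than contributing their first three characters.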