I am interested in counting the unique words that appear in a column. Rather than getting the unique words per row, as explained in "Count unique/distinct words into a new column", I want a single answer that counts all unique entries in that column. In the following example there are 3 unique countries in total: China, Australia, and Korea.
Is there a short way of getting this count? I am still learning R, so my knowledge is limited.
Countries
China Australia
Australia
China China
Korea Korea Korea Korea
We can split the column 'Countries' by space, unlist the result, and get the length of the unique words:
length(unique(unlist(strsplit(df1$Countries, " "))))
#[1] 3
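If the real column might contain repeated spaces, leading or trailing whitespace, or missing values, the same idea can be wrapped in a small helper. This is only a sketch under those assumptions: `count_unique_words` and the `countries` vector are hypothetical names introduced for illustration, not part of the answer above.

```r
# Hypothetical helper: counts distinct whitespace-separated words in a character vector
count_unique_words <- function(x) {
  # trim outer whitespace, then split on runs of whitespace instead of a single space
  words <- unlist(strsplit(trimws(x), "[[:space:]]+"))
  # drop NA entries and empty strings before counting
  words <- words[!is.na(words) & nzchar(words)]
  length(unique(words))
}

# same values as the example column, with some untidy spacing added for illustration
countries <- c("China  Australia", " Australia", "China China",
               "Korea Korea Korea Korea", NA)
count_unique_words(countries)
#[1] 3
```

With a data frame, the call would be `count_unique_words(df1$Countries)`. Note that the comparison is case-sensitive, so "china" and "China" would count as two different words.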
Or, using the tidyverse, we can put each word on its own row, keep the distinct rows, and count them:
library(tidyverse)
df1 %>%
separate_rows(Countries) %>%
distinct() %>%
nrow
#[1] 3
The example data used above:
df1 <- structure(list(Countries = c("China Australia", "Australia",
"China China", "Korea Korea Korea Korea")), .Names = "Countries",
class = "data.frame", row.names = c(NA, -4L))