Here is my dataset:
https://app.box.com/s/yotsy58ud2k9yk7vs7sj8ksc0favhevv
I'm trying to create a frequency table of the tags from a single column with the following structure:
I tried using qdap for simplicity, but the result is not correct:
library(qdap)
tags_df <- read.csv(file.choose())
freq_terms(tags_df$tags)
Solution
Just improving (creating a data frame and sorting) the solution given by Rui:
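sp <- unlist(strsplit(as.character(unlist(tags_df$tags)), '^c\\(|,|"|\\)'))
# Keep only the pieces that are neither empty nor NA
inx <- sapply(sp, function(y) nchar(trimws(y)) > 0 & !is.na(y))
# Base R stand-in for as_data_frame(), which needs the tibble package; the count column is named n
data <- as.data.frame(table(tolower(sp[inx])), responseName = "n", stringsAsFactors = FALSE)
# Sort by descending frequency
data <- data[order(-data$n), ]
# Keep the 10 most frequent tags; head() avoids NA rows when there are fewer than 10
data <- head(data, 10)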
If all you want or need is a frequency count, you can do without external packages: base R has a function for this, table.
sp <- unlist(strsplit(as.character(unlist(tags_df$tags)), '^c\\(|,|"|\\)'))
inx <- sapply(sp, function(y) nchar(trimws(y)) > 0 & !is.na(y))
table(sp[inx])
#     Android        CSS3      Design      Hiring  JavaScript      NextJS
#           1           1           1           1           4           1
#      NodeJS programming Programming     ReactJS     Testing          UI
#           1           1           3           3           1           1
#          UX   WebDesign      webdev      WebDev
#           1           2           1           4
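To see what the split pattern does, here is a minimal sketch on a made-up vector. It assumes each cell of the column looks like a deparsed character vector such as c("JavaScript", "ReactJS"), which is what the regular expression suggests; the sample values and the names tags, pieces, keep and counts are hypothetical and not taken from the dataset.

```r
# Hypothetical sample; the real column's format is assumed, not confirmed.
tags <- c('c("JavaScript", "ReactJS")', 'c("WebDev", "JavaScript")', 'UX')

# Split on a leading 'c(', on commas, on double quotes, and on ')'.
pieces <- unlist(strsplit(tags, '^c\\(|,|"|\\)'))

# Splitting leaves empty strings and lone spaces behind; drop them and any NA.
keep <- !is.na(pieces) & nchar(trimws(pieces)) > 0

# Count how often each remaining tag occurs.
counts <- table(pieces[keep])
print(counts)
```

The filter step matters because every delimiter that sits next to another delimiter produces an empty piece, and those would otherwise be counted as a tag of their own.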
EDIT.
I have just realized that you have "programming" and "Programming", as well as "webdev" and "WebDev", as tags, so maybe you want a case-insensitive count. If that is the case, try this instead:
table(tolower(sp[inx]))
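As a rough illustration of the case-insensitive count, the sketch below uses a small made-up vector of already-cleaned tags (the values and the names tags, counts and top are hypothetical). It also sorts the result, which is one possible way to get the most frequent tags first.

```r
# Hypothetical tags that have already been split and cleaned.
tags <- c("WebDev", "webdev", "Programming", "programming", "Programming", "UX")

# Lower-casing before counting merges variants that differ only in case.
counts <- sort(table(tolower(tags)), decreasing = TRUE)

# head() keeps at most n entries and is safe when fewer tags exist.
top <- head(counts, 2)
print(top)
```

Without tolower(), "Programming" and "programming" would be reported as two separate tags with their own counts.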