
Count occurrences of specific words from a dataframe row in R


I have a dataset with two columns and multiple rows. The first column is the ID, the second is the text that belongs to it.

I want to add more columns that count how many times a certain string appears in each row's text. The strings are "\n Positive\n", "\n Neutral\n" and "\n Negative\n".

Example of the Dataset:

Id, Content
2356, I like cheese.\n  Positive\nI don't want to be here.\n Negative\n
3456, I am alone.\n Neutral\n

In the end it should look like this:

Id, Content, Positive, Neutral, Negative
2356, I like cheese.\n  Positive\nI don't want to be here.\n Negative\n, 1, 0, 1
3456, I am alone.\n Neutral\n, 0, 1, 0

So far I have tried it like this, but it doesn't give the right counts:

getCount1 <- function(data, keyword)
{
Positive <- str_count(Dataset$CONTENT, keyword)
return(data.frame(data,Positive))
}
Stufe1 <-getCount1(Dataset,'\n Positive\n')
################################################################
getCount2 <- function(data,  keyword)
{
Neutral <- str_count(Stufe1$CONTENT, keyword)
return(data.frame(data,Neutral))
}
Stufe2 <-getCount2(Stufe1,'\n  Neutral\n')
#####################################################
getCount3 <- function(data,  keyword)
{
Negative <- str_count(Stufe2$CONTENT, keyword)
return(data.frame(data,Negative))
}
Stufe3 <-getCount3(Stufe2,'\n  Negative\n')

Solution

  • I assume this is what you need.

    Sample data

    id <- c(1:4)
    text <- c('I have a Dataset with 2 columns a',
              'nd multiple rows. first column ID', 'second column the text which',
              'n the text which belongs to it.')
    dataset <- data.frame(id,text)
    

    Function to find count

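    library(stringr)

    # Count how often `keyword` occurs in each row's text and append the counts as a column
    getCount <- function(data, keyword)
    {
      # Use the `data` argument here, not the global `dataset`
      wcount <- str_count(data$text, keyword)
      return(data.frame(data, wcount))
    }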
    

    Calling getCount should give the updated dataset

    > getCount(dataset,'second')
      id                              text wcount
      1   I have a Dataset with 2 columns a      0
      2   nd multiple rows. first column ID      0
      3        second column the text which      1
      4     n the text which belongs to it.      0
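
To get the three columns from the question, the same idea can be applied once per label. The following is a minimal sketch, assuming the "\n" sequences in Content are real newline characters and that each label sits on a line of its own. The helper `count_label` and the sample data frame are hypothetical names used only for illustration. A multiline regular expression tolerates the varying number of spaces in front of each label, which an exact pattern such as '\n  Neutral\n' does not.

```r
library(stringr)

# Sample data modelled on the question (assumption: "\n" is a real newline)
dataset <- data.frame(
  Id = c(2356, 3456),
  Content = c(
    "I like cheese.\n  Positive\nI don't want to be here.\n Negative\n",
    "I am alone.\n Neutral\n"
  ),
  stringsAsFactors = FALSE
)

# Hypothetical helper: count the lines that contain only `label`,
# ignoring spaces or tabs around it. `label` is used as-is in the
# pattern, so it should not contain regex metacharacters.
count_label <- function(text, label) {
  pattern <- paste0("^[ \\t]*", label, "[ \\t]*$")
  str_count(text, regex(pattern, multiline = TRUE))
}

# Add one count column per label
for (label in c("Positive", "Neutral", "Negative")) {
  dataset[[label]] <- count_label(dataset$Content, label)
}

print(dataset[, c("Id", "Positive", "Neutral", "Negative")])
```

If the text instead contains a literal backslash followed by "n", the pattern would need to match those two characters rather than line boundaries.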