r, n-gram, rweka

Error in using NGramTokenizer (lapply issue)


I'm using the NGramTokenizer from the RWeka package. I believe I have installed everything correctly. I'm executing the following code:

Bigram_Tokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
tdm <- TermDocumentMatrix(corpus, control = list(tokenize= Bigram_Tokenizer()))

The error I receive is:

Error in lapply(x,f): argument "x" is missing with no default.

Any ideas on how to resolve this? Thanks again in advance.

Best

Vishal


Solution

  • You are calling Bigram_Tokenizer() inside the TermDocumentMatrix call instead of passing the function itself. With the parentheses, R runs the tokenizer immediately with no argument, so x is missing. Pass the function without calling it:

    library(tm)
    library(RWeka)
    Bigram_Tokenizer <- function(x) NGramTokenizer(x, Weka_control(min = 2, max = 2))
    tdm <- TermDocumentMatrix(corpus, control = list(tokenize = Bigram_Tokenizer))
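To see the difference between passing a function and calling it, here is a minimal sketch in base R that runs without tm, RWeka, or Java. The names bigram_tokenizer and tokenize_all are hypothetical stand-ins made up for this illustration; they are not part of either package, and tokenize_all only imitates the general idea of a function that receives a tokenizer through a control list.

```r
# Hypothetical stand-in for a bigram tokenizer, written in base R so the
# example does not depend on RWeka.
bigram_tokenizer <- function(x) {
  words <- strsplit(x, " ", fixed = TRUE)[[1]]
  if (length(words) < 2) return(character(0))
  # Pair each word with the word that follows it.
  paste(head(words, -1), tail(words, -1))
}

# Hypothetical consumer: expects a function in control$tokenize and calls
# it once per document, supplying the document as the argument.
tokenize_all <- function(docs, control) {
  lapply(docs, control$tokenize)
}

docs <- c("the quick brown fox", "hello world again")

# Passing the function object: it is only called later, with each document as x.
result <- tokenize_all(docs, control = list(tokenize = bigram_tokenizer))

# Calling it with no arguments: the body runs immediately and x is missing.
err <- tryCatch(
  list(tokenize = bigram_tokenizer()),
  error = function(e) conditionMessage(e)
)

print(result)
print(err)
```

The first call succeeds because the list holds the function itself. The second fails while the list is still being built, before the consumer is ever reached, which is the same kind of failure as in the question.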