How to remove non associated words from results returned by findAssocs function from tm package


I am using the findAssocs function from the tm package in R to find the words associated with a given set of terms. The result also includes terms that have no associations at all. For example, in the output below, the term "new" has no associated word at a minimum correlation of 0.7. I want to drop such terms and create a vector of the terms that do have associations. In this case the vector would be c("blush"). How can I accomplish this? Thanks

> findAssocs(myTdm,c("new","blush"),0.7)
$new
numeric(0)

$blush
  combination     customize     different       endless         flush     highlight        jdlxmd        master 
         0.98          0.98          0.98          0.98          0.98          0.98          0.98          0.98 
possibilities         three        unique           use 
         0.98          0.98          0.98          0.98  

Solution

  • You can use compact from the purrr package together with the lengths function to drop the empty elements:

    findAssocsRes <-list(a=integer(0),b=c(x=1,y=2) ,c=c(z=1) )
    
    findAssocsRes
    $a
    integer(0)
    
    $b
    x y 
    1 2 
    
    $c
    z 
    1 
    
    purrr::compact(findAssocsRes,lengths)
    $b
    x y 
    1 2 
    
    $c
    z 
    1 
    

    In base R, you can also use lapply with length:

    findAssocsRes[lapply(findAssocsRes,length)>0]
    $b
    x y 
    1 2 
    
    $c
    z 
    1
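
    Since the question asks for a vector of words rather than a filtered list, one option is to take the names of the non-empty elements. This is a minimal base R sketch; findAssocsRes is the same made-up stand-in list used above, not real findAssocs output, and wordsWithAssocs is just an illustrative name:

```r
# Stand-in for the list returned by findAssocs(); the values are invented for illustration
findAssocsRes <- list(a = integer(0), b = c(x = 1, y = 2), c = c(z = 1))

# lengths() gives the length of each list element;
# keep only the names of elements with at least one association
wordsWithAssocs <- names(findAssocsRes)[lengths(findAssocsRes) > 0]
wordsWithAssocs
```

    Assuming findAssocs returns a named list as in the printed output of the question, the same expression applied to findAssocs(myTdm,c("new","blush"),0.7) should leave only "blush".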