I'm totally new to R and I'm currently working with the tm and lda packages to analyze a log.
The lda.collapsed.gibbs.sampler function accepts an "initial" parameter, which the documentation describes as follows:
initial
A list of initial topic assignments for words. It should be in the same format as the assignments field of the return value. If this field is NULL, then the sampler will be initialized with random assignments.
But when I try to iterate, passing a previous result$assignments as the initial parameter, I get an error:
> result <- lda.collapsed.gibbs.sampler(data, K,vocab,i, 0.1,0.1, initial = lda_result$assignments, compute.log.likelihood=TRUE)
Error in structure(.Call("collapsedGibbsSampler", documents, as.integer(K), :
STRING_ELT() can only be applied to a 'character vector', not a 'NULL'
I don't know how to get rid of this error and actually use the list. What I want is a measure of convergence: run the sampler in steps and look at the intermediate results, so simply setting i to a larger number won't work.
Thanks in advance! :)
The documentation is a bit spotty here. You need to set initial=list(assignments = lda_result$assignments). More generally, initial is a list which must have either assignments set or both topics and topic_sums set.
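For the convergence check described in the question, here is a minimal sketch of running the sampler in chunks and feeding each chunk's assignments into the next. The toy corpus, the values of K, step and n_chunks, and the variable names are made up for illustration; only the initial=list(assignments = ...) form comes from the answer above. It assumes log.likelihoods is returned as a matrix with one column per iteration, with the full log likelihood in the first row.

```r
library(lda)

set.seed(1)

# Toy corpus (hypothetical); lexicalize() converts raw text into the
# documents/vocab format that the sampler expects.
corpus <- lexicalize(c("error disk full on server",
                       "user login failed on server",
                       "disk error while writing log",
                       "user logout and login again"))

K        <- 2    # number of topics (arbitrary for this sketch)
step     <- 10   # Gibbs iterations per chunk
n_chunks <- 5    # number of chunks to run
loglik   <- numeric(n_chunks)

# First chunk: no 'initial', so the sampler starts from random assignments.
result <- lda.collapsed.gibbs.sampler(corpus$documents, K, corpus$vocab,
                                      step, 0.1, 0.1,
                                      compute.log.likelihood = TRUE)
loglik[1] <- result$log.likelihoods[1, step]

# Later chunks: resume from the previous assignments, wrapped in a list.
for (chunk in 2:n_chunks) {
  result <- lda.collapsed.gibbs.sampler(corpus$documents, K, corpus$vocab,
                                        step, 0.1, 0.1,
                                        initial = list(assignments = result$assignments),
                                        compute.log.likelihood = TRUE)
  loglik[chunk] <- result$log.likelihoods[1, step]
}

# Log likelihood at the end of each chunk; a flattening curve suggests convergence.
print(loglik)
```

Plotting loglik against the cumulative iteration count (step, 2 * step, ...) gives a rough picture of when the chain settles down; how flat is flat enough is a judgment call for your data.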