I want to rank the most frequent words in a Sphinx index. The only method I have found is /usr/bin/indexer -c /etc/sphinxsearch/sphinx.conf indexname --buildfreqs --buildstops /home/user/test.txt 1000. But this method doesn't take morphology into account: one word in different forms is counted as several words. Is there another way to count all indexed words?
As noted in the comments, you can use indextool --dumpdict, which should give the word counts from the index. Because the data comes from the index, it has already been normalized as per charset_table, wordforms, and even morphology.
(But it only works on a dict=keywords index.)
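
To turn the dump into a ranked list, something like the following rough sketch might help. It assumes the dump has been saved to a text file and that each entry is a comma-separated line of the form keyword,docs,hits,offset; that column layout is an assumption on my part, so check it against the output of your Sphinx version. The function name top_keywords is made up for this example.

```python
from typing import Iterable, List, Tuple


def top_keywords(dump_lines: Iterable[str], limit: int = 1000) -> List[Tuple[str, int]]:
    """Return up to `limit` (keyword, hits) pairs, highest hit count first.

    Assumes each dictionary entry looks like `keyword,docs,hits,offset`
    (an assumed layout, not confirmed by the original answer).
    """
    counts = []
    for line in dump_lines:
        # Split from the right so a comma inside the keyword cannot shift
        # the numeric columns.
        parts = line.rstrip("\r\n").rsplit(",", 3)
        # Skip banners, the header row, and any line without a numeric
        # hits column.
        if len(parts) != 4 or not parts[2].isdigit():
            continue
        counts.append((parts[0], int(parts[2])))
    # Sort by hit count, most frequent first.
    counts.sort(key=lambda pair: pair[1], reverse=True)
    return counts[:limit]
```

For example, after saving the output of indextool --config /etc/sphinxsearch/sphinx.conf --dumpdict indexname to a file, you could pass the open file object to top_keywords(f, 1000). Keep in mind that with morphology enabled the keywords in the dump are stems or lemmas, not the original word forms.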