I have a corpus of documents, which I have already tagged. I have fixed list of about 400 tags - relating to different topics. Each document has been tagged with one or more tags, and a short title. (I also have a much larger list of titles - which I often re-use if the document contains very similar content)
I want to make an interface that will suggest tags/titles (from my existing lists) for new documents that I add to the corpus, based on how I have tagged the existing documents.
I have read about the probabilistic topic model LDA classes, which look great for analyzing text when you don't have any existing tagged data. But I don't see any way I can incorporate my existing work.
Any suggestions would be appreciated.
Kind Regards
Swami
For tags suggestion, our experience is just using a search engine, no need for topic modeling.
Try below steps:
This solution is workable.