nlp, nltk, named-entity-recognition, reinforcement-learning

NLTK NER: Continuous Learning


I have been trying to use the NER feature of NLTK. I want to extract named entities from articles. I know it cannot be perfect at this, but I wonder: if a human steps in to manually tag NEs along the way, will the results improve?

If so, is it possible to continually train the present model in NLTK (semi-supervised training)?


Solution

  • The plain vanilla NER chunker provided in NLTK internally uses a maximum entropy chunker trained on the ACE corpus. Hence it cannot identify dates or times unless you train it with your own classifier and data (which is quite a meticulous job).

    You could refer to this link for doing the same.

    Also, there is a module called timex in nltk_contrib which might help you with your needs.

    If you would rather do the same in Java, look into Stanford SUTime, which is part of Stanford CoreNLP.
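
Since the stock chunker does not cover dates or times, a rule-based pass is one way to fill the gap without retraining anything. The following is only a minimal sketch of that idea using the Python standard library. It is not the timex API from nltk_contrib, and the names `PATTERNS` and `tag_temporal` as well as the small pattern set are assumptions made for illustration.

```python
import re

# Hypothetical, deliberately small pattern set (an assumption for this sketch);
# a real temporal tagger handles many more surface forms.
MONTH = (r"(?:January|February|March|April|May|June|July|"
         r"August|September|October|November|December)")

PATTERNS = [
    ("DATE", re.compile(r"\b\d{4}-\d{2}-\d{2}\b")),                 # 2014-03-12
    ("DATE", re.compile(r"\b\d{1,2} " + MONTH + r" \d{4}\b")),      # 5 March 2014
    ("DATE", re.compile(r"\b" + MONTH + r" \d{1,2}, \d{4}\b")),     # March 5, 2014
    ("TIME", re.compile(r"\b\d{1,2}:\d{2}(?: ?[ap]m)?\b", re.IGNORECASE)),  # 10:30 am
]


def tag_temporal(text):
    """Return (label, matched_text, start, end) tuples sorted by start offset."""
    spans = []
    for label, pattern in PATTERNS:
        for match in pattern.finditer(text):
            spans.append((label, match.group(0), match.start(), match.end()))
    # Sort by position so the output follows reading order.
    return sorted(spans, key=lambda span: span[2])


if __name__ == "__main__":
    sample = "The meeting on March 5, 2014 starts at 10:30 am."
    for label, matched, start, end in tag_temporal(sample):
        print(label, repr(matched), start, end)
```

The character offsets returned with each match make it possible to merge these spans with the entities the NLTK chunker finds in the same text. The patterns do not validate values (for example, a month number of 13 would still match), so treat the output as candidates rather than confirmed dates.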