nlp nltk cjk pos-tagger thai

Korean, Thai and Indonesian POS tagger


Can someone recommend an open-source POS tagger for Korean, Indonesian, Thai and Vietnamese that I can use to tag the corpus data I currently have (something like the stanford-postagger)?

If you are a developer and are willing to share your POS tagger so I can test it, that would be welcome too.

With some modifications to the output, I've POS-tagged the Vietnamese data with jvntextpro.

But I'd still like more input on Korean, Indonesian and Thai POS tagging.
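To illustrate the kind of output modification mentioned above, here is a minimal sketch that turns tagger output into one `word<TAB>TAG` pair per line. It assumes the tagger writes whitespace-separated `word/TAG` tokens. That format, the helper names, and the tags in the usage example are assumptions for illustration, not the documented jvntextpro output.

```python
def split_token(token, sep="/"):
    """Split one 'word<sep>TAG' token on the LAST separator.

    Splitting on the last separator keeps words that contain the
    separator themselves (e.g. '1/2') intact.
    """
    word, found, tag = token.rpartition(sep)
    if not found:
        # No separator in the token: keep the word, mark the tag as missing.
        return token, None
    return word, tag


def parse_tagged_line(line, sep="/"):
    """Turn a line of 'word<sep>TAG' tokens into a list of (word, tag) pairs."""
    return [split_token(tok, sep) for tok in line.split()]


def to_columns(pairs, missing="UNK"):
    """Format (word, tag) pairs as one 'word<TAB>TAG' entry per line."""
    return "\n".join(
        f"{word}\t{tag if tag is not None else missing}" for word, tag in pairs
    )


if __name__ == "__main__":
    # Hypothetical tagger output; the tags are made up for illustration.
    print(to_columns(parse_tagged_line("Tôi/P là/V sinh_viên/N")))
```

If the tagger uses a different separator (for example `_`), pass it as `sep`. Note that tokenized Vietnamese often joins syllables with `_`, so splitting on the last separator matters there.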


Solution

  • From the ACL wiki: Korean morphological analyzer and part-of-speech tagger

    I would start by looking at the websites of NLP research departments in Korea, Thailand, and Indonesia. On this page, you will find links to the research departments.

    Good luck!

    UPDATE: OpenNLP has a Thai POS tagger. The models for the OpenNLP POS tagger are here: http://opennlp.sourceforge.net/models/thai/