stanford-nlp

How to make a lightweight stanford-nlp.jar


I've noticed that the whole library is quite large (~300MB), but I'm only using the tokenize, ssplit, and pos annotators. How can I build a lighter version of the library? Many thanks.

Best, Huang


Solution

  • If you only want part-of-speech tags, you can include just the part-of-speech tagger models, for example as downloaded from nlp.stanford.edu/software/tagger.shtml. Alternatively, you can safely remove the models you don't use from the models jar to make it smaller.
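
One way to do the second option is to copy the models jar and keep only the entries you need. The following is a minimal sketch using only the JDK's java.util.zip classes, not an official Stanford tool. The class name SlimModelsJar, the method slim, and the prefix edu/stanford/nlp/models/pos-tagger/ are assumptions for illustration; list the contents of your own models jar first (for example with `jar tf`) and pass the prefixes of the models your pipeline actually loads.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper: copies a jar, keeping only entries under the given path prefixes.
public class SlimModelsJar {

    /** Writes the kept entries to output and returns how many entries were copied. */
    public static int slim(Path input, Path output, List<String> keepPrefixes) throws IOException {
        int kept = 0;
        byte[] buffer = new byte[8192];
        try (InputStream rawIn = Files.newInputStream(input);
             ZipInputStream in = new ZipInputStream(rawIn);
             OutputStream rawOut = Files.newOutputStream(output);
             ZipOutputStream out = new ZipOutputStream(rawOut)) {
            ZipEntry entry;
            while ((entry = in.getNextEntry()) != null) {
                String name = entry.getName();
                // Skip directory entries and anything outside the prefixes we want to keep.
                if (entry.isDirectory() || keepPrefixes.stream().noneMatch(name::startsWith)) {
                    continue;
                }
                // Create a fresh entry so sizes and CRC are recomputed on write.
                out.putNextEntry(new ZipEntry(name));
                int n;
                while ((n = in.read(buffer)) != -1) { // read() returns -1 at the end of the current entry
                    out.write(buffer, 0, n);
                }
                out.closeEntry();
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: SlimModelsJar <input.jar> <output.jar> <prefix> [<prefix> ...]");
            return;
        }
        List<String> prefixes = Arrays.asList(args).subList(2, args.length);
        int kept = slim(Paths.get(args[0]), Paths.get(args[1]), prefixes);
        System.out.println("Kept " + kept + " entries in " + args[1]);
    }
}
```

Write to a new file rather than overwriting the original models jar, then put the slimmed jar on the classpath in its place and run your pipeline once to confirm every model it needs is still found.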