I've noticed that the whole library is quite large (~300 MB), but I'm only using the tokenize, ssplit, and pos annotators. How can I build a lighter version? Many thanks.
Best, Huang
If you only want part-of-speech tags, you can include just the part-of-speech tagger models, for example as downloaded from nlp.stanford.edu/software/tagger.shtml. You can also safely remove unwanted models from the models jar to make it smaller; a jar is an ordinary zip archive, so any zip tool can do this.
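For the second option, here is a rough sketch of stripping the models jar down with Python's standard `zipfile` module. The function name `slim_jar`, the jar file names, and the `edu/stanford/nlp/models/pos-tagger/` prefix are assumptions for illustration; list the entries of your own jar first to confirm where the tagger models actually live in your version.

```python
import zipfile

def slim_jar(src_jar, dst_jar, keep_prefixes):
    """Copy src_jar to dst_jar, keeping only META-INF/ and entries
    whose path starts with one of keep_prefixes.
    Returns the list of entry names that were kept."""
    prefixes = ("META-INF/",) + tuple(keep_prefixes)
    kept = []
    with zipfile.ZipFile(src_jar) as src, \
         zipfile.ZipFile(dst_jar, "w", zipfile.ZIP_DEFLATED) as dst:
        for info in src.infolist():
            if info.filename.startswith(prefixes):
                # Reuse the original ZipInfo so names and timestamps carry over.
                dst.writestr(info, src.read(info.filename))
                kept.append(info.filename)
    return kept

# Example call (file names and prefix are hypothetical, adjust to your setup):
# slim_jar("stanford-corenlp-models.jar",
#          "stanford-corenlp-models-slim.jar",
#          ["edu/stanford/nlp/models/pos-tagger/"])
```

After swapping in the smaller jar, it is worth running your tokenize, ssplit, pos pipeline once to check that every model it needs still loads. If something is missing, add its prefix to the keep list and rebuild.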