I read this FAQ but I don't understand it. I tried the following code:
Properties pp=new Properties();
pp.put("annotators", "tokenize, ssplit, pos, lemma, ner, parse");
pp.put("ner.useSUTime","false");
pp.put("useGazettes","true");
pp.put("gazette","C:\\gaz.txt");
StanfordCoreNLP s=new StanfordCoreNLP(pp);
This is the input string: "Dan became a member of the Music friends association in 2008"
The gazette file is:
CLASS Music friends association
But "Music friends association" is not recognized by NER.
What am I doing wrong?
The answer is given in that FAQ:
If a gazette is used, this does not guarantee that words in the gazette are always used as a member of the intended class, and it does not guarantee that words outside the gazette will not be chosen. It simply provides another feature for the CRF to train against. If the CRF has higher weights for other features, the gazette features may be overwhelmed.
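So there is no guarantee that your phrase will be tagged at all. A gazette entry is only one feature among many that the CRF weighs, not a lookup rule. If you need a phrase to be labeled deterministically, the alternatives are the regexner or tokensregex tools included in Stanford CoreNLP, which apply rule-based matching.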
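As a rough sketch of the regexner route, the code below prepares a one-line mapping file and the pipeline properties. Several things here are assumptions rather than anything from the question: the label ORGANIZATION (the question's gazette only says CLASS), the temporary file location, and the class and method names, which are hypothetical. The mapping format is one rule per line, with the token pattern and the label separated by a tab. This part uses only the JDK; the resulting Properties object would then be passed to new StanfordCoreNLP(props), as in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

public class RegexNerSetup {

    // Writes a one-rule RegexNER mapping file: <token pattern> TAB <label>.
    // The label ORGANIZATION is an assumption made for this example.
    static Path writeMapping(Path file) throws IOException {
        String rule = "Music friends association\tORGANIZATION\n";
        Files.write(file, rule.getBytes(StandardCharsets.UTF_8));
        return file;
    }

    // Same annotators as in the question, with regexner added after ner.
    static Properties buildProps(Path mapping) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, regexner");
        props.setProperty("ner.useSUTime", "false");
        props.setProperty("regexner.mapping", mapping.toString());
        return props;
    }

    public static void main(String[] args) throws IOException {
        Path mapping = writeMapping(Files.createTempFile("regexner", ".txt"));
        Properties props = buildProps(mapping);
        props.list(System.out);
        // Next step, as in the question: new StanfordCoreNLP(props)
    }
}
```

Two caveats: by default regexner matching is case-sensitive, and it may not overwrite tokens that the statistical NER has already given a non-O label, so the result is still worth checking on your own sentence.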