AbstractSequenceClassifier.classifyAndWriteAnswersKBest lets you pass a filename and an ObjectBank<List<IN>>, but it's unclear from ObjectBank's documentation how to create such an ObjectBank without involving a file.
I'm using CoreNLP 3.7.0 with Java 8.
You should just use this method instead:
Counter<List<IN>> classifyKBest(List<IN> doc, Class<? extends CoreAnnotation<String>> answerField, int k)
It returns a Counter that maps each of the k best sequences to its score.
If kBest holds that counter, this line of code turns it into a list of sequences sorted from best to worst score:
List<List<IN>> sorted = Counters.toSortedList(kBest);
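To make the ordering concrete, here is a small JDK-only sketch of the same idea. It does not call CoreNLP: a plain Map<List<String>, Double> stands in for the Counter<List<IN>>, and the class name KBestSortSketch, the helper toSortedList, the tag sequences, and the scores are all made up for illustration. It assumes the list should run from highest to lowest score, which is how I understand Counters.toSortedList to behave.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Counters.toSortedList, using only the JDK.
class KBestSortSketch {

    // Returns the keys of the score map, ordered from highest to lowest score.
    static <T> List<T> toSortedList(Map<T, Double> scores) {
        List<T> keys = new ArrayList<>(scores.keySet());
        keys.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return keys;
    }

    public static void main(String[] args) {
        // Made-up tag sequences and scores, standing in for a k-best result.
        Map<List<String>, Double> kBest = new HashMap<>();
        kBest.put(Arrays.asList("PERSON", "O", "O"), -1.2);
        kBest.put(Arrays.asList("O", "O", "O"), -3.5);
        kBest.put(Arrays.asList("ORGANIZATION", "O", "O"), -2.1);

        // Best-scoring sequence comes first.
        for (List<String> sequence : toSortedList(kBest)) {
            System.out.println(String.join(" ", sequence) + "\t" + kBest.get(sequence));
        }
    }
}
```

The map itself is left untouched, so you can still look up the score of any sequence after sorting, just as getCount does on the real Counter.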
I'm not sure exactly what you're trying to do, but the key step is to turn your String into a list of INs. IN is typically CoreLabel, though I don't know the full details of the AbstractSequenceClassifier you are working with.
If you want to run your sequence classifier on a sentence, you could first tokenize it with a pipeline and then pass the list of tokens to classifyKBest(...).
For instance, if you are trying to get the k-best named entity tags:
// imports used by this snippet
import java.util.List;
import java.util.Properties;
import java.util.stream.Collectors;
import edu.stanford.nlp.ling.CoreAnnotations;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;
import edu.stanford.nlp.stats.Counter;
import edu.stanford.nlp.stats.Counters;
// set up pipeline
Properties props = new Properties();
props.setProperty("annotators", "tokenize");
StanfordCoreNLP tokenizerPipeline = new StanfordCoreNLP(props);
// example sentence to classify
String exampleSentence = "...";
// wrap sentence in an Annotation object
Annotation annotation = new Annotation(exampleSentence);
// tokenize sentence
tokenizerPipeline.annotate(annotation);
// get the list of tokens
List<CoreLabel> tokens = annotation.get(CoreAnnotations.TokensAnnotation.class);
//...
// classifier should be an AbstractSequenceClassifier
// get the k best sequences from your abstract sequence classifier
Counter<List<CoreLabel>> kBestSequences = classifier.classifyKBest(tokens, CoreAnnotations.NamedEntityTagAnnotation.class, 10);
// sort the k-best examples
List<List<CoreLabel>> sortedKBest = Counters.toSortedList(kBestSequences);
// example: getting the second best list
List<CoreLabel> secondBest = sortedKBest.get(1);
// example: print out the tags for the second best list
System.out.println(secondBest.stream()
    .map(token -> token.get(CoreAnnotations.NamedEntityTagAnnotation.class))
    .collect(Collectors.joining(" ")));
// example: print out the score for the second best list
System.out.println(kBestSequences.getCount(secondBest));
If you have more questions please let me know and I can help out!