I am using this custom feature generator:
AdaptiveFeatureGenerator featureGenerator = new CachedFeatureGenerator(
    new AdaptiveFeatureGenerator[]{
        new WindowFeatureGenerator(new TokenFeatureGenerator(), 2, 2),
        new WindowFeatureGenerator(new TokenClassFeatureGenerator(true), 2, 2),
        new OutcomePriorFeatureGenerator(),
        new PreviousMapFeatureGenerator(),
        new BigramNameFeatureGenerator(),
        new SentenceFeatureGenerator(true, false),
        new DictionaryFeatureGenerator("person", dictionary)
    });
The only generator I added is the DictionaryFeatureGenerator, backed by a dictionary with a few entries:
Dictionary dictionary = new Dictionary();
dictionary.put(new StringList(new String[]{"giovanni"}));
dictionary.put(new StringList(new String[]{"maria"}));
dictionary.put(new StringList(new String[]{"luca"}));
I looked at the DictionaryFeatureGenerator.java source, but I did not find anything about how to extract the features this generator produces.
So the question is: after adding this generator to my model's list of generators, how can I extract the features to see which tokens match the entries in my dictionary?
Thank you!
A machine learning feature does not guarantee that the token will be marked as a named entity. It is like putting a flag on the token saying that the token occurs in the dictionary; the model still evaluates that flag together with all the other features before deciding.
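To make the "flag" idea concrete, here is a minimal sketch in plain Java. It is not OpenNLP code: the class `DictionaryFlagSketch`, the helper `featuresFor`, and the feature strings `w=...` and `person:w=dic` are all hypothetical names chosen for illustration, and the lookup is assumed to be case-insensitive. It only shows that a dictionary hit adds one more string to the token's feature list and decides nothing by itself.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; this is NOT the OpenNLP implementation.
class DictionaryFlagSketch {

    // Hypothetical helper: builds the feature strings for the token at `index`.
    static List<String> featuresFor(String[] tokens, int index,
                                    Set<String> dictionary, String prefix) {
        List<String> features = new ArrayList<>();
        String lower = tokens[index].toLowerCase(Locale.ROOT);

        // A plain token feature, present for every token.
        features.add("w=" + lower);

        // The dictionary "flag": added only when the token is a dictionary entry.
        // The feature name is an assumption made for this sketch.
        if (dictionary.contains(lower)) {
            features.add(prefix + ":w=dic");
        }
        return features;
    }

    public static void main(String[] args) {
        Set<String> dictionary = Set.of("giovanni", "maria", "luca");
        String[] tokens = {"Ieri", "Maria", "ha", "visto", "Luca"};

        // Print the features per token to see which tokens carry the flag.
        for (int i = 0; i < tokens.length; i++) {
            System.out.println(tokens[i] + " -> "
                    + featuresFor(tokens, i, dictionary, "person"));
        }
    }
}
```

Printing the feature list per token like this is the same kind of inspection the question asks about: a token that matches the dictionary is the one whose list contains the extra prefixed entry.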
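If you only want to know which tokens match your dictionary, you can skip the machine learning entirely and use a DictionaryNameFinder, which matches the tokens against the dictionary entries directly.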
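The dictionary-only approach can be sketched in plain Java as follows. This is again not OpenNLP code: `DictionaryOnlyLookupSketch`, `find`, and the `maxEntryLength` parameter are hypothetical, and the longest-match, case-insensitive behaviour is an assumption of this sketch rather than a statement about how DictionaryNameFinder works internally. Spans are returned as `{start, end}` pairs with an exclusive end.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Plain-Java illustration only; this is NOT the OpenNLP implementation.
class DictionaryOnlyLookupSketch {

    // Hypothetical helper: returns {start, end} token spans (end exclusive)
    // whose lower-cased tokens form a dictionary entry, preferring longer matches.
    static List<int[]> find(String[] tokens, Set<List<String>> entries,
                            int maxEntryLength) {
        List<int[]> spans = new ArrayList<>();
        int start = 0;
        while (start < tokens.length) {
            int matchedEnd = -1;
            int limit = Math.min(tokens.length, start + maxEntryLength);

            // Try the longest candidate first, then shrink it.
            for (int end = limit; end > start; end--) {
                List<String> candidate = new ArrayList<>();
                for (int i = start; i < end; i++) {
                    candidate.add(tokens[i].toLowerCase(Locale.ROOT));
                }
                if (entries.contains(candidate)) {
                    matchedEnd = end;
                    break;
                }
            }

            if (matchedEnd > 0) {
                spans.add(new int[]{start, matchedEnd});
                start = matchedEnd; // continue after the matched span
            } else {
                start++;
            }
        }
        return spans;
    }

    public static void main(String[] args) {
        Set<List<String>> entries = Set.of(
                List.of("giovanni"), List.of("maria"), List.of("luca"));
        String[] tokens = {"Ieri", "Maria", "ha", "visto", "Luca"};

        for (int[] span : find(tokens, entries, 1)) {
            System.out.println("match at [" + span[0] + ", " + span[1] + "): "
                    + tokens[span[0]]);
        }
    }
}
```

Unlike the feature-based setup, every dictionary hit here is reported as a match, with no model weighing it against other evidence.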