java stanford-nlp named-entity-recognition

Stanford NER: extracting separate lists of entities?


I can get a string annotated with Named Entities with the following code.

String NEString = classifier.classifyWithInlineXML(fileContents);

Is there a method I can call to get separate lists of the entities (PERSON, ORGANIZATION, LOCATION) found in the file, so that I don't have to parse the string returned by the method above?


Solution

  • In my opinion, the cleanest way to run the classification is:

    List<Triple<String, Integer, Integer>> out = classifier.classifyToCharacterOffsets(text);
    // For each triple in the returned list:
    // triple.first():  entity type
    // triple.second(): start position
    // triple.third():  end position
    

    It merges consecutive tokens of the same entity type into a single entity and returns the start and end character position of each entity.
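
To get from those offsets to the separate per-type lists the question asks for, something like the following might work. This is only a sketch: `EntityGrouper`, `Span` and `groupByType` are hypothetical names, and `Span` is a small stand-in for Stanford's `Triple<String, Integer, Integer>` so that the snippet runs without the library on the classpath. With the real classifier you would read the same three values from `triple.first()`, `triple.second()` and `triple.third()`. It also assumes the end position is exclusive, which is what `String.substring` expects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class EntityGrouper {

    /** Hypothetical stand-in for Triple<String, Integer, Integer>: (type, start, end). */
    static final class Span {
        final String type;
        final int start;
        final int end; // assumed to be exclusive

        Span(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Collects the text covered by each span into one list per entity type. */
    static Map<String, List<String>> groupByType(String text, List<Span> spans) {
        // LinkedHashMap keeps the types in the order they first appear in the text.
        Map<String, List<String>> byType = new LinkedHashMap<>();
        for (Span span : spans) {
            byType.computeIfAbsent(span.type, k -> new ArrayList<>())
                  .add(text.substring(span.start, span.end));
        }
        return byType;
    }
}
```

For the text "Barack Obama visited Microsoft in Seattle." and the hand-written (not classifier-produced) spans (PERSON, 0, 12), (ORGANIZATION, 21, 30) and (LOCATION, 34, 41), `groupByType` returns a map with PERSON → [Barack Obama], ORGANIZATION → [Microsoft] and LOCATION → [Seattle].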