java parsing artificial-intelligence nlp stanford-nlp

How can I split a text into sentences using the Stanford parser?


How can I split a text or paragraph into sentences using the Stanford parser?

Is there a method that extracts sentences, like the getSentencesFromString() that is provided for Ruby?


Solution

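  • Take a look at the DocumentPreprocessor class. It reads text from a Reader and lets you iterate over it one sentence at a time, with each sentence returned as a list of tokens. Below is a short snippet. There may be other ways to do what you want.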

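    import java.io.Reader;
    import java.io.StringReader;
    import java.util.ArrayList;
    import java.util.List;

    import edu.stanford.nlp.ling.HasWord;
    import edu.stanford.nlp.ling.SentenceUtils;
    import edu.stanford.nlp.process.DocumentPreprocessor;

    // Inside a method body:
    String paragraph = "My 1st sentence. “Does it work for questions?” My third sentence.";
    Reader reader = new StringReader(paragraph);
    // DocumentPreprocessor tokenizes the text and yields one token list per sentence.
    DocumentPreprocessor dp = new DocumentPreprocessor(reader);
    List<String> sentenceList = new ArrayList<>();
    
    for (List<HasWord> sentence : dp) {
       // Newer releases name this class SentenceUtils; older ones call it Sentence.
       // listToString joins the tokens with single spaces.
       String sentenceString = SentenceUtils.listToString(sentence);
       sentenceList.add(sentenceString);
    }
    
    for (String sentence : sentenceList) {
       System.out.println(sentence);
    }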
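
As one of the "other ways" mentioned above, here is a minimal sketch that uses only the JDK's java.text.BreakIterator instead of the Stanford classes. This is a different, rule-based technique, not Stanford's tokenizer, so it will likely handle abbreviations and unusual punctuation less well. The class name SentenceSplitSketch and the helper splitSentences are hypothetical names chosen for illustration.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class SentenceSplitSketch {

    // Hypothetical helper (not part of any Stanford API): splits text at the
    // sentence boundaries reported by the JDK's BreakIterator.
    static List<String> splitSentences(String text) {
        BreakIterator boundaries = BreakIterator.getSentenceInstance(Locale.US);
        boundaries.setText(text);

        List<String> sentences = new ArrayList<>();
        int start = boundaries.first();
        // Walk consecutive boundary pairs; each pair delimits one sentence.
        for (int end = boundaries.next(); end != BreakIterator.DONE;
                start = end, end = boundaries.next()) {
            // Trim the whitespace that trails each sentence.
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String paragraph = "My first sentence. Does it work for questions? My third sentence.";
        for (String sentence : splitSentences(paragraph)) {
            System.out.println(sentence);
        }
    }
}
```

Unlike the DocumentPreprocessor snippet, this returns the original substrings rather than re-joined tokens, so punctuation stays attached to the preceding word.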