For one of my projects I need to split paragraphs into sentences. I have already found that the following code breaks the paragraph(s) into separate sentences and then prints them:
BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
iterator.setText(content);
int start = iterator.first();
for (int end = iterator.next();
        end != BreakIterator.DONE;
        start = end, end = iterator.next()) {
    System.out.println(content.substring(start, end));
}
Here, 'content' is a String defined earlier that holds the paragraph text.
However, I would like to store the individual sentences as strings so that I can keep using them afterwards.
How would I do this? I think it may have something to do with a string array. Thanks for your help.
I've never used BreakIterator, but I assume you want it for locale purposes (FYI: here and here). Either way, you can keep the sentences in an array or a List, as you've mentioned.
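// Imports needed at the top of the file
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

BreakIterator iterator = BreakIterator.getSentenceInstance(Locale.US);
iterator.setText(content);
int start = iterator.first();
// Collect each sentence in a list instead of printing it
List<String> sentences = new ArrayList<String>();
for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
    //System.out.println(content.substring(start,end));
    sentences.add(content.substring(start, end));
}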
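If you want something you can drop into a project, the loop above could be wrapped in a small helper, roughly like this. This is only a sketch: the class name SentenceSplitter and the method split are made up for illustration, and trimming each sentence and skipping empty ones is my own assumption, not something BreakIterator does for you (its sentence substrings usually keep the trailing whitespace).

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; the name and method are illustrative only.
final class SentenceSplitter {
    private SentenceSplitter() {}

    // Splits content into sentences using the sentence rules of the given locale.
    static List<String> split(String content, Locale locale) {
        List<String> sentences = new ArrayList<>();
        BreakIterator iterator = BreakIterator.getSentenceInstance(locale);
        iterator.setText(content);
        int start = iterator.first();
        for (int end = iterator.next(); end != BreakIterator.DONE; start = end, end = iterator.next()) {
            // Assumption: surrounding whitespace is not wanted, so trim it.
            String sentence = content.substring(start, end).trim();
            // Assumption: skip pieces that are empty after trimming.
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }
}
```

You would call it as SentenceSplitter.split(content, Locale.US). If you really need a string array rather than a List, sentences.toArray(new String[0]) converts it, though a List is usually easier to work with because you don't need to know the number of sentences in advance.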