Tags: java, arrays, words, sentence

Storing each sentence from a document in an array in Java?


I want to split a document into sentences and store each sentence in its own array, where each array element is one word of that sentence. But I can't get any further than this:

int count =0,len=0;
String sentence[];
String words[][];
sentence = name.split("\\.");
count = sentence.length;

System.out.print("total sentence: " );
System.out.println(count);
int h;  
words = new String[count][]; 

for (h = 0; h < count; h++) {
     String tmp[] = sentence[h].split(" ");
     words[h] = tmp;
     len = len + words[h].length;
     System.out.println("total words: " );
     System.out.print(len); 

     temp = sentence[h].split(delimiter);  

     for(int i = 0; i < temp.length; i++) {
        System.out.print(len);
        System.out.println(temp[i]);
        len++;
     }  
}

Solution

  • I can't understand your code, but here's how to achieve your stated intention with just 3 lines:

    String document; // read from somewhere
    
    List<List<String>> words = new ArrayList<>();
    for (String sentence : document.split("[.?!]\\s*"))
        words.add(Arrays.asList(sentence.split("[ ,;:]+")));
    

    If you want to convert the Lists to arrays, use List.toArray(), but I wouldn't recommend it. Lists are far easier to deal with than arrays. For one, they expand automatically (one reason why the above code is so dense).

    Addendum: most characters don't need escaping inside a character class, which is why the dot and the question mark appear unescaped in [.?!] above.
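For anyone who does need plain arrays, the approach above can be wrapped up roughly as follows. This is only a sketch: the class name SentenceSplitter and its two methods are made up for illustration, and skipping empty pieces is an extra assumption that the original three lines do not make.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the name and methods are illustrative only.
class SentenceSplitter {

    // Splits the text into sentences on '.', '?' or '!' and each sentence into words.
    static List<List<String>> toWordLists(String document) {
        List<List<String>> words = new ArrayList<>();
        for (String sentence : document.split("[.?!]\\s*")) {
            String trimmed = sentence.trim();
            if (trimmed.isEmpty()) {
                continue; // assumption: ignore empty pieces, e.g. between "?" and "!" in "?!"
            }
            words.add(Arrays.asList(trimmed.split("[ ,;:]+")));
        }
        return words;
    }

    // Converts the nested lists into a String[][] with one row per sentence.
    static String[][] toWordArrays(String document) {
        List<List<String>> lists = toWordLists(document);
        String[][] result = new String[lists.size()][];
        for (int i = 0; i < lists.size(); i++) {
            result[i] = lists.get(i).toArray(new String[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] words = toWordArrays("Hello world. How are you? Fine, thanks!");
        for (String[] sentence : words) {
            System.out.println(sentence.length + " words: " + Arrays.toString(sentence));
        }
    }
}
```

Each row of the result can have a different length, which is why only the outer dimension is given when the String[][] is created.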