
Counting the number of times the articles "a", "an" are used in a text file


I'm trying to write a program that counts the number of words, lines, and sentences in a text file, and also the number of times the articles 'a', 'and', 'the' appear. So far I have the words, lines, and sentences, but I have no idea how to count the articles. How can a program tell the difference between 'a' and 'and'?

This is my code so far.

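    import java.io.File;
    import java.io.FileInputStream;
    import java.io.IOException;
    import java.util.Scanner;
    import java.util.StringTokenizer;

    // The class declaration was cut off in the post; TextStats is a placeholder name.
    public class TextStats {
        public static void main(String[] args) throws IOException {
            String path = "C:\\Users\\nlstudent\\Downloads\\text.txt";
            int ch, sentence = 0, words = 0, chars = 0, lines = 0;

            // First pass: count sentence terminators, reading byte by byte.
            try (FileInputStream file = new FileInputStream(path)) {
                while ((ch = file.read()) != -1) {
                    if (ch == '?' || ch == '!' || ch == '.') {
                        sentence++;
                    }
                }
            }

            // Second pass: count lines, characters, and words.
            try (Scanner sfile = new Scanner(new File(path))) {
                while (sfile.hasNextLine()) {
                    lines++;
                    String line = sfile.nextLine();
                    chars += line.length();
                    words += new StringTokenizer(line, " ,").countTokens();
                }
            }

            System.out.println("Number of words: " + words);
            System.out.println("Number of sentences: " + sentence);
            System.out.println("Number of lines: " + lines);
            System.out.println("Number of characters: " + chars);
        }
    }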

Solution

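  • The tokenizer splits each line into tokens. You can compare each token (a whole word) against the strings you expect. Because equals compares the entire token, "a" never matches "and". Note that the comparison is case-sensitive and that only spaces and commas act as delimiters, so "The" or "the." is not counted. Here is an example that counts a, and, for, and the.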

    int a = 0, and = 0, the = 0, forCount = 0;

    while (sfile.hasNextLine()) {
        lines++;
        String line = sfile.nextLine();
        chars += line.length();
        StringTokenizer tokenizer = new StringTokenizer(line, " ,");
        // countTokens() does not consume tokens, so the loop below still sees all of them.
        words += tokenizer.countTokens();

        while (tokenizer.hasMoreTokens()) {
            // nextToken() returns a String directly, so no cast is needed.
            String element = tokenizer.nextToken();

            // Whole-token comparison: "a" and "and" are counted separately.
            if ("a".equals(element)) {
                a++;
            } else if ("and".equals(element)) {
                and++;
            } else if ("for".equals(element)) {
                forCount++;
            } else if ("the".equals(element)) {
                the++;
            }
        }
    }
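The whole-token comparison above can also be made tolerant of capitalization and attached punctuation. What follows is only a minimal sketch of one way to do that, not part of the original answer: the class name ArticleCounter and the method count are hypothetical, and it assumes the text contains only ASCII letters (a contraction such as "don't" would be split into "don" and "t").

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper class for illustration; not from the original answer.
class ArticleCounter {

    // Counts whole-word occurrences of each target word, ignoring case and
    // any punctuation attached to a word. Targets must be given in lowercase.
    static Map<String, Integer> count(String text, String... targets) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String target : targets) {
            counts.put(target, 0);
        }
        // Split on every run of non-letter characters, so "The" and "the."
        // both become the token "the". Assumes ASCII letters only.
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (counts.containsKey(token)) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        String sample = "A cat and the dog saw a bird. The end, and an owl.";
        // prints {a=2, an=1, and=2, the=2}
        System.out.println(count(sample, "a", "an", "and", "the"));
    }
}
```

To use this with the file-reading loop, each line returned by sfile.nextLine() could be passed to count and the per-line results summed. A map keyed by word also avoids adding a new variable and else-if branch for every word you want to track.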