
Removing whitespace in a text file


I had to write some simple code that counts the words in a text file. Then someone told me that it's incomplete: when there are two or more whitespace characters in a row, the code counts them as words and the result is incorrect. So I tried to fix it by making a list and removing all " " elements from it, but it doesn't seem to work. Can you suggest what can be done?

Here's the code as it is now:

    int count = 0;
    File file = new File("C:\\Users\\user\\Desktop\\Test.txt");
    FileInputStream fis = new FileInputStream(file);
    byte[] bytesArray = new byte[(int) file.length()];
    fis.read(bytesArray);
    String s = new String(bytesArray);
    String[] data = s.split(" ");
    List<String> list = new ArrayList<>(Arrays.asList(data));
    list.remove(" ");
    data = list.toArray(new String[0]);
    for (int i = 0; i < data.length; i++) {
        count++;
    }
    System.out.println("Number of words in the file are " + count);

Solution

  • Be a nerd. You can do it in just one line, using classes from the java.nio.file package :)

    int count = new String(Files.readAllBytes(Paths.get("/tmp/test.txt")), "UTF-8")
               .trim().split("\\s+").length;
    

    to count how many words are in the file. Or

    String result = new String(Files.readAllBytes(Paths.get("/tmp/test.txt")), "UTF-8")
               .trim().replaceAll("\\s+", " ");
    

    to get a single string in which every run of whitespace is collapsed to one space.
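As for why the original attempt doesn't work: splitting on a single " " turns consecutive spaces into empty strings (""), not " " strings, so list.remove(" ") finds nothing to remove (and it would only remove the first match anyway). Below is a minimal sketch of the split("\\s+") approach as a complete program. The class name WordCount and the countWords helpers are made up for illustration, and UTF-8 is assumed as the file encoding. It also guards against one edge case the one-liner misses: for an empty or all-whitespace file, "".split("\\s+") returns an array with one empty string, so the one-liner would report 1 instead of 0.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper class, not part of the original question or answer.
class WordCount {

    // Counts whitespace-separated words; empty or all-whitespace text has zero words.
    static int countWords(String text) {
        String trimmed = text.trim();
        if (trimmed.isEmpty()) {
            // "".split("\\s+") would return [""], which has length 1.
            return 0;
        }
        // "\\s+" matches any run of spaces, tabs, or line breaks as one separator.
        return trimmed.split("\\s+").length;
    }

    // Reads the whole file into memory (assumes UTF-8) and counts its words.
    static int countWords(Path file) throws IOException {
        byte[] bytes = Files.readAllBytes(file);
        return countWords(new String(bytes, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) throws IOException {
        if (args.length > 0) {
            // The file path is taken from the command line.
            int count = countWords(Paths.get(args[0]));
            System.out.println("Number of words in the file: " + count);
        } else {
            // Without a path, run on a small in-memory sample instead.
            System.out.println(countWords("two  spaces,\ta tab\nand a newline"));
        }
    }
}
```

Using StandardCharsets.UTF_8 instead of the string "UTF-8" avoids the checked UnsupportedEncodingException. Note that readAllBytes loads the entire file into memory, which is fine for small files but may not suit very large ones.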