
Read a file char by char and find the input string in it


I have a file that I need to read character by character to find an input string. I need to return how many times the input string appears in the file, but I may only read the file one character at a time.

I came up with the code below, but I am having trouble figuring out how to find the string while reading char by char. I iterate over the input with a for loop and read the file in a while loop inside it, but if a character doesn't match I need to start the for loop again from the beginning, and I can't figure out how to do that.

  public static void main(String[] args) throws IOException {
    String input = "hello world"; // "hello";
    handleFile(new File("some_file"), input);
  }

  private static int handleFile(File file, String input) throws IOException {
    int count = 0;
    try (BufferedReader br =
        new BufferedReader(new InputStreamReader(new FileInputStream(file),
            Charset.forName("UTF-8")))) {
      char[] arr = input.toCharArray();
      int r;

      // confused here: what logic should go in this loop?
      for (char a : arr) {
        while ((r = br.read()) != -1) {
          char ch = (char) r;
          if (ch == a) {
            break;
          }
        }
      }
    }

    return count;
  }

Solution

  • Conceptually, you need to maintain an offset that counts how many characters of the input have been matched so far. Each time a mismatch occurs, you reset the offset back to 0. The offset determines which character of the input is compared against the next character read from the file.

    A simple implementation might look something like...

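    String value = "Thistestistestatesttest";
    String input = "test";
    
    int offset = 0;   // number of input characters matched so far
    int matches = 0;  // number of complete occurrences found
    for (char next : value.toCharArray()) {
        if (next == input.charAt(offset)) {
            offset++;
            if (offset == input.length()) {
                // The whole input has been matched; count it and start over
                matches++;
                offset = 0;
            }
        } else {
            // Mismatch: discard the partial match. Note that the mismatching
            // character is not re-checked against the start of the input, so
            // an occurrence such as the "test" in "ttest" is missed.
            offset = 0;
        }
    }
    System.out.println("Found " + matches);  // prints "Found 4"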
    

    Note: I've deliberately used a String as the source so you can test it and better understand how it works, and then take the logic and implement it in your own solution.
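
As a rough sketch of how that logic might carry over to the file case, the loop can be driven by `Reader.read()` instead of a `String`. The class name `CharByCharCounter` and the helpers `countOccurrences` and `countInString` are hypothetical names chosen for illustration, not part of the original answer. The sketch keeps the same reset-to-zero behaviour, so it shares its limitation: a mismatching character is not re-checked against the start of the input, and an input like "test" is not found in "ttest".

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class CharByCharCounter {

    // Hypothetical helper: counts non-overlapping occurrences of input,
    // consuming the reader one character at a time.
    static int countOccurrences(Reader reader, String input) throws IOException {
        if (input.isEmpty()) {
            return 0; // nothing to search for
        }
        int offset = 0;  // number of input characters matched so far
        int matches = 0; // number of complete occurrences found
        int r;
        while ((r = reader.read()) != -1) {
            char next = (char) r;
            if (next == input.charAt(offset)) {
                offset++;
                if (offset == input.length()) {
                    matches++;
                    offset = 0;
                }
            } else {
                offset = 0; // same simple reset as the String version
            }
        }
        return matches;
    }

    // Hypothetical convenience wrapper so the logic can be tried without a file.
    static int countInString(String text, String input) {
        try (Reader reader = new StringReader(text)) {
            return countOccurrences(reader, input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Found 4"
        System.out.println("Found " + countInString("Thistestistestatesttest", "test"));
    }
}
```

Since `BufferedReader` is a `Reader`, the `br` from the question's try-with-resources block could be passed to `countOccurrences` in place of the `StringReader` used here.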

    Now, if you take the time to desk-check the problem, it might look something like this:

    +======+========+==============+=======+
    | Next | offset | offset value | match |
    +======+========+==============+=======+
    | T    |      0 | t            | false |
    +------+--------+--------------+-------+
    | h    |      0 | t            | false |
    +------+--------+--------------+-------+
    | i    |      0 | t            | false |
    +------+--------+--------------+-------+
    | s    |      0 | t            | false |
    +------+--------+--------------+-------+
    | t    |      0 | t            | true  |
    +------+--------+--------------+-------+
    | e    |      1 | e            | true  |
    +------+--------+--------------+-------+
    | s    |      2 | s            | true  |
    +------+--------+--------------+-------+
    | t    |      3 | t            | true  |
    +------+--------+--------------+-------+
    | i    |      0 | t            | false |
    +------+--------+--------------+-------+
    | s    |      0 | t            | false |
    +------+--------+--------------+-------+
    | t    |      0 | t            | true  |
    +------+--------+--------------+-------+
    | e    |      1 | e            | true  |
    +------+--------+--------------+-------+
    | s    |      2 | s            | true  |
    +------+--------+--------------+-------+
    | t    |      3 | t            | true  |
    +------+--------+--------------+-------+
    | a    |      0 | t            | false |
    +------+--------+--------------+-------+
    | t    |      0 | t            | true  |
    +------+--------+--------------+-------+
    | e    |      1 | e            | true  |
    +------+--------+--------------+-------+
    | s    |      2 | s            | true  |
    +------+--------+--------------+-------+
    | t    |      3 | t            | true  |
    +------+--------+--------------+-------+
    | t    |      0 | t            | true  |
    +------+--------+--------------+-------+
    | e    |      1 | e            | true  |
    +------+--------+--------------+-------+
    | s    |      2 | s            | true  |
    +------+--------+--------------+-------+
    | t    |      3 | t            | true  |
    +------+--------+--------------+-------+
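
The desk-check also hints at a case the simple reset does not cover: when a mismatch happens partway through a match, the characters already consumed may themselves be the start of a new match (for example "test" inside "ttest", or "aab" inside "aaab"). One possible way to handle this, sketched below, is to swap in a Knuth-Morris-Pratt style failure table so the offset falls back to the longest matching prefix instead of jumping straight to 0. This is a different technique from the one in the answer above; the names `FallbackCounter`, `buildFailureTable`, `countOccurrences` and `countInString` are assumptions made for illustration, and matches are still counted without overlap, as in the simple version.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

public class FallbackCounter {

    // failure[i] is the length of the longest proper prefix of
    // input[0..i] that is also a suffix of input[0..i].
    static int[] buildFailureTable(String input) {
        int[] failure = new int[input.length()];
        int k = 0;
        for (int i = 1; i < input.length(); i++) {
            while (k > 0 && input.charAt(i) != input.charAt(k)) {
                k = failure[k - 1];
            }
            if (input.charAt(i) == input.charAt(k)) {
                k++;
            }
            failure[i] = k;
        }
        return failure;
    }

    // Counts non-overlapping occurrences, reading one character at a time.
    static int countOccurrences(Reader reader, String input) throws IOException {
        if (input.isEmpty()) {
            return 0; // nothing to search for
        }
        int[] failure = buildFailureTable(input);
        int offset = 0;
        int matches = 0;
        int r;
        while ((r = reader.read()) != -1) {
            char next = (char) r;
            // On a mismatch, fall back to the longest prefix that still matches
            while (offset > 0 && next != input.charAt(offset)) {
                offset = failure[offset - 1];
            }
            if (next == input.charAt(offset)) {
                offset++;
            }
            if (offset == input.length()) {
                matches++;
                offset = 0; // assumption: count non-overlapping matches only
            }
        }
        return matches;
    }

    // Hypothetical convenience wrapper so the logic can be tried without a file.
    static int countInString(String text, String input) {
        try (Reader reader = new StringReader(text)) {
            return countOccurrences(reader, input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // prints "Found 1"
        System.out.println("Found " + countInString("ttest", "test"));
    }
}
```

For the sample value "Thistestistestatesttest" this gives the same count of 4 as the simple version; the two differ only on inputs where a partial match is followed by the start of a real one.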