Testing for Non ASCII character not working Java


I have a text file that contains a non-ASCII character. I am trying to detect the line that character is on, reading the file with BufferedReader:

public static void main(String[] args) throws FileNotFoundException, IOException {
        FileInputStream fs = new FileInputStream("C:\\Users\\Stanley\\Documents\\file.txt");
        BufferedReader br = new BufferedReader(new InputStreamReader(fs));
        String line;
        int count = 1;
        while ((line = br.readLine()) != null) {
            if (isAscii(line)) {
                System.out.println(line + " Number " + count);
            }
            count ++;
        }

    }

    public static boolean isAscii(String v) {
        byte bytearray[] = v.getBytes();
        CharsetDecoder d = Charset.forName("US-ASCII").newDecoder();
        try {
            CharBuffer r = d.decode(ByteBuffer.wrap(bytearray));
            r.toString();
        } catch (CharacterCodingException e) {
            return false;
        }
        return true;
    }

I have also tried this checker but the result is the same:

 private static boolean isAsciii(String input) {
        boolean isASCII = true;
        for (int i = 0; i < input.length(); i++) {
            int c = input.charAt(i);
            if (c > 0x7F) {
                isASCII = false;
                break;
            }
        }
        return isASCII;
    }

My output is:

[image: program output]

My text file looks like this:

[image: contents of the text file]

How am I supposed to check this?


Solution

  • If you already have a String, iterate over its characters and check that each one is in the range of printable ASCII characters, from space (0x20) to tilde (0x7E). Note that this range also rejects control characters such as tab (0x09).

      public static boolean isAscii(String v) {
          if (v != null && !v.isEmpty()) {
              for (char c : v.toCharArray()) {
                  // Reject anything outside printable ASCII: space (0x20) to tilde (0x7E).
                  if (c < 0x20 || c > 0x7E) return false;
              }
          }
          return true;
      }
    

    You may also want to review the static methods of the Character class, e.g. isLetter(), isISOControl(), etc. See the Character class documentation.
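
Two details in the question's code are worth checking besides the character test itself. The loop prints a line when isAscii(line) is true, so it reports the ASCII lines rather than the non-ASCII ones. The file is also decoded with the platform default charset, which may not match the file. The following is a minimal sketch of one way to report the offending line numbers, not a definitive fix. The class name NonAsciiLineFinder, the helper nonAsciiLineNumbers, and the default path file.txt are made up for illustration. It assumes the file uses an ASCII-compatible encoding such as UTF-8 or windows-1252 (not UTF-16), and it decodes as ISO-8859-1 only so that every byte of 0x80 or above becomes a char above 0x7F.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class NonAsciiLineFinder {

    // True if every char is in the 7-bit ASCII range (0x00 to 0x7F).
    // Unlike a printable-only check, this accepts tabs and other control characters.
    static boolean isAscii(String s) {
        for (int i = 0; i < s.length(); i++) {
            if (s.charAt(i) > 0x7F) {
                return false;
            }
        }
        return true;
    }

    // Returns the 1-based numbers of the lines that contain at least one non-ASCII char.
    static List<Integer> nonAsciiLineNumbers(List<String> lines) {
        List<Integer> result = new ArrayList<>();
        int lineNumber = 1;
        for (String line : lines) {
            if (!isAscii(line)) { // negated: we want the lines that FAIL the ASCII test
                result.add(lineNumber);
            }
            lineNumber++;
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default path; pass the real file as the first argument.
        Path path = Paths.get(args.length > 0 ? args[0] : "file.txt");
        // Assumption: ASCII-compatible encoding. ISO-8859-1 maps each byte to one char,
        // so decoding never fails and any non-ASCII byte shows up as a char > 0x7F.
        List<String> lines = Files.readAllLines(path, StandardCharsets.ISO_8859_1);
        System.out.println("Lines with non-ASCII characters: " + nonAsciiLineNumbers(lines));
    }
}
```

Files.readAllLines loads the whole file into memory. For large files, the same negated check can be applied inside the BufferedReader loop from the question, with the charset passed explicitly to InputStreamReader.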