javascript regex unicode unicode-string

How to detect if string contains Amharic language in javascript?


I need to check whether a string contains Amharic text. The string can contain English characters as well:

const amharic = "የሙከራ test ሕብረቁምፊ";
amharic.match(pattern)

Solution

  • Using UTF-16 range and the charCodeAt() method:

    The UTF-16 code units for the Amharic (Ethiopic) letters run from 4608 to 5017 and from 11648 to 11743, so you can use the charCodeAt() method to check whether any character of the string falls within those two ranges.


    The following snippet is a practical example of what I described above:

    var string = "የሙከራ test ሕብረቁምፊ";
    
    function checkAmharic(x) {
        // Spreading the string yields one element per character, so each
        // element is read with charCodeAt(0), not with the array index.
        return [...x].some((ch) => {
            const code = ch.charCodeAt(0);
            return (code >= 4608 && code <= 5017) || (code >= 11648 && code <= 11743);
        });
    }
    
    console.log(checkAmharic(string)); // will return true
    console.log(checkAmharic("Hello All!!")); // will return false
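
To relate the decimal bounds used above to the hexadecimal notation of Unicode charts (and of regex escapes such as \u1200), the conversion can be sketched as follows. The names ethiopicRanges, toCodePointLabel, and labels are hypothetical helpers introduced only for this illustration, and the block names in the comments are added context, not part of the original answer.

```javascript
// Decimal bounds from the answer above, as inclusive [start, end] pairs.
const ethiopicRanges = [
  [4608, 5017],   // Ethiopic block plus the assigned part of Ethiopic Supplement
  [11648, 11743], // Ethiopic Extended block
];

// Format a number the way Unicode charts write code points, e.g. 4608 -> "U+1200".
function toCodePointLabel(n) {
  return "U+" + n.toString(16).toUpperCase().padStart(4, "0");
}

const labels = ethiopicRanges.map(
  ([start, end]) => `${toCodePointLabel(start)}-${toCodePointLabel(end)}`
);

console.log(labels); // [ 'U+1200-U+1399', 'U+2D80-U+2DDF' ]
```

Seen this way, the first decimal range extends a little past U+137F, which is where the regex range in the second approach stops.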


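  • Using a Unicode range and regex:

    The main Unicode block for the Amharic (Ethiopic) letters runs from U+1200 to U+137F, so you can use a regex to check whether any character of the string falls within that range. Unlike the two ranges used with charCodeAt(), this single range leaves out the Ethiopic Supplement and Ethiopic Extended characters.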


    The following snippet is a practical example of what I described above:

    var string = "የሙከራ test ሕብረቁምፊ";
    
    function checkAmharic(x) {
        return /[\u1200-\u137F]/.test(x); // will return true if an amharic letter is present
    }
    
    console.log(checkAmharic(string)); // will return true
    console.log(checkAmharic("A")); // will return false
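
As a possible variation that is not part of the original answer, the regex can be widened to the same two ranges as the charCodeAt() approach, or replaced by a Unicode script property escape, which needs the u flag and an ES2018-capable engine. The names ethiopicRangeRegex, ethiopicScriptRegex, and containsEthiopic are hypothetical. Both patterns detect the Ethiopic script, which Amharic shares with other languages such as Tigrinya, so this is a minimal sketch of script detection, not language identification.

```javascript
// Same two ranges as the charCodeAt() approach, written as one character class.
const ethiopicRangeRegex = /[\u1200-\u1399\u2D80-\u2DDF]/;

// Alternative: match any character whose Unicode script is Ethiopic.
// Property escapes such as \p{...} only work with the "u" flag.
const ethiopicScriptRegex = /\p{Script=Ethiopic}/u;

// Returns true if at least one Ethiopic-script character is present.
function containsEthiopic(text) {
  return ethiopicScriptRegex.test(text);
}

console.log(containsEthiopic("የሙከራ test ሕብረቁምፊ")); // true
console.log(containsEthiopic("Hello All!!"));       // false
console.log(ethiopicRangeRegex.test("test የ"));     // true
```

The property escape avoids hard-coding block boundaries, at the cost of requiring a newer JavaScript engine.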