
How to Detect Non "GSM 7 bit alphabet" characters in input field


I am trying to detect whether a text input field contains any character that doesn't belong to the GSM 7-bit alphabet. The table with the characters is here: http://www.dreamfabric.com/sms/default_alphabet.html

After a lot of searching I found this question (What regular expression do I need to check for some non-latin characters?), which is pretty close to what I want to accomplish because it detects non-Latin characters. How can I alter the regular expression so that it accepts exactly the characters of the GSM 7-bit alphabet?

<!DOCTYPE HTML>
<html lang="en-US">
<head>
    <meta charset="UTF-8">
    <title>test foreign chars</title>
</head>
<body>

    <input id="foreign_characters" size="12" type="text" name="foreign_characters" value="test">

<script src="https://ajax.googleapis.com/ajax/libs/jquery/1.8.2/jquery.min.js"></script>
<script type="text/javascript">

(function(){

    $('#foreign_characters').on("keyup", function(){

        var foreignCharacters = $("#foreign_characters").val();
        // Matches any character outside the ASCII range U+0000-U+007F
        var rforeign = /[^\u0000-\u007f]/;

        if (rforeign.test(foreignCharacters)) {
          alert("The input contains non-Latin characters");
        } else {
          alert("The input contains only Latin characters");
        }

    });

})();

    </script>
</body>
</html>

Solution

  • You can put all valid characters in a string and then look up each input character in that string with indexOf.

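    // GSM 7-bit default alphabet plus its extension table (^{}\[~]|€ and form feed).
    // Space, line feed and carriage return are part of the alphabet, so they are listed too.
    var gsm = "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\f^{}\\[~]|€ÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà";
    var letter = 'a';
    // true when the character belongs to the GSM alphabet
    var letterInAlfabet = gsm.indexOf(letter) !== -1;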
    

    Make sure you get your encodings right if you use this, i.e. save your JavaScript file as UTF-8 and declare to the browser that it is UTF-8.
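
    Since the question asks for a regular expression, the same character list can also be turned into a negated character class. The following is only a sketch: the names GSM_CHARS, escapeForCharClass, NON_GSM, findNonGsmChars and isGsmOnly are made up for illustration, and the character list is assumed to match the default alphabet table linked in the question plus its extension table.

```javascript
// Characters of the GSM 7-bit default alphabet, followed by the extension
// table. Assumption: this list mirrors the table linked in the question.
var GSM_CHARS =
  "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ !\"#¤%&'()*+,-./0123456789:;<=>?" +
  "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§¿abcdefghijklmnopqrstuvwxyzäöñüà" +
  "\f^{}\\[~]|€";

// Escape the characters that have a special meaning inside a character class.
function escapeForCharClass(chars) {
  return chars.replace(/[\\\]\[^-]/g, "\\$&");
}

// Matches any single character that is NOT part of the GSM alphabet.
// The "u" flag makes characters outside the BMP match as one unit.
var NON_GSM = new RegExp("[^" + escapeForCharClass(GSM_CHARS) + "]", "gu");

// Returns the distinct non-GSM characters in text, in order of appearance.
function findNonGsmChars(text) {
  var found = text.match(NON_GSM) || [];
  return found.filter(function (ch, i) {
    return found.indexOf(ch) === i;
  });
}

// True when every character of text belongs to the GSM alphabet.
function isGsmOnly(text) {
  return findNonGsmChars(text).length === 0;
}
```

    In the keyup handler from the question, the rforeign.test(...) check could then be replaced by !isGsmOnly($("#foreign_characters").val()), and findNonGsmChars can be used to tell the user which characters are the problem.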