
Allow only English letters and numbers in PHP


I'm trying to create a filter that allows users to use only English letters (lowercase and uppercase) and numbers. How can I do that? (ANSI) I'm not trying to sanitize the input, only to tell whether a string contains non-English letters. The filter should give me a clean database with only English usernames, without multibyte or UTF-8 characters.

Also, can anyone explain why echo strlen(À) outputs '2'? Does that mean two bytes? Weren't UTF-8 characters supposed to take a single byte?

Thanks


Solution

  • This is how you check whether a string contains only English letters and digits.

    if (!preg_match('/[^A-Za-z0-9]/', $string))  {
        // string contains only English letters and digits (an empty string also passes)
    }
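
The check above can be wrapped in a small function for validating usernames. This is only a sketch: the name isAsciiAlphanumeric is made up for illustration, and it assumes an empty string should be rejected, which the inline check above does not do.

```php
<?php
// Hypothetical helper; the name and the "reject empty strings" rule are assumptions.
function isAsciiAlphanumeric(string $value): bool
{
    // \A and \z anchor to the absolute start and end of the string,
    // and + requires at least one character, so '' is rejected.
    return preg_match('/\A[A-Za-z0-9]+\z/', $value) === 1;
}

var_dump(isAsciiAlphanumeric('User123')); // bool(true)
var_dump(isAsciiAlphanumeric('Àlex'));    // bool(false)
var_dump(isAsciiAlphanumeric(''));        // bool(false)
```

Because the pattern has no u modifier, it is matched byte by byte, so every byte of a multibyte UTF-8 character falls outside A-Za-z0-9 and the string is rejected.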
    

    The other question:

    strlen(À)
    

    will not return 2. Maybe you meant

    strlen('À')
    

    strlen returns

    The length of the string on success, and 0 if the string is empty.

    (taken from the PHP manual). That length is counted in bytes, not characters. In UTF-8 only ASCII characters take a single byte; À is encoded as two bytes (0xC3 0x80), so strlen reports 2.
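
To see the difference between bytes and characters, here is a minimal sketch. It assumes the mbstring extension is available for mb_strlen, and it writes À as explicit UTF-8 bytes so the result does not depend on how the file is saved.

```php
<?php
// "\xC3\x80" is the UTF-8 encoding of À (U+00C0).
$aGrave = "\xC3\x80";

echo strlen($aGrave), "\n";             // 2: strlen counts bytes
echo mb_strlen($aGrave, 'UTF-8'), "\n"; // 1: mb_strlen counts characters (needs mbstring)
echo strlen('A'), "\n";                 // 1: ASCII characters take one byte in UTF-8
```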