I'm trying to create a filter that allows users to use only English letters (lowercase and uppercase) and numbers. How can I do that? (ANSI) (I'm not trying to sanitize, only to tell whether a string contains non-English letters.) The filter should give me a clean database with only English usernames, without multibyte and UTF-8 characters.
And can anyone explain why echo strlen(À) outputs '2'? That means two bytes, right? Weren't UTF-8 characters supposed to take a single byte?
Thanks
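This is how you check whether a string contains only digits and letters from the English alphabet. The pattern matches any character outside A-Z, a-z, and 0-9, so the check passes when nothing matches.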
if (!preg_match('/[^A-Za-z0-9]/', $string)) {
    // string contains only English letters and digits
    // (note: an empty string also passes this check)
}
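For usernames you will probably also want to reject the empty string, which the check above lets through. A minimal sketch of that variant follows; the function name isEnglishAlnum is hypothetical, chosen only for illustration. It anchors the pattern with \A and \z and requires at least one character.

```php
<?php
// Hypothetical helper: the name isEnglishAlnum is an assumption for this example.
function isEnglishAlnum(string $s): bool
{
    // \A and \z anchor at the very start and end of the string;
    // + requires at least one character, so '' is rejected.
    return preg_match('/\A[A-Za-z0-9]+\z/', $s) === 1;
}

var_dump(isEnglishAlnum('User123')); // bool(true)
var_dump(isEnglishAlnum('Àlice'));   // bool(false)
var_dump(isEnglishAlnum(''));        // bool(false)
var_dump(isEnglishAlnum("abc\n"));   // bool(false)
```

Using \z rather than $ matters here: $ would also match just before a trailing newline, so "abc\n" would slip through.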
The other question:
strlen(À)
will not return 2. Maybe you meant
strlen('À')
strlen returns:
The length of the string on success, and 0 if the string is empty.
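(taken from the PHP manual). Note that strlen counts bytes, not characters. UTF-8 is a variable-length encoding: only ASCII characters take a single byte, and À is encoded as two bytes (0xC3 0x80). So if your file is saved as UTF-8, strlen('À') returns 2.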
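To see the byte counts for yourself, here is a small sketch. It writes À as explicit byte escapes so the result does not depend on how the script file is saved; the variable names are just for illustration.

```php
<?php
// "\xC3\x80" is the UTF-8 encoding of À (U+00C0), given as explicit bytes.
$utf8 = "\xC3\x80";
// "\xC0" is the same letter in ISO-8859-1 (Latin-1), a single-byte encoding.
$latin1 = "\xC0";

echo strlen($utf8), "\n";   // 2
echo bin2hex($utf8), "\n";  // c380
echo strlen($latin1), "\n"; // 1

// Counting characters instead of bytes needs an encoding-aware function.
// mb_strlen() comes from the mbstring extension, which may not be installed.
if (function_exists('mb_strlen')) {
    echo mb_strlen($utf8, 'UTF-8'), "\n"; // 1
}
```

The same letter is one byte in Latin-1 and two bytes in UTF-8, which is why the result depends on the encoding of your source file or input.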