Characters not getting matched by [A-Za-z]

I am trying to match all Latin characters in UTF-16 encoded text. I have been using [A-Za-z], which has been working great. But while parsing Chinese and Japanese text, I've been coming across bizarre versions of A-Z that the regex isn't picking up.

https://gist.github.com/kyleect/1c66fd388d362653969d

On the left are the characters I can't identify; on the right are the ones typed from my keyboard. I copied and pasted them into Chrome's page find input, Google search, and the find input in my text editor. All agree: Left == Right but Right != Left

What are these characters and how do I target them in a regex?

Solution

You can take a look at their character codes in your browser’s console:

> 'Ｂ'.charCodeAt(0).toString(16)
ff22

It’s a fullwidth letter! You can probably match the whole uppercase set with [\uff21-\uff3a] in a decent regex engine, or with [Ａ-Ｚ] in an even more decent one. The fullwidth lowercase letters sit at \uff41-\uff5a.
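As a minimal sketch of how that could look in JavaScript: the character class below combines the ASCII ranges with the fullwidth ranges, and the sample string is made up for illustration. The last step uses String.prototype.normalize('NFKC'), which is not part of the answer above but is another option if you would rather fold fullwidth letters to plain ASCII before matching.

```javascript
// ASCII letters plus their fullwidth counterparts:
// U+FF21-U+FF3A (fullwidth A-Z) and U+FF41-U+FF5A (fullwidth a-z).
const latinLetters = /[A-Za-z\uff21-\uff3a\uff41-\uff5a]/g;

// Hypothetical sample text mixing ASCII and fullwidth letters.
const sample = 'AＢc 日本語 ｄ';

// Collect every Latin letter, regardless of width.
const matches = sample.match(latinLetters);
console.log(matches); // [ 'A', 'Ｂ', 'c', 'ｄ' ]

// Alternative: NFKC normalization maps fullwidth forms to ASCII,
// after which the original [A-Za-z] class is enough.
const normalized = sample.normalize('NFKC');
console.log(normalized.match(/[A-Za-z]/g)); // [ 'A', 'B', 'c', 'd' ]
```

Note that NFKC also changes other compatibility characters (for example, halfwidth katakana), so it may not be appropriate if the rest of the text has to stay byte-for-byte intact.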
