javascript regex validation

Validating a user's UTF-8 name in JavaScript


I am using the following JavaScript regexp to validate users' first and last names:

var regexp = /^((?=[a-z \']).)+$/i;

var val1 = "Normal Text' Compromised";       // true
var val2 = "UTF Text' Połącz Słońce w Mózu"; // false  <---- UTF-8
var val3 = "Illegal char: Blac & White";     // false

Example at: http://jsfiddle.net/PR4T2/1/


Question:

Is there any way to make the regex "UTF-8 insensitive" so that users can enter UTF characters?

I know that UTF is not yet supported in JS validation, but I was wondering if there is a workaround. I also do not want to exclude all illegal characters manually, like:

var regexp = /^((?![0-9\~\!\@\#\$\%\^\&\*\(\)\_\+\=\-\[\]\{\}\;\:\"\\\/\<\>\?]).)+$/;

Edit:

The allowed characters are: a-z, space, \, ' and any other UTF character that can be found in a user's first/last name, like here.

I'm looking for something more general, like the \p{xx} sequences in PHP.


Solution

  • The XRegExp library's Unicode plugin adds Unicode character class support (such as "\p{L}") to JavaScript regular expressions.
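
To illustrate the \p{L} idea on the three test strings from the question, here is a minimal sketch. It does not use XRegExp: it swaps in the native Unicode property escapes that JavaScript engines gained with ES2018 (the `u` flag), so it assumes a runtime with that support. The names `NAME_PATTERN` and `isValidName` are hypothetical, and allowing combining marks via \p{M} is an extra assumption so that decomposed accents are also accepted.

```javascript
// Sketch only: native ES2018 Unicode property escapes, not the XRegExp API.
// \p{L} matches any Unicode letter; \p{M} matches combining marks
// (assumption: names may arrive in decomposed form, e.g. "o" + U+0301).
// The space and apostrophe are allowed as in the original pattern.
const NAME_PATTERN = /^[\p{L}\p{M} ']+$/u;

// Hypothetical helper: returns true when every character is allowed.
function isValidName(name) {
  return NAME_PATTERN.test(name);
}

console.log(isValidName("Normal Text' Compromised"));       // true
console.log(isValidName("UTF Text' Połącz Słońce w Mózu")); // true
console.log(isValidName("Illegal char: Blac & White"));     // false
```

On engines without `u`-flag property escapes the pattern is a syntax error, which is the case the XRegExp plugin is meant to cover.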