I'm writing an AngularJS filter that capitalizes the first letter of each word. It works well with Latin letters (a-zA-Z), but I also use Cyrillic characters and would like it to work with them too.
var strLatin = "this is some string";
var strCyrillic = "това е някакъв низ";
var newLatinStr = strLatin.replace(/\b[\wа-яА-Я]/g, function(l){
    return l.toUpperCase();
});
var newCyrillicStr = strCyrillic.replace(/\b[\wа-яА-Я]/g, function(l){
    return l.toUpperCase();
});
Here is a CodePen example: http://codepen.io/brankoleone/pen/GNxjRM
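In JavaScript, \b is defined in terms of \w, which covers only [A-Za-z0-9_], so no word boundary is found before a Cyrillic letter. You need a custom word boundary, which you can build using groupings: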
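var strLatin = "this is some string";
var strCyrillic = "това е някакъв низ";
// Characters that count as part of a word: \w plus the basic Cyrillic range
var block = "\\w\\u0400-\\u04FF";
// Group 1: a non-word character or the start of the string
// Group 2: the first character of a word
var rx = new RegExp("([^" + block + "]|^)([" + block + "])", "g");
var newLatinStr = strLatin.replace(rx, function($0, $1, $2){
    return $1 + $2.toUpperCase();
});
console.log(newLatinStr); // "This Is Some String"
var newCyrillicStr = strCyrillic.replace(rx, function($0, $1, $2){
    return $1 + $2.toUpperCase();
});
console.log(newCyrillicStr); // "Това Е Някакъв Низ"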
Details:
block contains all ASCII letters, digits and the underscore (\w), plus all characters from the basic Cyrillic range, \u0400-\u04FF. If you need more, see the Wikipedia article on Cyrillic script in Unicode for the other ranges and update the regex accordingly. If you only want to match Russian letters, use А-ЯЁёа-я instead: var block = "\\wА-ЯЁёа-я";
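To make it easier to switch between the two character sets described above, the same regex can be wrapped in a small helper. This is only a sketch: the names capitalizeWords, basicCyrillic and russianOnly are made up for illustration and are not part of the original code.

```javascript
// Hypothetical helper: builds the "custom word boundary" regex from a
// character-class body and upper-cases the first character of every word.
function capitalizeWords(str, block) {
  // Group 1: a non-word character or the start of the string.
  // Group 2: the first character of a word.
  var rx = new RegExp("([^" + block + "]|^)([" + block + "])", "g");
  return str.replace(rx, function (match, before, first) {
    return before + first.toUpperCase();
  });
}

// Assumed names for the two character sets discussed above.
var basicCyrillic = "\\w\\u0400-\\u04FF"; // \w plus U+0400..U+04FF
var russianOnly = "\\wА-ЯЁёа-я";          // \w plus Russian letters only

console.log(capitalizeWords("this is some string", basicCyrillic)); // "This Is Some String"
console.log(capitalizeWords("това е някакъв низ", basicCyrillic));  // "Това Е Някакъв Низ"
console.log(capitalizeWords("ёлка и ель", russianOnly));            // "Ёлка И Ель"
```

In an AngularJS filter, the function returned by the filter factory could simply delegate to a helper like this one.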