JavaScript: split a string with a regex to get the same result as match (how to convert a match regex for use with split)

let s = "AbcDefGH";
s.match(/[A-Z]+[a-z]*/g);
// ["Abc", "Def", "GH"]  This is what I expect from the split function.

s.split(/(?=[A-Z]+[a-z]*)/g);
// ["Abc", "Def", "G", "H"]  "G" and "H" are separated.

My question: how can I convert the match regex into one for the split function, so that split returns the same result as match?

Please also explain what ?= is for when the match regex is translated for use with split.

Thanks

Solution

You can enable JavaScript's u or v flag to get Unicode-aware matching, including Unicode property escapes. This way, you can use \p{L} to match any letter in any language. This is safer than [a-zA-Z] because it also matches accented and non-Latin letters.
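To make the difference concrete, here is a minimal sketch comparing the two character classes. The sample words and the helper names isAsciiLetters and isUnicodeLetters are made up for illustration.

```javascript
// Hypothetical helper names, used only for this illustration.
const isAsciiLetters = (s) => /^[a-zA-Z]+$/.test(s);
// The u flag is required for \p{...} to be treated as a Unicode property escape.
const isUnicodeLetters = (s) => /^\p{L}+$/u.test(s);

const samples = ["Volonte", "Volonté", "Ça", "Привет"];

const report = samples.map((s) => ({
  text: s,
  ascii: isAsciiLetters(s),
  unicode: isUnicodeLetters(s),
}));

// Only "Volonte" passes the ASCII check; all four pass the \p{L} check.
console.log(report);
```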

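In your case, (?=...) is a lookahead: a zero-width assertion that checks what follows the current position without consuming any characters, so split cuts there and keeps every letter. Your pattern cuts before every uppercase letter, which is why "G" and "H" end up separated. To get the same result as match, split only at the position between a lowercase and an uppercase letter, using a positive lookbehind for the lowercase letter followed by a positive lookahead for the uppercase letter: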

lookbehind: (?<=\p{Ll}), where \p{Ll} will match a lowercase letter in any language, so for example "a", "à" or "ÿ".
lookahead: (?=\p{Lu}), where \p{Lu} will match an uppercase letter in any language, so for example "C", "Ç" or "É".
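The effect of each assertion can be seen on the string from the question. This is a small sketch that uses ASCII classes ([a-z], [A-Z]) instead of \p{Ll} and \p{Lu} to keep it close to the original pattern; the variable names are only for illustration.

```javascript
const s = "AbcDefGH";

// Lookahead only: a cut is made before every uppercase letter,
// so consecutive capitals are separated.
const lookaheadOnly = s.split(/(?=[A-Z])/);
console.log(lookaheadOnly); // ["Abc", "Def", "G", "H"]

// Lookbehind + lookahead: a cut is made only where a lowercase
// letter is immediately followed by an uppercase letter.
const boundaryOnly = s.split(/(?<=[a-z])(?=[A-Z])/);
console.log(boundaryOnly); // ["Abc", "Def", "GH"]

// No lookaround: split consumes the matched uppercase letters,
// which is why the assertions have to be zero-width.
const consuming = s.split(/[A-Z]/);
console.log(consuming); // ["", "bc", "ef", "", ""]
```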

The full list of Unicode general categories (Ll, Lu, and the others) is defined in the Unicode standard.

Here is a small code example to illustrate it:

// Enable the `u` or `v` flag to have Unicode and character set features.
// \p{L} matches any Unicode letter
// \p{Ll} matches any Unicode lowercase letter.
// \p{Lu} matches any Unicode uppercase letter.
// (?<=\p{Ll}) is a positive lookbehind to find a lowercase letter.
// (?=\p{Lu}) is a positive lookahead to find an uppercase letter.

const regex = /(?<=\p{Ll})(?=\p{Lu})/gu;

const inputs = [
  "AbcDefGH",
  "J'aiMangéEtBuÀVolonté",
  "IAteAndDrankAsMuchAsIWanted"
];

inputs.forEach((input) => {
  console.log('Input = "' + input + '"');
  console.log(input.split(regex));
});
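For reference, the same regex wrapped in a small function should give the results shown in the comments. The function name splitOnCaseBoundary is hypothetical, and the call to normalize("NFC") is an added assumption so that an accented letter stored as a base letter plus a combining mark is still treated as a single lowercase or uppercase letter.

```javascript
// Hypothetical wrapper around the regex from the example above.
function splitOnCaseBoundary(input) {
  // Assumption: normalize to NFC so accented letters are single code points.
  return input.normalize("NFC").split(/(?<=\p{Ll})(?=\p{Lu})/u);
}

console.log(splitOnCaseBoundary("AbcDefGH"));
// ["Abc", "Def", "GH"]

console.log(splitOnCaseBoundary("J'aiMangéEtBuÀVolonté"));
// ["J'ai", "Mangé", "Et", "Bu", "ÀVolonté"]

console.log(splitOnCaseBoundary("IAteAndDrankAsMuchAsIWanted"));
// ["IAte", "And", "Drank", "As", "Much", "As", "IWanted"]
```

Note that consecutive capitals stay together ("IAte", "IWanted", "ÀVolonté"), the same way "GH" does, because the split only happens where a lowercase letter is directly followed by an uppercase one.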