Tags: php, regex, explode, preg-split

Is there any option to separate special characters from words with regex/preg_split?


I'm a junior developer and not comfortable with regex. I'm trying to build a password generator that turns a sentence into a password, using regex and preg_split.

Everything works except one thing. For example, the sentence "I've got 2 cats." should become "I'vg2c.", but all I get is "Ig2c", because I split on whitespace ( preg_split("/[\s]/", $string, -1, PREG_SPLIT_NO_EMPTY); ) and there is no whitespace between words and special characters.

So is there a "simple" way to separate special characters from words while keeping them, using regex/preg_split or something else? (I'm not sure I'm being clear; sorry for my English.)

Here is the code:

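<?php
session_start();

$string = !empty($_POST['sentence']) ? $_POST['sentence'] : null;

function initiales($string)
{
    // Split on whitespace only, so punctuation stays attached to its word.
    $words = preg_split("/[\s]/", $string, -1, PREG_SPLIT_NO_EMPTY);
    // $words = explode(" ", $string);
    $initiale = '';
    foreach ($words as $init) {
        // Keep only the first character of each chunk ($init{0} is no longer valid in PHP 8).
        $initiale .= $init[0];
    }
    return $initiale;
}
?>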



What I want:

input: initiales("I've got 21 cats and 1 dog!");

expected_output: "I'vg21ca1d!"

unexpected_output: "Ig2ca1d"



Solution

  • You may use

    function initiales($string) { 
        return preg_replace('#\B\p{L}\p{M}*+|\s+#u', '', $string); 
    }
    


    The pattern matches

    • \B\p{L}\p{M}*+ - any letter that is not at the start of a word (\B requires a word character right before it), plus any diacritics after it
    • | - or
    • \s+ - one or more whitespace characters.
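
To check the pattern against the sentences from the question, here is a minimal sketch. It reuses the initiales() function from the answer unchanged, and the strings in the comments are the results the question asks for. The last call is an extra observation rather than something the question covers: \B counts digits and underscores as word characters, so a letter that directly follows a digit is dropped as well.

```php
<?php
// The function from the answer: drop every letter that is not at the
// start of a word (together with its combining marks), and drop whitespace.
function initiales($string)
{
    return preg_replace('#\B\p{L}\p{M}*+|\s+#u', '', $string);
}

// "v" survives in "I've" because the apostrophe before it is not a word
// character, so there is a word boundary between "'" and "v".
echo initiales("I've got 21 cats and 1 dog!"), "\n"; // I'vg21ca1d!
echo initiales("I've got 2 cats."), "\n";            // I'vg2c.

// Digits are word characters, so "c" is not at a word boundary here.
echo initiales("2cats"), "\n";                       // 2
```

Note that preg_replace returns null if the input is not valid UTF-8 when the u modifier is used, so a caller handling untrusted input may want to check for that.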

    The u modifier treats the pattern and the subject as UTF-8, so \s matches any Unicode whitespace and \B is Unicode-aware.
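
The effect of the u modifier is easiest to see on non-ASCII input. The following is a small sketch, not part of the original answer: the sample sentences are made up for illustration, and the "\u{...}" escapes (PHP 7+) are used only to make the accented letters and the no-break space unambiguous in the source.

```php
<?php
// Same pattern as in the answer; the trailing "u" enables UTF-8 mode.
$pattern = '#\B\p{L}\p{M}*+|\s+#u';

// "Élodie était là" with precomposed accented letters.
$accented = "\u{00C9}lodie \u{00E9}tait l\u{00E0}";
$result1  = preg_replace($pattern, '', $accented);

// Accented letters count as letters and as word characters, so the
// first letter of each word is kept and the rest is removed.
var_dump($result1 === "\u{00C9}\u{00E9}l"); // bool(true)

// "deux chats" separated by a no-break space (U+00A0) instead of a
// plain space; with "u" the \s+ alternative removes it as whitespace.
$nbsp    = "deux\u{00A0}chats";
$result2 = preg_replace($pattern, '', $nbsp);

var_dump($result2 === "dc"); // bool(true)
```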