php regex string preg-replace pcre

Removing spaces with preg_replace outputs unrecognized characters in a string in PHP


I want to replace multiple spaces or &nbsp; entities with a single space, so I tried the code below using the preg_replace function.

It replaces the spaces correctly, but it also puts unrecognized characters into the output string.

For this demo I am using a $string variable, but in practice the data can come from a server-side database. See the code below:

<?php
     $string = "123080345&nbsp;900113760  165604100012";
     echo preg_replace("/(\s|&nbsp;)+/", ' ', $string);

     // Output:          123080345� 900113760� 165604100012
     // Expected output: 123080345 900113760 165604100012

So my question is: why does preg_replace insert unrecognized characters, and how can I get clean output without them, like the expected output shown in the code above?


Solution

  • Although the string looks like just numbers, spaces and &nbsp;, I think it is stored in the database as a UTF-8 encoded string, so preg_replace without Unicode support returns unrecognized characters.

    It works when I use the same regular expression with the /u modifier, which makes PCRE treat the pattern and the subject as UTF-8.

    So below is my solution:

    <?php

         $string = "123080345&nbsp;900113760  165604100012";
         echo preg_replace("/(\s|&nbsp;)+/u", ' ', $string); // u: treat pattern and subject as UTF-8

         // Output: 123080345 900113760 165604100012

    I hope this is helpful to someone. Thanks.
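A likely explanation for the � characters, offered as an assumption rather than something confirmed in the post: the real data contains non-breaking space characters (U+00A0), which UTF-8 stores as the two bytes C2 A0. Without /u the pattern is applied byte by byte, so part of that character can be replaced while the other byte is left behind. The sketch below illustrates the idea. The helper name collapse_spaces and the explicit \x{00A0} alternative are my own additions, and it assumes PHP 7 or later.

```php
<?php

// Hypothetical helper, not from the original post: collapses runs of
// whitespace, non-breaking spaces (U+00A0) and literal "&nbsp;" entities
// into a single ordinary space.
function collapse_spaces(string $text): string
{
    // The /u modifier makes PCRE read the pattern and the subject as UTF-8,
    // so U+00A0 (bytes C2 A0) is matched as one whole character.
    $result = preg_replace('/(?:\s|\x{00A0}|&nbsp;)+/u', ' ', $text);

    // With /u, preg_replace returns null when the subject is not valid UTF-8;
    // in that case this sketch simply returns the input unchanged.
    return $result ?? $text;
}

// Demo string with one "&nbsp;" entity and one real non-breaking space.
$string = "123080345&nbsp;900113760\u{00A0} 165604100012";

echo bin2hex("\u{00A0}"), PHP_EOL;      // c2a0: the two UTF-8 bytes of U+00A0
echo collapse_spaces($string), PHP_EOL; // 123080345 900113760 165604100012
```

If the output still contains odd characters, running bin2hex() on the raw database value is a quick way to check which bytes are actually stored.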