Tags: php, utf-8, preg-replace, non-printable

UTF-8 PHP: replace non-printable character "space" with non-printable character "line break"


I have a UTF-8 string containing a non-printable space, which I need to replace with a non-printable line break.

Something like str_replace('&nbsp;','<br />',$string); but with non-printable characters.


Solution

  • It literally works if you type in the specific character between the quotes:

    str_replace(' ', '', $string)
                 ^   ^^
            put characters here
    

    Since this can arguably be a bit hard to type and/or make the source code less than obvious, you can write those string literals in their byte notation. Just figure out what specific character you're talking about and what bytes it's encoded in:

    str_replace("\xE2\x80\xAF", "\x0A", $string)
    

    This replaces a NARROW NO-BREAK SPACE (U+202F, UTF-8 encoding E2 80 AF) with a regular line feed (U+000A, byte 0A). Look up your own character in your Unicode table of choice. If you are unsure which bytes your string contains, inspect it with echo bin2hex($string).
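
To see the byte-notation approach end to end, here is a minimal sketch. The sample input is hypothetical: it assumes the unwanted character is U+202F (bytes E2 80 AF), which may not be the one in your data, so check the bin2hex output first and adjust the search bytes accordingly.

```php
<?php
// Hypothetical sample input: "a", then U+202F (bytes E2 80 AF), then "b".
$string = "a\xE2\x80\xAFb";

// Dump the raw bytes to confirm which character is actually present.
$before = bin2hex($string);
echo $before, "\n";

// Replace the three-byte UTF-8 sequence with a single line feed (0A).
// Double quotes are required so that PHP interprets the \x escapes.
$result = str_replace("\xE2\x80\xAF", "\x0A", $string);

// Dump the bytes again to verify the replacement took place.
$after = bin2hex($result);
echo $after, "\n";
```

Because str_replace compares raw bytes, this works without any multibyte-aware function, provided the search string is the complete byte sequence of the character.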