
PHP Regex line-breaks won't be removed from imported Facebook description data


I'm on the edge of going crazy here. I'm trying to remove line endings from a string in PHP, but none of the usual methods seem to work and I can't figure out why.

I've tried the following methods:

$result = str_replace(array("\n", "\r"), '', $string);
$result = preg_replace("/\n/", '', $string);
$result = preg_replace("/\n|\r|\s/", '', $string); // this one does remove the whitespace though
$result = str_replace(PHP_EOL, '', trim(rtrim($string)));

And many, many more variations...

This makes me think that it might be something else causing the problem, because my test is simple.

var_dump($originalString);
$test = " stringl \n";
$testWithoutLineBreak = preg_replace("/\n/", '', $test);
$originalString = preg_replace("/\n/", '', $originalString);
var_dump($test);
var_dump($testWithoutLineBreak);
var_dump($originalString);

Gives the following result:

string(13) " stringl
"
string(10) " stringl 
"
string(9) " stringl "
string(13) " stringl
"

Notice the difference in the reported lengths: the test string I made to replicate the original contains 10 characters including the line break, while the original string contains 13. Also, preg_replace works on my test string but not on the original.

Lastly I tried putting it all in a hidden char revealer:

string(13)[Space]"[Space]stringl[End of Line(LF)]
"[End of Line(LF)]
string(10)[Space]"[Space]stringl[Space][End of Line(LF)]
"[End of Line(LF)]
string(9)[Space]"[Space]stringl[Space]"[End of Line(LF)]
string(13)[Space]"[Space]stringl[End of Line(LF)]
"

No result there.

Does anyone have an explanation for this magic? Thanks!


Solution

  • Sometimes in situations like this you get tricked by characters or byte values you can't "see" when viewing the string as plain text or HTML.

    For example, you can use urlencode to produce a debug output of the value, so that you can determine what the actual byte values at those positions are.

    That usually helps narrow problems like this down.
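
As a rough sketch of that debugging step: the sample value below is hypothetical (a literal `<br>` tag, which a browser renders as a line break and which would happen to make the string 13 bytes long). It is an assumption chosen for illustration, not the asker's confirmed data. urlencode and bin2hex are standard PHP functions.

```php
<?php
// Hypothetical sample value: a literal "<br>" tag rather than a real newline.
// This is an assumption for illustration only, not the asker's actual data.
$originalString = " stringl <br>";

// A string that really ends in a line feed, for comparison.
$test = " stringl \n";

// urlencode() turns spaces into "+" and percent-encodes every other byte
// that is not a letter, digit, "-", "_" or ".", so hidden bytes become visible.
$encodedOriginal = urlencode($originalString);
$encodedTest = urlencode($test);

echo strlen($originalString), ' ', $encodedOriginal, PHP_EOL; // 13 +stringl+%3Cbr%3E
echo strlen($test), ' ', $encodedTest, PHP_EOL;               // 10 +stringl+%0A

// bin2hex() is an alternative that dumps every byte as two hex digits.
echo bin2hex($originalString), PHP_EOL;
```

In this made-up case the encoded output shows `%3Cbr%3E` rather than `%0A`, which would explain why replacing `"\n"` changes nothing. Once the real bytes are known, the replacement can target them directly.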