
How to remove HTML escape sequence characters from a string in PHP?


Suppose I have the following string variable:

$sample_String = "Dummy User graded your comment \"\r\n\t\t\t\t\tdoc_ck.docx\r\n\t\t\t\t\tDownload\r\n\t\t\t\t\t\" that you posted.";

I don't want these escape sequence characters (\r, \n, \t) in my string.

How can I remove them efficiently and reliably? I want the final string to be:

$sample_String = "Dummy User graded your comment \"doc_ck.docx Download\" that you posted.";

When it is shown in a browser, the '\' before each " will disappear, and the string will look like this:

Dummy User graded your comment "doc_ck.docx Download" that you posted.

Is that correct?

Thanks.

So far I've tried the code below, without success:

function br2nl($buff = '') {
    $buff = mb_convert_encoding($buff, 'HTML-ENTITIES', "UTF-8");
    $buff = preg_replace('#<br[/\s]*>#si', "\n", $buff);
    $buff = trim($buff);

    return $buff;
}
$sample_String = br2nl(stripslashes(strip_tags($sample_String)));

Solution

  • If you just want to remove \r (carriage return), \n (newline), and \t (tab), you can do:

    $string = "Dummy User graded your comment \"\r\n\t\t\t\t\tdoc_ck.docx\r\n\t\t\t\t\tDownload\r\n\t\t\t\t\t\" that you posted.";
    $string = str_replace(array("\r", "\n", "\t"), "", $string);
    

    If you want to preserve the newlines (and have them show up in the browser) do:

    $string = "Dummy User graded your comment \"\r\n\t\t\t\t\tdoc_ck.docx\r\n\t\t\t\t\tDownload\r\n\t\t\t\t\t\" that you posted.";
    $string = nl2br(str_replace(array("\r", "\t"), "", $string));
    

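    HTML entities are sequences like &quot; and &#063;, which PHP decodes with html_entity_decode(). The \r, \n, and \t characters in your string are ordinary whitespace control characters, not HTML entities, so strip_tags() and the entity functions leave them untouched.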
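Note that replacing the control characters with an empty string joins the neighbouring words ("doc_ck.docxDownload"), while the output asked for in the question keeps a single space between them. A minimal sketch of one way to get that exact result, assuming it is acceptable to collapse every run of whitespace into one space and to trim the padding just inside the double quotes. The function name collapse_whitespace() is hypothetical, chosen only for this illustration:

```php
<?php
// Hypothetical helper (not from the original answer): normalises whitespace
// so that the string matches the output requested in the question.
function collapse_whitespace(string $text): string
{
    // Replace each run of whitespace (spaces, tabs, \r, \n) with one space.
    $text = preg_replace('/\s+/', ' ', $text);

    // Drop the single space left directly inside a pair of double quotes.
    $text = preg_replace('/"\s*(.*?)\s*"/', '"$1"', $text);

    return trim($text);
}

$string = "Dummy User graded your comment \"\r\n\t\t\t\t\tdoc_ck.docx\r\n\t\t\t\t\tDownload\r\n\t\t\t\t\t\" that you posted.";

echo collapse_whitespace($string);
// Dummy User graded your comment "doc_ck.docx Download" that you posted.
```

The second preg_replace() assumes double quotes appear in matched pairs; if your strings can contain a stray quote, it is safer to leave that step out and keep only the whitespace collapsing.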