php html html-parsing simple-html-dom

Getting plain text with PHP Simple HTML DOM while keeping "a" and "img" tags


I have some simple rich text like this:

<div><p> some text <br/> some text <br/> 
    <img src=pic.jpeg> and  <a href="web.html">link</a> </p>
</div>

Is it possible to get plain text like the following with Simple HTML DOM?

   some text some text <img src=pic.jpeg> and <a href="web.html">link</a>

I mean that every tag should be removed except the a and img tags.


Solution

  • Unless I misunderstand the problem, you can do that with just strip_tags():

    $str = '<div><p> some text <br/> some text <br/> 
              <img src=pic.jpeg> and  <a href="web.html">link</a> </p>
            </div>';
    
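    // Keep only <a> and <img>; every other tag is stripped.
    echo htmlspecialchars(strip_tags($str, '<a><img>'));
    
    // Result as rendered in the browser (which collapses whitespace):
    // some text some text <img src=pic.jpeg> and <a href="web.html">link</a>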
    

    See an example.

    Note that I only used htmlspecialchars() so that the remaining tags are displayed as text in the browser.
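
The string returned by strip_tags() still contains the line breaks and repeated spaces from the original markup; they only look collapsed when a browser renders them. If the single-line form from the question is needed as an actual string, a minimal sketch could normalize the whitespace afterwards. The helper name keepLinksAndImages is hypothetical and not part of the original answer:

```php
<?php
// Hypothetical helper (name chosen for illustration only).
function keepLinksAndImages(string $html): string
{
    // Remove every tag except <a> and <img>.
    $kept = strip_tags($html, '<a><img>');

    // Collapse each run of whitespace (spaces, tabs, newlines) into one
    // space, then trim the ends. This also affects whitespace inside
    // attribute values, which is assumed to be acceptable here.
    return trim(preg_replace('/\s+/', ' ', $kept));
}

$str = '<div><p> some text <br/> some text <br/> 
          <img src=pic.jpeg> and  <a href="web.html">link</a> </p>
        </div>';

echo keepLinksAndImages($str);
```

Keep in mind that strip_tags() leaves all attributes on the allowed tags untouched, so this is a formatting step and not a sanitizer for untrusted HTML.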