php, dom, variables, html-content-extraction

Get element content from a variable containing html


How do I use the DOM parser to extract the content of an HTML element when the HTML is held in a variable?

More exactly: I have a form where the user enters HTML in a textarea. I want to extract the content of the first paragraph.

I know there are many tutorials on this, but I could not find any on extracting from a variable rather than from a file (page).

Thanks


Solution

  • If you're taking HTML as user input, I recommend using simplehtmldom (Simple HTML DOM). It has a loose parser that tolerates malformed HTML and lets you use CSS selectors to pull elements and their content out of the DOM.

    I didn't test this, but it should work:

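    require_once 'simple_html_dom.php'; // library file that provides str_get_html()
    $html = str_get_html($_POST['input']); // parses the string; false on empty input
    $p = $html ? $html->find('p', 0) : null; // index 0 = first <p>, null if there is none
    print $p ? $p->plaintext : ''; // text of the first paragraph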
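
Since the question asks about the DOM parser specifically, here is a rough sketch of the same idea with PHP's built-in DOMDocument, which can load HTML from a string through loadHTML() rather than from a file. The helper name firstParagraphText is made up for illustration, and the '<?xml encoding="UTF-8">' prefix is a common workaround that assumes the submitted text is UTF-8. Treat it as a starting point, not a tested drop-in.

```php
<?php
// Hypothetical helper (not part of any library): returns the text of the
// first <p> element in an HTML string, or null when there is none.
function firstParagraphText(string $html): ?string
{
    if (trim($html) === '') {
        return null; // loadHTML() does not accept an empty string
    }

    $doc = new DOMDocument();

    // User-supplied HTML is often malformed; collect libxml's parse
    // warnings internally instead of emitting them as PHP warnings.
    $previous = libxml_use_internal_errors(true);

    // Assumption: input is UTF-8. The prefix hints the encoding to libxml.
    $doc->loadHTML('<?xml encoding="UTF-8">' . $html);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // item(0) is the first <p> in document order, or null if none exists.
    $p = $doc->getElementsByTagName('p')->item(0);

    return $p ? trim($p->textContent) : null;
}

// 'input' is the form field name used in the snippet above.
echo firstParagraphText($_POST['input'] ?? '') ?? '';
```

textContent gives the paragraph's text with any nested tags stripped. If you need the inner markup instead, $doc->saveHTML($node) can serialize a node and its children.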