php, screen-scraping, simple-html-dom

How do you search by the contents of a tag in simplehtmldom?


I am trying to write a web scraper using simplehtmldom. I want to find a tag by its contents, that is, by the plain text inside it rather than by the type of tag. Once I have found the tag this way, I want to get the tag that comes right after it.

How do I find a tag based on its contents? And once I have it, how do I find the tag that follows it?

Any help would be appreciated.

Thanks.


Solution

  • The following will enable you to search all text nodes, then get the next tag:

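    // Use Simple_HTML_DOM's special selector 'text'
    // to retrieve all text nodes from the document
    // ($html is the already-loaded document object)
    $textNodes = $html->find('text');
    $foundTag = null;
    $nextTagAfter = null;
    
    foreach ($textNodes as $textNode) {
        // Trim so that surrounding whitespace in the
        // markup does not defeat the comparison
        if (trim($textNode->plaintext) === 'Hello World') {
            // Get the parent of the text node
            // (a text node is always a child of
            //  its container)
            $foundTag = $textNode->parent();
            break;
        }
    }
    
    if ($foundTag) {
        // next_sibling() returns null if no tag follows
        $nextTagAfter = $foundTag->next_sibling();
    }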
    

    This is not your first question about basic Simple_HTML_DOM usage. You might want to read the official documentation.
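
If you need this lookup in more than one place, the same steps can be wrapped in a small helper. This is only a sketch: the function name `findTagAfterText` is made up for illustration, and it assumes `$html` is a Simple_HTML_DOM document that has already been loaded (for example with `str_get_html()` or `file_get_html()`). It matches only when the trimmed text of a whole text node equals the search string.

```php
<?php
/**
 * Hypothetical helper: return the tag that follows the first tag whose
 * text equals $needle, or null if there is no match or no following tag.
 *
 * Assumes $html is an already-loaded Simple_HTML_DOM document.
 */
function findTagAfterText($html, string $needle)
{
    // 'text' is the special selector that returns all text nodes
    foreach ($html->find('text') as $textNode) {
        // Trim so whitespace around the text does not defeat the match
        if (trim($textNode->plaintext) !== $needle) {
            continue;
        }

        // The tag containing the matching text node
        $container = $textNode->parent();

        // next_sibling() yields null when the container is the last child
        return $container ? $container->next_sibling() : null;
    }

    // No text node matched
    return null;
}
```

With `simple_html_dom.php` loaded, calling `findTagAfterText(str_get_html('<div><h2>Hello World</h2><p>Next</p></div>'), 'Hello World')` should hand back the `<p>` element. Check the result for `null` before using it.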