PHP simple HTML DOM parse


I want to extract some information from HTML using the Simple HTML DOM parser, but I'm stuck.

<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>

This will print the users.

$users = $html->find('div[class=user]');
foreach($users as $user)
    echo $user->innertext;

This will print the posts.

$posts = $html->find('div[class=post]');
foreach($posts as $post)
    echo $post->innertext;

I want to print them together rather than separately, like so:

me:
I am an apple
you:
I am a banana
we:
We are fruits

How can I do this using the parser?


Solution

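  • With the markup you provided, you can select the main div (div#posts) and loop over its children. Each child is one post wrapper, so take its first child (the user) and its second child (the message). Note that div[class=post] on its own matches both the wrappers and the inner message divs, because they share the same class: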

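    foreach ($html->find('div#posts', 0)->children() as $post) {
        // Each wrapper holds two children: the user first, the message second.
        $user = $post->children(0)->innertext;
        // Use a separate variable instead of overwriting the loop variable $post.
        $text = $post->children(1)->innertext;
        echo $user . '<br/>' . $text . '<hr/>';
    }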
    

    That said, I would suggest using DOMDocument with XPath for this instead:

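    $dom = new DOMDocument;
    $dom->loadHTML($html_markup); // $html_markup holds the HTML as a string
    $xpath = new DOMXPath($dom);
    // Select only the wrapper divs that are direct children of div#posts.
    $elements = $xpath->query('//div[@id="posts"]/div[@class="post"]');
    foreach ($elements as $element) {
        // The second argument makes each expression relative to the current wrapper.
        $user = $xpath->evaluate('string(./div[@class="user"])', $element);
        $post = $xpath->evaluate('string(./div[@class="post"])', $element);
        echo $user . '<br/>' . $post . '<hr/>';
    }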
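
For a version that can be run on its own, here is a minimal sketch of the DOMDocument approach with the question's markup inlined. The function name extractPosts and the variable $markup are made up for illustration and are not part of any library. The sketch prints plain lines rather than <br/> and <hr/> tags, to match the output format asked for in the question, and it silences libxml parse warnings on the assumption that real-world markup may not be perfectly valid.

```php
<?php
// Sample markup copied from the question.
$markup = <<<HTML
<div id="posts">
    <div class="post">
        <div class="user">me:</div>
        <div class="post">I am an apple</div>
    </div>
    <div class="post">
        <div class="user">you:</div>
        <div class="post">I am a banana</div>
    </div>
    <div class="post">
        <div class="user">we:</div>
        <div class="post">We are fruits</div>
    </div>
</div>
HTML;

// Hypothetical helper: returns a list of [user, text] pairs, one per post wrapper.
function extractPosts(string $markup): array
{
    $dom = new DOMDocument();

    // Collect libxml warnings internally instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($markup);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $pairs = [];

    // Only the direct children of div#posts are wrappers; the inner
    // div.post elements sit one level deeper and are not matched here.
    foreach ($xpath->query('//div[@id="posts"]/div[@class="post"]') as $wrapper) {
        $pairs[] = [
            trim($xpath->evaluate('string(./div[@class="user"])', $wrapper)),
            trim($xpath->evaluate('string(./div[@class="post"])', $wrapper)),
        ];
    }

    return $pairs;
}

// Print each user followed by that user's message, one per line.
foreach (extractPosts($markup) as [$user, $text]) {
    echo $user, PHP_EOL, $text, PHP_EOL;
}
```

Returning the pairs from a function instead of echoing inside the loop keeps the parsing separate from the output, so the same data can be rendered as HTML or plain text as needed.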