
Can't get data-title, just data-slug


HTML

<article class="movie-summary" data-slug="slug-goes-here" data-title="This is a Title">
...
...
</article>

PHP

$html = file_get_html('http://example.com');
foreach( $html->find('article') as $data) {
    $property = 'data-title';
    echo $data->$property;
}

Hey all, I want to get the data-title attribute from every article on a particular site. I wrote the code above with the help of this post. When I use data-slug I get data back, but when I use data-title I get nothing.


Solution

  • If you look at the actual HTML you are trying to parse (the link provided in the comments), you will see that it is not valid:

    <article  class="movie-summary hero" data-slug="aiyaary-hindi"data-title="Aiyaary">
    ...
    </article>
    

    That is, there is no space between the data-slug and data-title attributes, so the parser does not see data-title as a separate attribute. To fix this, I suggest inserting the missing space before parsing, like so:

    // Insert the missing space between the closing quote of data-slug and data-title.
    function placeNeccessarySpaces($contents) {
        return preg_replace('/"data-title/', '" data-title', $contents);
    }
    

    This is similar to this answer. Then parse the repaired markup:

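    // Fetch the raw page, repair the attributes, then hand the string to simple_html_dom.
    $contents = placeNeccessarySpaces(file_get_contents('http://example.com'));
    $html = str_get_html($contents);
    foreach ($html->find('article') as $data) {
        $property = 'data-title'; // hyphenated names need variable-property syntax
        echo $data->$property, PHP_EOL;
    }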
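
To check the fix without hitting the live site, here is a minimal sketch that runs the same replacement on the sample tag quoted above. It rests on a few assumptions: the variable names $broken, $fixed and $titles are made up for illustration, and PHP's bundled DOMDocument (the dom extension) stands in for simple_html_dom only so the snippet has no external dependency. It is not meant to replace the simple_html_dom code above.

```php
<?php
// Copied from the answer above: insert the missing space before data-title.
function placeNeccessarySpaces($contents) {
    return preg_replace('/"data-title/', '" data-title', $contents);
}

// Sample tag from the answer, with data-slug and data-title fused together.
$broken = '<article  class="movie-summary hero" data-slug="aiyaary-hindi"data-title="Aiyaary">...</article>';

$fixed = placeNeccessarySpaces($broken);
echo $fixed, PHP_EOL;

// Assumption: DOMDocument is used here instead of simple_html_dom.
libxml_use_internal_errors(true); // libxml may warn about HTML5 tags such as <article>
$doc = new DOMDocument();
$doc->loadHTML($fixed);

// Collect data-title from every <article> in the repaired markup.
$titles = [];
foreach ($doc->getElementsByTagName('article') as $article) {
    $titles[] = $article->getAttribute('data-title');
}
echo implode(PHP_EOL, $titles), PHP_EOL;
```

The first echo should show the tag with a space now separating data-slug="aiyaary-hindi" from data-title="Aiyaary", and the loop should then report Aiyaary as the title. Note that the pattern only repairs a missing space directly before data-title; other fused attributes on the page would need their own replacement.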