php, regex, extract, file-get-contents, webvtt

Output src of track kind div


How can I output the src of the track element inside the following div, using PHP's file_get_contents on a given website?

<div class="videocontainer">
<input type="file" id="srtSelector"/>
<video class="video-js vjs-default-skin vjs-big-play-centered" id="olvideo">
<track kind="captions" src="https://rolled.oped.info/sub/jAghd9t8AB4/HfQZ32SovcY.vtt"/>
</video>
</div>

I'm interested in displaying the https://rolled.oped.info/sub/jAghd9t8AB4/HfQZ32SovcY.vtt part.

Thanks!


Solution

  • It is better to use the DOM than regular expressions for this. DOMDocument is not always simple to use, but for your problem it should do the job:

    $html = <<< HTML
    <div class="videocontainer">
    <input type="file" id="srtSelector"/>
    <video class="video-js vjs-default-skin vjs-big-play-centered" id="olvideo">
    <track kind="captions" src="https://rolled.oped.info/sub/jAghd9t8AB4/HfQZ32SovcY.vtt"/>
    </video>
    </div>
    HTML;
    
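    $dom = new DOMDocument();
    // Collect libxml parse errors internally instead of emitting warnings.
    libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    
    // Find every <track> element in the document.
    $tracks = $dom->getElementsByTagName("track");
    
    // Print the src attribute of each track, one per line.
    foreach ($tracks as $track) {
        echo $track->getAttribute("src") . PHP_EOL;
    }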
    

    The code loads the HTML into a DOMDocument, collects every track element, and prints the src attribute of each one.

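    The biggest problem you may face is malformed HTML (like the original markup in the question), which can sometimes make the document difficult to load. The libxml_use_internal_errors(true) call before loadHTML stops libxml's parse warnings from being output, but badly broken markup may still not parse the way you expect.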
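
Since the question asks about a page fetched with file_get_contents, the same approach can be wrapped in small functions. This is only a sketch under some assumptions: extractTrackSources and fetchTrackSources are hypothetical helper names, https://example.com/page.html is a placeholder URL, and the remote page is assumed to contain the track element in its static HTML (not added later by JavaScript).

```php
<?php
// Hypothetical helper: return the src of every <track> element in an HTML string.
function extractTrackSources(string $html): array
{
    // loadHTML rejects an empty string, so return early.
    if (trim($html) === '') {
        return [];
    }

    $dom = new DOMDocument();

    // Collect parse errors internally, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $sources = [];
    foreach ($dom->getElementsByTagName('track') as $track) {
        if ($track->hasAttribute('src')) {
            $sources[] = $track->getAttribute('src');
        }
    }

    return $sources;
}

// Hypothetical helper: download a page and extract its track sources.
// file_get_contents returns false on failure, which is treated as "no tracks".
function fetchTrackSources(string $url): array
{
    $html = file_get_contents($url);

    return $html === false ? [] : extractTrackSources($html);
}

// Demo on the markup from the question (no network access needed).
$sample = '<div class="videocontainer">'
    . '<video id="olvideo">'
    . '<track kind="captions" src="https://rolled.oped.info/sub/jAghd9t8AB4/HfQZ32SovcY.vtt"/>'
    . '</video>'
    . '</div>';

foreach (extractTrackSources($sample) as $src) {
    echo $src . PHP_EOL;
}
```

For a live page you would call something like fetchTrackSources('https://example.com/page.html') with the real address. Keeping the parsing separate from the download makes it possible to check the extraction against a saved copy of the HTML first.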