Getting the title of YouTube videos using PHP DOMDocument


I have a function that gets the title from an HTML source (I fetch the page with cURL first, then pass the source to this function):

function get_dom_page_title($source){
    $doc = new DOMDocument('1.0', 'utf-8');
    $doc->formatOutput = false;
    $doc->preserveWhiteSpace = false;
    $doc->strictErrorChecking = false; 
    @$doc->loadHTML('<?xml encoding="UTF-8">' . $source);

    $node = $doc->getElementsByTagName("title")->item(0);
    // item(0) is null when the page has no <title> element
    $title = $node !== null ? $node->nodeValue : "";

    if ($title !== ""){
        return (string)$title;
    }
    else{
        return false;
    }
}

However, when I pass in a YouTube link such as http://www.youtube.com/watch?v=IFeE4q4-M0o, the returned title comes out garbled: ‪Arsenal vs Benfica FT Highlights‬†- YouTube, or, with the characters escaped, \n \u202aArsenal vs Benfica FT Highlights\u202c\u200f\n - YouTube\n.

How can I sort this?


Solution

  • Use PHP Simple HTML DOM Parser

    Code:

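    include("simple_html_dom.php");
    $html = file_get_html('http://www.youtube.com/watch?v=IFeE4q4-M0o');
    // find('title', 0) returns the first <title> element directly
    $title = $html->find('title', 0)->innertext;
    // Decode entities such as &#x202a; (the /e regex modifier was removed in PHP 7)
    echo html_entity_decode($title, ENT_QUOTES | ENT_HTML5, 'UTF-8');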
    

    will output *Arsenal vs Merdosos FT Highlights,‏ - YouTube

    PHP Simple HTML DOM Parser means less code and consistent results :)
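
If you would rather stay with DOMDocument, the odd characters in the question look like Unicode directional formatting marks (U+202A LEFT-TO-RIGHT EMBEDDING, U+202C POP DIRECTIONAL FORMATTING, U+200F RIGHT-TO-LEFT MARK) plus surrounding newlines. Below is a minimal sketch that strips those and collapses whitespace. It assumes the source is UTF-8; clean_page_title() and get_clean_dom_page_title() are made-up helper names for illustration, not part of any library.

```php
<?php
// Hypothetical helper: tidy a <title> string read from the DOM.
function clean_page_title(string $title): string
{
    // Strip directional marks: U+200E/U+200F (LRM/RLM) and
    // U+202A..U+202E (embeddings, overrides, pop directional formatting).
    // preg_replace returns null on invalid UTF-8, so fall back to the input.
    $title = preg_replace('/[\x{200E}\x{200F}\x{202A}-\x{202E}]/u', '', $title) ?? $title;

    // Collapse runs of whitespace (including newlines) into one space.
    $title = preg_replace('/\s+/u', ' ', $title) ?? $title;

    return trim($title);
}

// Hypothetical variant of the question's function that cleans the title.
function get_clean_dom_page_title(string $source)
{
    $doc = new DOMDocument();
    // The encoding hint makes loadHTML read the markup as UTF-8.
    @$doc->loadHTML('<?xml encoding="UTF-8">' . $source);

    $node = $doc->getElementsByTagName('title')->item(0);
    if ($node === null) {
        return false; // no <title> element in the page
    }

    $title = clean_page_title($node->textContent);
    return $title !== '' ? $title : false;
}
```

Calling get_clean_dom_page_title($source) with the cURL result should then give a plain title string, or false when the page has no usable title. The character class can be widened if other invisible characters turn up.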