
PHP - preg_replace YouTube embed no matter the order


I'm trying to capture three values from YouTube embed codes, but the attributes are not always in the same order, and sometimes the embed code contains extra parameters.

I'd like to extract the video ID, the width and the height in order to build a YouTube integration for AMP.

Example of embed:

<iframe width="560" height="315" src="https://www.youtube.com/embed/bpcNPHqs4ng" frameborder="0" allow="accelerometer; autoplay; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Should be transformed into:

<amp-youtube data-videoid="bpcNPHqs4ng" width="560" height="315" layout="responsive"></amp-youtube>

If the embed were always the same it would be easy to solve, but sometimes the embed code starts with the source, sometimes with the width, and so on. So whatever the order, I need to capture the ID, the width and the height.

Can I do this with preg_replace in PHP?

I tried this:

preg_replace('/<iframe width="([0-9]+)" height="([0-9]+)" src="https:\/\/www.youtube.com\/embed\/([A-Za-z0-9]+)" (.*)><\/iframe>/', '<amp-youtube data-videoid="$3" width="$1" height="$2" layout="responsive"></amp-youtube>', $article);

$article contains the whole article in which the YouTube embed is used.

If a DOM parser can do the same, that's fine with me too, but I'm less familiar with that approach.

Thanks


Solution

  • Here's a DOMDocument solution to your problem. It uses DOMXPath to find iframe tags whose src attribute contains youtube, and then replaces each of them with a corresponding <amp-youtube> element:

    $doc = new DOMDocument();
    $doc->loadHTML($article, LIBXML_HTML_NODEFDTD);
    $xpath = new DOMXPath($doc);
    foreach ($xpath->query("//iframe[contains(@src, 'youtube')]") as $youtube) {
        // create a new node
        $node = $doc->createElement('amp-youtube');
        // set attributes
        $node->setAttribute('data-videoid', basename(parse_url($youtube->getAttribute('src'), PHP_URL_PATH)));
        $node->setAttribute('width', $youtube->getAttribute('width'));
        $node->setAttribute('height', $youtube->getAttribute('height'));
        $node->setAttribute('layout', 'responsive');
        // and now replace the old node
        $youtube->parentNode->replaceChild($node, $youtube);
    }
    echo $doc->saveHTML();
    

    Output (for my demo data):

    <html>
      <body>
        <div>some text</div>
        <iframe name="notyoutube" src="http://example.com"></iframe>
        <p>some more text</p> 
        <amp-youtube data-videoid="bpcNPHqs4ng" width="560" height="315" layout="responsive"></amp-youtube>
        <div>one last div</div>
      </body>
    </html>
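
The output above is a complete document with html and body tags. If $article is only a fragment, one possible way to get back just the converted fragment is sketched below. The function name convertYoutubeEmbeds and the wrapper id fragment-root are hypothetical names chosen for this sketch, and character-encoding handling is left out.

```php
<?php
// Hypothetical helper: convert YouTube iframes in an HTML fragment and
// return only the fragment, without the html/body wrapper.
function convertYoutubeEmbeds(string $article): string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of emitting them
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in a single known root so it can be found again
    $doc->loadHTML('<div id="fragment-root">' . $article . '</div>', LIBXML_HTML_NODEFDTD);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    foreach ($xpath->query("//iframe[contains(@src, 'youtube')]") as $youtube) {
        $node = $doc->createElement('amp-youtube');
        $node->setAttribute('data-videoid', basename(parse_url($youtube->getAttribute('src'), PHP_URL_PATH)));
        $node->setAttribute('width', $youtube->getAttribute('width'));
        $node->setAttribute('height', $youtube->getAttribute('height'));
        $node->setAttribute('layout', 'responsive');
        $youtube->parentNode->replaceChild($node, $youtube);
    }

    // Serialize only the children of the wrapper div
    $root = $xpath->query("//div[@id='fragment-root']")->item(0);
    $html = '';
    foreach ($root->childNodes as $child) {
        $html .= $doc->saveHTML($child);
    }
    return $html;
}
```

Calling convertYoutubeEmbeds($article) should then return the article markup with each YouTube iframe swapped for an <amp-youtube> element, while other iframes are left alone.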
    

    Demo on 3v4l.org
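
If a regex is still preferred, the attribute-order problem can be worked around by matching the whole iframe tag first and then looking up each attribute separately inside a callback. This is only a sketch: the function name youtubeIframesToAmp is hypothetical, and it assumes double-quoted attribute values, no > character inside attribute values, and a src on www.youtube.com. The DOM approach above is more robust for arbitrary HTML.

```php
<?php
// Hypothetical helper: replace YouTube iframes with <amp-youtube>,
// whatever the order of the attributes.
function youtubeIframesToAmp(string $html): string
{
    $result = preg_replace_callback(
        '~<iframe\b([^>]*)>\s*</iframe>~i',
        function (array $m): string {
            $attrs = $m[1];
            // Each attribute is searched for on its own, so order does not matter.
            // The lookbehind avoids matching names such as data-src or data-width.
            $hasSrc    = preg_match('~(?<![\w-])src="https?://www\.youtube\.com/embed/([\w-]+)~i', $attrs, $src);
            $hasWidth  = preg_match('~(?<![\w-])width="(\d+)"~i', $attrs, $w);
            $hasHeight = preg_match('~(?<![\w-])height="(\d+)"~i', $attrs, $h);
            if (!$hasSrc || !$hasWidth || !$hasHeight) {
                return $m[0]; // not a complete YouTube embed: leave it untouched
            }
            return sprintf(
                '<amp-youtube data-videoid="%s" width="%s" height="%s" layout="responsive"></amp-youtube>',
                $src[1],
                $w[1],
                $h[1]
            );
        },
        $html
    );
    // preg_replace_callback returns null on error; fall back to the input
    return $result ?? $html;
}
```

Used as $article = youtubeIframesToAmp($article); it leaves any iframe that lacks a YouTube src, a width or a height exactly as it was.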