php, regex, preg-match, strpos

PHP Regex Match Specific Pattern between Quotes


I have strings with the following pattern:

adfadfadfadfadfadfafdadfa"externalId":"UCEjBDKfrqQI4TgzT9YLNT8g"afadfadfafadfdaffzfzfzxf

Basically, I need to find "externalId" and extract its value from between the quotes that follow. The length of the value can vary, so the match needs to capture everything inside the two quotes. In this case, the desired result is:

 UCEjBDKfrqQI4TgzT9YLNT8g

Here's what I have so far:

$test = file_get_contents('https://www.youtube.com/c/GhostTownLiving');
$test = htmlentities($test);

if (strpos($test, 'externalId') !== false) {
    echo 'true';
}

I tried Advanced HTML Dom, but since the externalId property on these YouTube channel pages is loaded via JavaScript, I couldn't target it successfully.

Basically, I'm using htmlentities to return the page source, and then I'd like to extract the externalId value from it.

How can I write a regex pattern to match that? Thank you!


Solution

  • Parse out the whole JSON, then decode it and traverse through to the value you're after.

    <?php
    $test = file_get_contents('https://www.youtube.com/c/GhostTownLiving');
    
    // match the ytInitialData JSON; stop early if the page layout has changed
    if (!preg_match('#var ytInitialData = {(.*?)};</script>#', $test, $matches)) {
        exit('ytInitialData not found');
    }
    
    // add back the surrounding {}'s, and parse
    $ytInitialData = json_decode('{'.$matches[1].'}');
    
    // then you have that massive object easily accessible
    echo $ytInitialData->metadata->channelMetadataRenderer->externalId; // UCEjBDKfrqQI4TgzT9YLNT8g
    
    

    Though if you can obtain that value from the API, it's friendlier than scraping.
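
If you only need that one value and want the plain regex the question asks for, something like the following may be enough. This is a minimal sketch, not part of the original answer: `extractExternalId` is a hypothetical helper name, and it assumes the value never contains an escaped double quote. Note that it has to run on the raw string, before `htmlentities()`, because `htmlentities()` turns every `"` into `&quot;` and the pattern would no longer match.

```php
<?php
// Hypothetical helper (not from the original answer): pull the externalId
// value out of a raw string with a single regex.
function extractExternalId(string $html): ?string
{
    // "externalId" : "<one or more characters that are not a double quote>"
    if (preg_match('/"externalId"\s*:\s*"([^"]+)"/', $html, $m) === 1) {
        return $m[1]; // first capture group = the value between the quotes
    }
    return null; // pattern not found
}

// Sample string taken from the question
$sample = 'adfadfadfadfadfadfafdadfa"externalId":"UCEjBDKfrqQI4TgzT9YLNT8g"afadfadfafadfdaffzfzfzxf';
echo extractExternalId($sample), PHP_EOL; // UCEjBDKfrqQI4TgzT9YLNT8g
```

The `[^"]+` class is what lets the value be any length: it keeps consuming characters until it reaches the closing quote. Decoding the full JSON as shown above is still the sturdier option if you need more than one field.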