When I view the source of an HTML page, I see this inside a text/javascript script tag:
playlist = [{
title: "",
thumnail: "//example.com/folder/c9cc7f89fe5c168551bca2111d479a3e_1515576875.jpg",
source: "https://examp.com/360/HX62.mp4?authen=exp=1517246689~acl=/82vL3DDTye4/*~hmac=977cefd9de63a29fde25c856e0fdfd2f",
sourceLevel: [
{
source: "https://examp.com/360/HX62.mp4?authen=exp=1517246689~acl=/82vL3DDTye4/*~hmac=977cefd9de63a29fde25c856e0fdfd2f",
label: '360p'
},
{
source: "https://examp.com/480/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946",
label: '480p'
},
{
source: "https://examp.com/720/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946",
label: '720p'
},
],
}];
I want to extract the source and label strings, so I wrote this code:
$page = curl ('https://example.com/video-details.html')
preg_match ('#sourceLevel:[{source: "(.*?)",label: \'360p\'},{source: "(.*?)",label: \'480p\'},{source: "(.*?)",label: \'720\'}#', $page, $source);
$data360 = $source[1];
$data480 = $source[2];
$data720 = $source[3];
echo $data360. '<br/>' .$data480. '<br/>' .$data720. '<br/>';
I know something is probably wrong in it, because I'm new to PHP. I'm hoping someone can help me correct my code. Many thanks!
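You need to:
- escape the characters that have a special meaning in a regular expression, such as [ and {
- allow for whitespace (including newlines) in your page string.
I would also suggest matching each source/label pair as a separate match, so that you still get them all when there are not exactly three.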
Here is the suggested code:
preg_match_all('~\{\s*source\s*:\s*"(.*?)"\s*,\s*label\s*:\s*\'(.*?)\'\s*\}~',
$page, $sources);
$sources = array_combine($sources[2], $sources[1]);
This will provide the $sources variable as an associative array, keyed by the labels:
[
"360p" => "https://examp.com/360/HX62.mp4?authen=exp=1517246689~acl=/82vL3DDTye4/*~hmac=977cefd9de63a29fde25c856e0fdfd2f",
"480p" => "https://examp.com/480/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946",
"720p" => "https://examp.com/720/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946"
]
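To try the suggestion end to end, here is a minimal sketch that runs the same pattern against a hard-coded sample instead of a downloaded page. The $page string below is a hypothetical stand-in for whatever your curl() helper returns, and the names $pattern, $matches and $count are only illustrative. It also guards against the case where nothing matches, which is an assumption about how you might want to handle that.

```php
<?php
// Hypothetical stand-in for the downloaded page; in the question this
// string would come from the asker's own curl() helper.
$page = <<<'HTML'
<script type="text/javascript">
playlist = [{
title: "",
source: "https://examp.com/360/HX62.mp4?authen=exp=1517246689~acl=/82vL3DDTye4/*~hmac=977cefd9de63a29fde25c856e0fdfd2f",
sourceLevel: [
{
source: "https://examp.com/360/HX62.mp4?authen=exp=1517246689~acl=/82vL3DDTye4/*~hmac=977cefd9de63a29fde25c856e0fdfd2f",
label: '360p'
},
{
source: "https://examp.com/480/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946",
label: '480p'
},
{
source: "https://examp.com/720/HX62.mp4?authen=exp=1517246689~acl=/SuCa7NnGEhM/*~hmac=80bc89a07b1f4ed87d584a89c623e946",
label: '720p'
},
],
}];
</script>
HTML;

// Same pattern as above: \{ and \} are escaped, and \s* tolerates
// spaces and newlines between the tokens.
$pattern = '~\{\s*source\s*:\s*"(.*?)"\s*,\s*label\s*:\s*\'(.*?)\'\s*\}~';

// preg_match_all returns the number of full matches, or false on error.
$count = preg_match_all($pattern, $page, $matches);

// Group 2 holds the labels, group 1 the URLs; fall back to an empty
// array when nothing was found.
$sources = $count ? array_combine($matches[2], $matches[1]) : [];

foreach ($sources as $label => $url) {
    echo $label, ': ', $url, PHP_EOL;
}
```

Because the pattern requires an opening brace directly before source, the top-level source entry of the playlist is skipped and only the entries inside sourceLevel are collected. A single quality can then be read with $sources['480p'], after checking with isset() that the label exists.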