I used Simple HTML DOM Parser to get the HTML from a page.
Now I want to scrape the file URL from inside the <script></script>
tags. This is what I got:
<script type="text/javascript">
jwplayer("ContainerFlashPlayer").setup({
'autostart': 'true',
'primary': 'html5',
'flashplayer': '/images/embed/player.5.10.swf',
'file':'/zxdfgdfr44444/afrah/Basem_elkerbelay/selawat/guivvahpasjp.mp3',
'duration': '356.64975',
'image': '/images/flashimg.png',
'volume': '75',
'height': '240',
'width': '330',
'controlbar': 'bottom',
'stretching': 'fill',
'skin': '/images/embed/skin/shiavoice1.2.zip'
});
</script>
How can I extract the value of 'file' from this script?
You could do...
<?php
$string = "jwplayer(\"ContainerFlashPlayer\").setup({
'autostart': 'true',
'primary': 'html5',
'flashplayer': '/images/embed/player.5.10.swf',
'file':'/zxdfgdfr44444/afrah/Basem_elkerbelay/selawat/guivvahpasjp.mp3',
'duration': '356.64975',
'image': '/images/flashimg.png',
'volume': '75',
'height': '240',
'width': '330',
'controlbar': 'bottom',
'stretching': 'fill',
'skin': '/images/embed/skin/shiavoice1.2.zip'
});";
// Capture the value of the 'file' key; only read $file[1] if the pattern matched.
if (preg_match("~^\s*'file'\s*:\s*'(.*?)',?\s*$~m", $string, $file)) {
    echo $file[1];
}
Output:
/zxdfgdfr44444/afrah/Basem_elkerbelay/selawat/guivvahpasjp.mp3
What does that regex say?
^
Start of the line.
\s*
Any number of whitespace characters.
'file'
The literal text 'file', quotes included.
\s*:\s*
A colon, with any number of whitespace characters on either side.
'(.*?)'
A single quote, then everything between it and the next single quote (this is capture group 1).
,?\s*$
An optional comma, optional whitespace, and then the end of the line.
The m modifier after the closing delimiter makes ^ and $ match at the start and end of every line, rather than only at the start and end of the whole string.
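To see why the m modifier matters here, the following is a minimal sketch that runs the same pattern with and without it. The shortened $js string and the variable names are made up for illustration; only the pattern comes from the answer above.

```php
<?php
// Hypothetical, shortened version of the script body from the question.
$js = "setup({\n'file':'/a/b.mp3',\n'volume': '75'\n});";

// Same pattern as above, without delimiters or modifiers.
$pattern = "^\s*'file'\s*:\s*'(.*?)',?\s*$";

// Without m, ^ only matches at the very start of the string ("setup({"),
// so the 'file' line is never reached.
$withoutM = preg_match("~{$pattern}~", $js);

// With m, ^ and $ also match at each line boundary.
$withM = preg_match("~{$pattern}~m", $js, $match);

var_dump($withoutM); // int(0)
var_dump($withM);    // int(1)
echo $match[1], "\n"; // /a/b.mp3
```

Without the modifier the pattern would only work if 'file' happened to be on the first line of the string.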
http://php.net/manual/en/reference.pcre.pattern.modifiers.php
http://php.net/manual/en/function.preg-match.php
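In the question the script sits inside a full HTML page rather than a ready-made string. As a rough sketch of how the two steps could fit together, the example below pulls the text of each script tag and applies the same regex to it. Note the assumptions: it uses PHP's built-in DOMDocument (requires the DOM extension) instead of Simple HTML DOM Parser, so that it runs without extra libraries. The function name extractFileUrl and the trimmed sample page are hypothetical.

```php
<?php
// Hypothetical sample page; in practice this would be the HTML you already fetched.
$html = <<<'HTML'
<html><body>
<script type="text/javascript">
jwplayer("ContainerFlashPlayer").setup({
'autostart': 'true',
'file':'/zxdfgdfr44444/afrah/Basem_elkerbelay/selawat/guivvahpasjp.mp3',
'duration': '356.64975'
});
</script>
</body></html>
HTML;

// Hypothetical helper: returns the first 'file' value found in any script tag, or null.
function extractFileUrl(string $html): ?string
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them.
    libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();

    foreach ($doc->getElementsByTagName('script') as $script) {
        // Same pattern as in the answer, applied to the script body.
        if (preg_match("~^\s*'file'\s*:\s*'(.*?)',?\s*$~m", $script->textContent, $m)) {
            return $m[1];
        }
    }
    return null;
}

echo extractFileUrl($html), "\n";
```

Returning null when nothing matches lets the caller tell "no file key on this page" apart from an empty URL.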