I want to download the source code of a URL, find specific text in it, and store that text in variables.
Suppose I have the URL http://www.homedepot.com/p/Ryobi-185-MPH-510-CFM-Gas-Backpack-Blower-RY08420A/203312654
I want to download its source code and find the text below, which is near the bottom of the source.
I also want to store each variable, such as CI_Pagetype and CI_ItemID, in a PHP variable so I can write it to a CSV file.
<script>
var CI_Pagetype = 'PRODUCT';
var CI_ItemID = '203312654';
var CI_ItemName = '185 MPH 510 CFM Gas Backpack Blower';
var CI_CatID = '556375';
var CI_CatName = '';
var CI_ItemPrice = $('#ciItemPrice').val();
var CI_ItemMfr = 'Ryobi';
var CI_ItemMfrNum = '573539';
var CI_ItemUPC = '046396001122';
var CI_ItemAvailability = $('#ciItemAvailability').val();
var CI_ItemISBN = '';
var CI_ItemShipWeight = '22';
Currently I can download the source code using file_get_contents(), but I am not sure how to write a regexp to extract that data.
Please help me find a solution.
I tested this on https://regex101.com/ with the following regex:
var (CI_)([A-Za-z0-9]*) = '([a-zA-Z0-9 ]*)';
Use it with the g (global) flag.
For this sample:
<script>
var CI_Pagetype = 'PRODUCT';
var CI_ItemID = '203312654';
var CI_ItemName = '185 MPH 510 CFM Gas Backpack Blower';
var CI_CatID = '556375';
var CI_CatName = '';
var CI_ItemPrice = $('#ciItemPrice').val();
var CI_ItemMfr = 'Ryobi';
var CI_ItemMfrNum = '573539';
var CI_ItemUPC = '046396001122';
var CI_ItemAvailability = $('#ciItemAvailability').val();
var CI_ItemISBN = '';
var CI_ItemShipWeight = '22';
var bcData = new Object();
Result :
MATCH 1
1. [19-22] `CI_`
2. [22-30] `Pagetype`
3. [34-41] `PRODUCT`
MATCH 2
1. [52-55] `CI_`
2. [55-61] `ItemID`
3. [65-74] `203312654`
MATCH 3
1. [85-88] `CI_`
2. [88-96] `ItemName`
3. [100-135] `185 MPH 510 CFM Gas Backpack Blower`
MATCH 4
1. [146-149] `CI_`
2. [149-154] `CatID`
3. [158-164] `556375`
MATCH 5
1. [175-178] `CI_`
2. [178-185] `CatName`
3. [189-189] ``
MATCH 6
1. [248-251] `CI_`
2. [251-258] `ItemMfr`
3. [262-267] `Ryobi`
MATCH 7
1. [278-281] `CI_`
2. [281-291] `ItemMfrNum`
3. [295-301] `573539`
MATCH 8
1. [312-315] `CI_`
2. [315-322] `ItemUPC`
3. [326-338] `046396001122`
MATCH 9
1. [411-414] `CI_`
2. [414-422] `ItemISBN`
3. [426-426] ``
MATCH 10
1. [437-440] `CI_`
2. [440-454] `ItemShipWeight`
3. [458-460] `22`
CI_ItemPrice and CI_ItemAvailability are assigned from function calls, not quoted strings, so the regex does not match them and they have no value here.
// preg_match_all() finds every match, like the g flag on regex101.
$re = "/var (CI_)([A-Za-z0-9]*) = '([a-zA-Z0-9 ]*)';/";
$str = "<script>\nvar CI_Pagetype = 'PRODUCT';\nvar CI_ItemID = '203312654';\nvar CI_ItemName = '185 MPH 510 CFM Gas Backpack Blower';\nvar CI_CatID = '556375';\nvar CI_CatName = '';\nvar CI_ItemPrice = \$('#ciItemPrice').val();\nvar CI_ItemMfr = 'Ryobi';\nvar CI_ItemMfrNum = '573539';\nvar CI_ItemUPC = '046396001122';\nvar CI_ItemAvailability = \$('#ciItemAvailability').val();\nvar CI_ItemISBN = '';\nvar CI_ItemShipWeight = '22';\n\nvar bcData = new Object();";
preg_match_all($re, $str, $matches);
// $matches[2] holds the names without the CI_ prefix, $matches[3] the values.
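To get from the matches to PHP variables and a CSV row, something like the following might work. It is only a sketch: extractCiVars() is a hypothetical helper name, the short $html string stands in for whatever file_get_contents() returns, and the pattern is a variant that captures the full CI_ name in one group and accepts only letters, digits and spaces in the value, so it would need widening for values with other characters.

```php
<?php
// Hypothetical helper: collect every `var CI_Name = 'value';`
// assignment into a name => value array.
function extractCiVars(string $html): array
{
    // Group 1 is the full variable name, group 2 the quoted value.
    $re = "/var (CI_[A-Za-z0-9]*) = '([A-Za-z0-9 ]*)';/";
    preg_match_all($re, $html, $matches);
    return array_combine($matches[1], $matches[2]);
}

// Stand-in for the downloaded page source (assumption, shortened sample).
$html = "<script>\n"
      . "var CI_Pagetype = 'PRODUCT';\n"
      . "var CI_ItemID = '203312654';\n"
      . "var CI_ItemPrice = \$('#ciItemPrice').val();\n"
      . "var CI_ItemISBN = '';\n";

$vars = extractCiVars($html);

// Each value is now available by name, e.g. $vars['CI_ItemID'].
// Write a header row and a data row; swap 'php://output' for a file path.
$out = fopen('php://output', 'w');
fputcsv($out, array_keys($vars), ',', '"', '\\');
fputcsv($out, array_values($vars), ',', '"', '\\');
fclose($out);
```

Keeping the values in one associative array, instead of creating a separate PHP variable per name, makes it easy to write the same columns for many product pages.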