I am trying to pull the option tags from a PPA page, so that I can show on my own website which Ubuntu releases it currently supports.
This is what I have so far:
<?php include 'https://launchpad.net/~gregory-hainaut/+archive/pcsx2.official.ppa#field.series'; ?>
This is the markup containing the option tags:
<select onchange="updateSeries(this);" size="1" name="field.series" id="field.series">
<option value="YOUR_UBUNTU_VERSION_HERE" selected="selected">Choose your Ubuntu version</option>
<option value="trusty">Trusty (14.04)</option>
<option value="saucy">Saucy (13.10)</option>
<option value="raring">Raring (13.04)</option>
<option value="quantal">Quantal (12.10)</option>
<option value="precise">Precise (12.04)</option>
<option value="lucid">Lucid (10.04)</option>
</select>
You'll want to use file_get_contents (or curl) to get the actual source code, and then parse it using (for instance) DOMDocument. You might try something like this:
First, define a function that lets us get the innerHTML of a node (thank you to Hiam):
function DOMinnerHTML(DOMNode $element)
{
    $innerHTML = "";
    $children = $element->childNodes;
    foreach ($children as $child)
    {
        // Serialize each child node back to HTML and append it
        $innerHTML .= $element->ownerDocument->saveHTML($child);
    }
    return $innerHTML;
}
Then get the contents and grab the select menu by ID:
// Fetch the page source
$ppa = file_get_contents('https://launchpad.net/~gregory-hainaut/+archive/pcsx2.official.ppa#field.series');
// Parse it into a DOM tree
$doc = new DOMDocument;
$doc->loadHTML($ppa);
// Grab the select menu by its ID and serialize its children
$innerHTML = DOMinnerHTML($doc->getElementById('field.series_filter'));
If I echo $innerHTML, the result is:
<option value="">Any series</option>
<option value="trusty">Trusty</option>
<option value="saucy">Saucy</option>
<option value="raring">Raring</option>
<option value="quantal">Quantal</option>
<option value="precise">Precise</option>
<option value="lucid">Lucid</option>
From that result, you'll notice I only got the inner HTML of the select menu, so you'll want to wrap the returned innerHTML in a select tag:
<select name='[your name]' [...other properties]>
<?=$innerHTML;?>
</select>
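If you would rather not echo scraped markup straight into your own page, one alternative is to read each option's value and label and rebuild the menu yourself. This is only a sketch under assumptions: the $html string below is made-up sample markup standing in for the fetched page source, and extractOptions and renderSelect are hypothetical helper names, not part of DOMDocument. The libxml_use_internal_errors call is there because loadHTML tends to emit warnings on real-world pages.

```php
<?php
// Hypothetical sample markup standing in for the fetched page source ($ppa above).
$html = '<select id="field.series_filter">'
      . '<option value="">Any series</option>'
      . '<option value="trusty">Trusty</option>'
      . '<option value="saucy">Saucy</option>'
      . '</select>';

// Hypothetical helper: collect value => label pairs for every <option>
// under the element with the given ID. Returns [] if the ID is not found.
function extractOptions(string $html, string $selectId): array
{
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $select = $doc->getElementById($selectId);
    if ($select === null) {
        return [];
    }

    $options = [];
    foreach ($select->getElementsByTagName('option') as $option) {
        $options[$option->getAttribute('value')] = trim($option->textContent);
    }
    return $options;
}

// Hypothetical helper: rebuild a <select> from the pairs, escaping everything.
function renderSelect(string $name, array $options): string
{
    $out = '<select name="' . htmlspecialchars($name) . '">';
    foreach ($options as $value => $label) {
        $out .= '<option value="' . htmlspecialchars((string) $value) . '">'
              . htmlspecialchars($label) . '</option>';
    }
    return $out . '</select>';
}

echo renderSelect('series', extractOptions($html, 'field.series_filter'));
```

The trade-off is a little more code in exchange for control over the attributes that end up on your page, for example leaving out the onchange handler of the original menu.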
The OP mentioned in comments that everything is working, but he wants a different select menu than the one selected above. Because the page he is scraping has invalid markup (two separate select menus have the same id), the DOMDocument getElementById call is not fetching the correct node. To correct this, you have to look at the DOM tree and find a unique parent element, so that you can first grab the parent node and then run a query to find the item you're looking for. In this case, the menu that the OP wants is inside of a div with an ID of "series-widget-div", so all we do is grab that node and then use getElementsByTagName to grab the select menu:
//... get the code (set in $ppa) as per the original section
$doc->loadHTML($ppa);
// Grab the parent div because it has a unique ID
$parent_div = $doc->getElementById('series-widget-div');
// Then search for all <select> tags inside it and grab the first one
$select_menu = $parent_div->getElementsByTagName('select')->item(0);
//... and now we're ready to get the innerHTML
$innerHTML = DOMinnerHTML($select_menu);
// Echo it out!
echo $innerHTML;
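The same "unique parent first, then the select inside it" lookup can also be written as a single XPath query, which some people find easier to read. The following is a minimal sketch, not the only way to do it: the sample markup is invented to imitate the duplicate-id problem, "filter-div" is a made-up ID, and selectInsideDiv is a hypothetical helper name. It assumes the div ID you pass in is a trusted constant, since it is concatenated into the XPath expression.

```php
<?php
// Hypothetical markup imitating the problem: two <select> menus share one id,
// but only one of them sits inside the uniquely identified div.
$html = '<div id="filter-div"><select id="field.series">'
      . '<option value="">Any series</option>'
      . '</select></div>'
      . '<div id="series-widget-div"><select id="field.series">'
      . '<option value="trusty">Trusty (14.04)</option>'
      . '<option value="precise">Precise (12.04)</option>'
      . '</select></div>';

// Hypothetical helper: return the first <select> nested inside the div
// with the given ID, or null if there is none.
function selectInsideDiv(string $html, string $divId): ?DOMElement
{
    // Invalid markup may produce parser warnings; keep them out of the output
    $previous = libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    // Match on the div's id attribute, then descend to any <select> below it
    $xpath = new DOMXPath($doc);
    $nodes = $xpath->query('//div[@id="' . $divId . '"]//select');
    if ($nodes === false) {
        return null;
    }

    $node = $nodes->item(0);
    return $node instanceof DOMElement ? $node : null;
}

$select = selectInsideDiv($html, 'series-widget-div');
if ($select !== null) {
    foreach ($select->getElementsByTagName('option') as $option) {
        echo $option->getAttribute('value'), ' => ', $option->textContent, PHP_EOL;
    }
}
```

Because the query matches on the div's id attribute rather than on the duplicated select id, it does not matter which of the two menus getElementById would have returned.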