I have $html:
$html = '
<table id="myTable">
<tbody>
<tr>
<td>08/20/18</td>
<td> <a href="https://example.com/1a">Text 1 A</a> </td>
<td> <a href="https://example.com/1b">Text 1 B</a> </td>
</tr>
<tr>
<td>08/21/18</td>
<td> <a href="https://example.com/2a">Text 2 A</a> </td>
<td> <a href="https://example.com/2b">Text 2 B</a> </td>
</tr>
</tbody>
</table>
';
Using DOMDocument, I want to put the content of the table into a multidimensional $array:
$array = array(
    // tr 1
    array(
        array(
            'content' => '08/20/18'
        ),
        array(
            'content' => 'Text 1 A',
            'href' => 'https://example.com/1a'
        ),
        array(
            'content' => 'Text 1 B',
            'href' => 'https://example.com/1b'
        )
    ),
    // tr 2
    array(
        array(
            'content' => '08/21/18'
        ),
        array(
            'content' => 'Text 2 A',
            'href' => 'https://example.com/2a'
        ),
        array(
            'content' => 'Text 2 B',
            'href' => 'https://example.com/2b'
        )
    )
);
I've managed to get the content of the table using XPath:
// setup DOMDocument
$doc = new DOMDocument();
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);
// target table using xpath
$results = $xpath->query("//*[@id='myTable']");
if ($results->length > 0) {
    var_dump($results->item(0));
    var_dump($results->item(0)->nodeValue);
}
How can I put the content of each tr into the $array?
<?php
$html = '
<table id="myTable">
<tbody>
<tr>
<td>08/20/18</td>
<td> <a href="https://example.com/1a">Text 1 A</a> </td>
<td> <a href="https://example.com/1b">Text 1 B</a> </td>
</tr>
<tr>
<td>08/21/18</td>
<td> <a href="https://example.com/2a">Text 2 A</a> </td>
<td> <a href="https://example.com/2b">Text 2 B</a> </td>
</tr>
</tbody>
</table>
';
$data = [];
$doc = new DOMDocument();
// The XML declaration makes libxml treat the markup as UTF-8
$doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
$xpath = new DOMXPath($doc);
// Select every row of the table
$trs = $xpath->query("//*[@id='myTable']/tbody/tr");
foreach ($trs as $i => $tr) {
    foreach ($tr->childNodes as $td) {
        // Skip the whitespace text nodes between the cells
        if (!$td instanceof DOMElement) {
            continue;
        }
        // trim() drops the spaces that surround the link inside the cell
        $row = ['content' => trim($td->nodeValue)];
        foreach ($td->childNodes as $a) {
            // Text nodes have no attributes; elements such as <a> do
            if ($a->hasAttributes()) {
                /** @var DOMAttr $attribute */
                foreach ($a->attributes as $attribute) {
                    $row[$attribute->name] = $attribute->value;
                }
            }
        }
        $data[$i][] = $row;
    }
}
var_dump($data);
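The same idea can also be written with XPath queries relative to each row and cell, which avoids walking childNodes by hand. The following is only a sketch: the helper name tableToArray() and its parameters are made up for illustration, and it assumes each cell holds at most one link and that the table id contains no single quote.

```php
<?php
// Hypothetical helper, not part of the code above: turns the rows of the
// table with the given id into nested arrays of cells.
function tableToArray(string $html, string $tableId): array
{
    $doc = new DOMDocument();
    // Collect parser warnings internally instead of printing them
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML('<?xml encoding="utf-8" ?>' . $html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $data = [];

    // Assumption: $tableId has no single quote, because it is interpolated
    // directly into the XPath expression
    foreach ($xpath->query("//table[@id='$tableId']//tr") as $tr) {
        $row = [];
        // './td' is evaluated relative to the current row
        foreach ($xpath->query('./td', $tr) as $td) {
            $cell = ['content' => trim($td->textContent)];
            // Take the first link in the cell, if there is one
            $a = $xpath->query('.//a[@href]', $td)->item(0);
            if ($a instanceof DOMElement) {
                $cell['href'] = $a->getAttribute('href');
            }
            $row[] = $cell;
        }
        $data[] = $row;
    }

    return $data;
}

// Usage with a one-row sample table
$sample = '<table id="myTable"><tbody><tr>'
    . '<td>08/20/18</td>'
    . '<td> <a href="https://example.com/1a">Text 1 A</a> </td>'
    . '</tr></tbody></table>';
$rows = tableToArray($sample, 'myTable');
var_dump($rows);
```

Unlike the loop over all attributes, this variant copies only href, so other attributes such as class or target would not end up in the result.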