
Convert HTML table data into transposed 2d array

I need to scrape the data from an HTML table and orientate the columnar data as rows of a 2d array.

My code does not display the correct structure.

HTML Table:



$DOM = new \DOMDocument();
$Header = $DOM->getElementsByTagName('tr')->item(0)->getElementsByTagName('td');
$Detail = $DOM->getElementsByTagName('td');
//#Get header name of the table
foreach($Header as $NodeHeader) {
    $aDataTableHeaderHTML[] = trim($NodeHeader->textContent);
}
//print_r($aDataTableHeaderHTML); die();

//#Get row data/detail table without header name as key
$i = 0;
$j = 0;

foreach($Detail as $sNodeDetail) {
    $aDataTableDetailHTML[$j][] = trim($sNodeDetail->textContent);
    $i = $i + 1;
    $j = $i % count($aDataTableHeaderHTML) == 0 ? $j + 1 : $j;
}
//print_r($aDataTableDetailHTML); die();

//#Get row data/detail table with header name as key and outer array index as row number
for($j = 0; $j < count($aDataTableHeaderHTML); $j++) {
    for($i = 1; $i < count($aDataTableDetailHTML); $i++) {
        $aTempData[][$aDataTableHeaderHTML[$j]][] = $aDataTableDetailHTML[$i][$j];
    }
}

$aDataTableDetailHTML = $aTempData;
echo json_encode($aDataTableDetailHTML);

My result:


We need such a result:



  • I've changed a lot of the code to (hopefully) simplify it. This works in two stages. The first extracts the <tr> elements and builds an array of the <td> values in each row, storing the results in $rows.

    The second ties the data together vertically by looping across the first row and using array_column() to extract the corresponding value from every row...

    $trList = $DOM->getElementsByTagName("tr");
    $rows = [];
    // Stage 1: collect the trimmed text of every <td>, one array per <tr>
    foreach ( $trList as $tr )  {
        $row = [];
        foreach ( $tr->getElementsByTagName("td") as $td )  {
            $row[] = trim($td->textContent);
        }
        $rows[] = $row;
    }
    // Stage 2: for each column index of the first row, pull that column from all rows
    $aDataTableDetailHTML = [];
    foreach ( $rows[0] as $col => $value )  {
        $aDataTableDetailHTML[] = array_column($rows, $col);
    }
    echo json_encode($aDataTableDetailHTML);

    Which with the test data gives...
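
For anyone who wants to try the two stages end to end, here is a minimal self-contained sketch. It does not use the asker's original data: the sample table, the $html variable and the transposeTable() helper name are all made up for illustration, and it assumes the markup is loaded with loadHTML() and that every row has the same number of <td> cells.

```php
<?php
// Hypothetical sample table (NOT the table from the question).
$html = '<table>
  <tr><td>Name</td><td>Age</td></tr>
  <tr><td>Ann</td><td>31</td></tr>
  <tr><td>Bob</td><td>45</td></tr>
</table>';

// Illustrative helper: returns the table's columns as rows of a 2d array.
function transposeTable(string $html): array
{
    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Stage 1: one array of trimmed cell texts per <tr>.
    $rows = [];
    foreach ($dom->getElementsByTagName('tr') as $tr) {
        $row = [];
        foreach ($tr->getElementsByTagName('td') as $td) {
            $row[] = trim($td->textContent);
        }
        $rows[] = $row;
    }
    if ($rows === []) {
        return []; // no <tr> elements found
    }

    // Stage 2: for each column index in the first row,
    // collect that index from every row.
    $columns = [];
    foreach (array_keys($rows[0]) as $col) {
        $columns[] = array_column($rows, $col);
    }
    return $columns;
}

echo json_encode(transposeTable($html)), "\n";
```

With this sample table the script prints [["Name","Ann","Bob"],["Age","31","45"]]. One caveat: array_column() silently skips rows that lack the requested index, so a table with ragged rows would produce misaligned columns.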
