Tags: php, xpath, multidimensional-array, xml-parsing, simplexml

Parse XML and concatenate divided URL with SimpleXML and XPath


I have an XML snippet in which a URL is split into several parts, as shown below. I can parse it successfully with SimpleXML and XPath: my XPath queries return arrays of results, which I can then echo. No problems there! =)

However, I want to take it a step further, reducing the number of lines and improving my code. First, take a look at my code:

<?php

$xml_g='
<files>
    <file>
        <filename>itzafile</filename>
        <ext>.tar.gz</ext>
        <url protocol="http://">itzanexample.net/folder/subfolder/</url>
    </file>
    <file>
        <filename>itzavideo</filename>
        <ext>.mp4</ext>
        <url protocol="ftp://">itzanotherurl.com/videos/</url>
    </file>
</files>
';


function URLparts($xml) {
    $xmlData = simplexml_load_string($xml);

    // Each xpath() call returns an array of SimpleXMLElement objects, one per <file>.
    $protocol = $xmlData->xpath('//file/url/@protocol');
    $url = $xmlData->xpath('//file/url');
    $filename = $xmlData->xpath('//file/filename');
    $ext = $xmlData->xpath('//file/ext');

    echo 'The following is echoed from 4 separate arrays:'."\n\t".$protocol[0], $url[0], $filename[0], $ext[0]."\n\t".$protocol[1], $url[1], $filename[1], $ext[1]."\n\n"; // prints both complete URLs. Right!


    // These variables are arrays, so in theory it should be possible to build an array of arrays here:
    $completeURL = array($protocol, $url, $filename, $ext);
    // I've also tried this, but the problem is the same:
    // $completeURL = array(array($protocol), array($url), array($filename), array($ext));

    echo 'The following should be echoed from a two-dimensional array, but something fails:'."\n\t".$completeURL[0][0][0][0]."\n\n"; // Just prints "http://". WRONG! u.u
    /*
     * $completeURL[1][0][0][0] prints the domain + subfolder
     * $completeURL[0][1][0][0] prints ftp://
     * $completeURL[0][0][1][0] prints nothing...
     */

}

URLparts($xml_g);
?>

As you can see, I want to avoid joining the URL with echo $protocol[0], $url[0], $filename[0], $ext[0]. I'd like something simpler with fewer variables, such as $completeURL[0][0][0][0] (the first file node with all its corresponding parts), $completeURL[1][1][1][1] (the second file node with all its URL parts), etc.

But obviously I'm doing something wrong. Where is the error? It is presumably related to the multidimensional array I'm trying to create.


Solution
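
One way to look at the array from the question (a sketch under assumptions, not the only possible fix): array($protocol, $url, $filename, $ext) is indexed as [part][file], so $completeURL[0][1] is the protocol of the second file, and no single index selects one file with all of its parts. Any further [0] is applied to a SimpleXMLElement and, as far as I can tell, just yields the same node again, which would explain why $completeURL[0][0][0][0] prints only http://. Looping over each <file> node avoids the parallel arrays altogether. buildUrls() is a hypothetical helper name; the XML is the sample from the question.

```php
<?php
// Sample document from the question.
$xml = '
<files>
    <file>
        <filename>itzafile</filename>
        <ext>.tar.gz</ext>
        <url protocol="http://">itzanexample.net/folder/subfolder/</url>
    </file>
    <file>
        <filename>itzavideo</filename>
        <ext>.mp4</ext>
        <url protocol="ftp://">itzanotherurl.com/videos/</url>
    </file>
</files>';

// buildUrls() is a hypothetical helper name, not part of SimpleXML.
function buildUrls($xml)
{
    $data = simplexml_load_string($xml);
    $urls = array();

    // Visit one <file> at a time so all four parts come from the same node.
    foreach ($data->xpath('//file') as $file) {
        // Cast each SimpleXMLElement to a string before concatenating.
        $urls[] = (string) $file->url['protocol']
                . (string) $file->url
                . (string) $file->filename
                . (string) $file->ext;
    }

    return $urls;
}

$urls = buildUrls($xml);
echo implode("\n", $urls), "\n";
// prints:
// http://itzanexample.net/folder/subfolder/itzafile.tar.gz
// ftp://itzanotherurl.com/videos/itzavideo.mp4
```

With this shape, $urls[0] is the complete URL of the first file and $urls[1] that of the second, which is the "one index per file" access the question was after.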

  • If done this way:

    $completeURL = $xmlData->xpath('//file/url/@protocol | //file/url | //file/filename');
    echo $completeURL[2].$completeURL[1].$completeURL[0]."\n";
    echo $completeURL[5].$completeURL[4].$completeURL[3]."\n";
    

    The echo statements print this (without the extension, because //file/ext is not part of the query):

    http://itzanexample.net/folder/subfolder/itzafile
    ftp://itzanotherurl.com/videos/itzavideo
    

    Using the union operator | gives the intended result with PHP 5 + SimpleXML + XPath! HOWEVER, I don't understand WHY the results have to be echoed backwards (2, 1, 0; 5, 4, 3) instead of 0, 1, 2, 3, etc.

    I also don't get WHY the following expression gives the same result when echoed the same way:

    $completeURL = $xmlData->xpath('//file/filename | //file/url | //file/url/@protocol');
    

    This also solves the question, but I still don't know why these two expressions return the same thing. So the order doesn't matter? Why do the results need to be echoed backwards??

    So this is indeed an answer, but it needs to be explained. Does anybody know why?? THANKS! =)
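
A likely explanation, offered as a sketch rather than a definitive answer: the | operator produces a node-set, which XPath 1.0 treats as an unordered set, so the order of the operands carries no meaning. libxml2, the library behind SimpleXML, appears to hand the nodes back in document order. In document order each <filename> comes before its <url>, and the protocol attribute follows the <url> element that owns it. That gives filename, url, protocol for each file, which is why the indexes run 2, 1, 0 and then 5, 4, 3. The snippet below reuses the question's sample XML; nodeValues() is a hypothetical helper name, and relying on this ordering is an assumption about the implementation, not something the XPath specification promises.

```php
<?php
// Sample document from the question.
$xml = '
<files>
    <file>
        <filename>itzafile</filename>
        <ext>.tar.gz</ext>
        <url protocol="http://">itzanexample.net/folder/subfolder/</url>
    </file>
    <file>
        <filename>itzavideo</filename>
        <ext>.mp4</ext>
        <url protocol="ftp://">itzanotherurl.com/videos/</url>
    </file>
</files>';

$data = simplexml_load_string($xml);

// nodeValues() is a hypothetical helper: it casts every matched node to a string.
function nodeValues(array $nodes)
{
    $values = array();
    foreach ($nodes as $node) {
        $values[] = (string) $node;
    }
    return $values;
}

// The same three paths, written in opposite orders.
$a = nodeValues($data->xpath('//file/url/@protocol | //file/url | //file/filename'));
$b = nodeValues($data->xpath('//file/filename | //file/url | //file/url/@protocol'));

// Expected to be true when both results come back in document order.
var_dump($a === $b);

// Assumption: document order, so every group of 3 values is filename, url, protocol.
foreach (array_chunk($a, 3) as $parts) {
    echo implode('', array_reverse($parts)), "\n";
}
```

Because the grouping depends on the position of each node in the document, it breaks as soon as a <file> lacks one of the parts or the elements are reordered. Reading the parts from each <file> node individually does not have that dependency.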