php, dom-manipulation

Split a big XML file using PHP


I'm trying to split a big file into individual files, grouping each player's data by position index (first position, second position, and so on).

Sample XML:

<team>
    <player>
        <fname>Jon</fname>
        <lname>Doe</lname>
        <active>Y</active>
        <position>
            <name>forward</name>
            <salary>10 000</salary>
        </position>
        <position>
            <name>center</name>
            <salary>30 000</salary>
        </position> 
        <position>
            <name>power forward</name>
            <salary>15 000</salary>
        </position>         
        <contract>
            <position>forward</position>
            <type>common</type>
        </contract> 
        <contract>
            <position>center</position>
            <type>special</type>
        </contract>
        <contract>
            <position>power forward</position>
            <type>special</type>
        </contract>
        <address>common street</address>
        <age>25</age>       
    </player>
    <player>
        <fname>Average</fname>
        <lname>Joe</lname>
        <active>Y</active>
        <position>
            <name>shootingguard</name>
            <salary>25 000</salary>
        </position>
        <position>
            <name>sixth man</name>
            <salary>22 000</salary>
        </position> 
        <contract>
            <position>sixth man</position>
            <type>extra</type>
        </contract> 
        <contract>
            <position>shootingguard</position>
            <type>common</type>
        </contract> 
        <address>common random street</address>
        <age>21</age>       
    </player>
</team>

Expected output: firstposition.xml

<team>
    <player>
        <fname>Jon</fname>
        <lname>Doe</lname>
        <active>Y</active>
        <position>
            <name>forward</name>
            <salary>10 000</salary>
        </position>
        <contract>
            <position>forward</position>
            <type>common</type>
        </contract> 
        <address>common street</address>
        <age>25</age>       
    </player>   
    <player>
        <fname>Average</fname>
        <lname>Joe</lname>
        <active>Y</active>
        <position>
            <name>shootingguard</name>
            <salary>25 000</salary>
        </position>
        <contract>
            <position>shootingguard</position>
            <type>common</type>
        </contract>
        <address>common random street</address>
        <age>21</age>   
    </player>
</team>

secondposition.xml

<team>
    <player>
        <fname>Jon</fname>
        <lname>Doe</lname>
        <active>Y</active>
        <position>
            <name>center</name>
            <salary>30 000</salary>
        </position>
        <contract>
            <position>center</position>
            <type>special</type>
        </contract> 
        <address>common street</address>
        <age>25</age>       
    </player>
    <player>
        <fname>Average</fname>
        <lname>Joe</lname>
        <active>Y</active>
        <position>
            <name>sixth man</name>
            <salary>22 000</salary>
        </position> 
        <contract>
            <position>sixth man</position>
            <type>extra</type>
        </contract> 
        <address>common random street</address>
        <age>21</age>   
    </player>
</team>

thirdposition.xml

<team>
    <player>
        <fname>Jon</fname>
        <lname>Doe</lname>
        <active>Y</active>
        <position>
            <name>power forward</name>
            <salary>15 000</salary>
        </position>
        <contract>
            <position>power forward</position>
            <type>special</type>
        </contract> 
        <address>common street</address>
        <age>25</age>       
    </player>
</team>

Where am I with this? I can split the file and manipulate the pieces as needed, but I haven't been able to get the desired result.

First, I get all positions and contracts like this:

$positions = $player->getElementsByTagName("position");
$contracts = $player->getElementsByTagName("contract");

Then I iterate over each position (just reading the position name and pushing it into another array):

$tempPost = [];
foreach ($positions as $pos) {
    $tempPost[] = $pos->getElementsByTagName("name")->item(0)->nodeValue;
}

Then I iterate over $positions again, comparing the names and removing unnecessary tags as needed:

for ($i = 0; $i < $positions->length; $i++) {
    $c_post = $positions->item($i);
    $c_cont = $contracts->item($i);
    // First position: remove every other position.
    if ($c_post->getElementsByTagName("name")->item(0)->nodeValue != $tempPost[0]) {
        $c_post->parentNode->removeChild($c_post);
        // $positions is a live list, so step back one index after a removal.
        $i--;
    }
}

This works fine: I was able to remove every position except the first. The problem is removing the contracts based on the position name. If I remove one via $c_cont = $contracts->item($i), this obviously removes an arbitrary contract (whichever one happens to sit at that index). My goal is to look up the contracts by position name and remove those.

So my final question is: how can I look up entries inside $contracts by a criterion?

The idea I have in mind is:

$c_cont = the result of ( select from $contracts where position = $c_post->getElementsByTagName("name")->item(0)->nodeValue )

That is a visualization in SQL terms, but I'm not sure how to achieve it in PHP.

Once I have it, I can delete it the same way as above: $c_cont->parentNode->removeChild($c_cont);
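
One way to express the SQL-style lookup above in PHP is an XPath predicate run through DOMXPath. The following is only a sketch under assumptions, not a definitive answer: the trimmed $xml string and the $keep variable are hypothetical, and the query selects the contracts that do not match the kept position name so they can be removed in one pass.

```php
<?php
// Hypothetical sample data, trimmed from the XML in the question.
$xml = <<<XML
<team>
    <player>
        <fname>Jon</fname>
        <position><name>forward</name><salary>10 000</salary></position>
        <position><name>center</name><salary>30 000</salary></position>
        <contract><position>forward</position><type>common</type></contract>
        <contract><position>center</position><type>special</type></contract>
    </player>
</team>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);
$xpath = new DOMXPath($doc);

$player = $doc->getElementsByTagName('player')->item(0);

// Assumption: the position name to keep contains no double quote,
// because it is pasted directly into the XPath string below.
$keep = 'forward';

// Roughly "select from contracts where position != $keep".
// The path is relative, so it only sees <contract> children of $player.
$unwanted = $xpath->query('contract[position != "' . $keep . '"]', $player);

// The result of query() is not a live list, so removing inside foreach is safe.
foreach ($unwanted as $c_cont) {
    $c_cont->parentNode->removeChild($c_cont);
}

echo $doc->saveXML($player), "\n";
```

Using = instead of != in the predicate would return the matching contract itself, which is closer to the literal SQL visualization.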

Sorry for the long post and thanks in advance for the help.


Solution

  • In the end I just compared the values with two loops. I'm not sure this is the most efficient approach, but it solved my problem.
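
The post does not show the final code, so the following is only one possible reading of "two loops", sketched under assumptions. The helper names childElements and keepPositionAt, the 0-based $index, and the trimmed sample data are all hypothetical. The sketch collects direct children by hand because getElementsByTagName("position") also returns the position element nested inside each contract.

```php
<?php
// Hypothetical helper: direct child elements of $parent with the given tag.
// (getElementsByTagName() would also match <position> inside <contract>.)
function childElements(DOMElement $parent, string $tag): array
{
    $found = [];
    foreach ($parent->childNodes as $node) {
        if ($node instanceof DOMElement && $node->tagName === $tag) {
            $found[] = $node;
        }
    }
    return $found;
}

// Hypothetical helper: keep only the position at $index (0-based) and the
// contract that names it. Returns false if the player has no such position.
function keepPositionAt(DOMElement $player, int $index): bool
{
    $positions = childElements($player, 'position');
    $contracts = childElements($player, 'contract');
    if (!isset($positions[$index])) {
        return false;
    }
    $keep = trim($positions[$index]->getElementsByTagName('name')->item(0)->nodeValue);

    // Loop 1: drop every position except the one at $index.
    foreach ($positions as $i => $c_post) {
        if ($i !== $index) {
            $player->removeChild($c_post);
        }
    }
    // Loop 2: drop every contract whose <position> text differs from $keep.
    foreach ($contracts as $c_cont) {
        $name = trim($c_cont->getElementsByTagName('position')->item(0)->nodeValue);
        if ($name !== $keep) {
            $player->removeChild($c_cont);
        }
    }
    return true;
}

// Hypothetical sample data, trimmed from the XML in the question.
$xml = <<<XML
<team>
    <player>
        <fname>Jon</fname>
        <position><name>forward</name></position>
        <position><name>center</name></position>
        <contract><position>forward</position><type>common</type></contract>
        <contract><position>center</position><type>special</type></contract>
    </player>
    <player>
        <fname>Average</fname>
        <position><name>shootingguard</name></position>
        <position><name>sixth man</name></position>
        <contract><position>sixth man</position><type>extra</type></contract>
        <contract><position>shootingguard</position><type>common</type></contract>
    </player>
</team>
XML;

$doc = new DOMDocument();
$doc->loadXML($xml);

// Copy the live player list first so removals do not disturb the iteration.
foreach (iterator_to_array($doc->getElementsByTagName('player')) as $player) {
    if (!keepPositionAt($player, 1)) {
        // No position at this index: leave the player out of this file.
        $player->parentNode->removeChild($player);
    }
}

echo $doc->saveXML();
```

To produce one file per index, the same steps could be repeated on a freshly loaded document for index 0, 1, 2 and so on, saving each result with $doc->save() under a name of your choosing.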