Perl: Inserting an XML::Twig node with XML::Twig


I am comparing two XML files. If I find a node missing in one of the files I want to insert it into the other file. Here is how I have been trying it:

my $out_file = 'fbCI_report.xml';
open my $fh_out, '>>', $out_file or die "Can't open $out_file for writing: $!";

my $currentReport = XML::Twig->new( pretty_print => 'indented' );
$currentReport->parsefile($path_to_currentReport);
print "Loaded current report.\n";

my $newReport = XML::Twig->new( pretty_print => 'indented' );
$newReport->parsefile($path_to_newReport);
print "Loaded new report.\n";

my $currentRoot   = $currentReport->root;             # get the root
my $currentBuilds = $currentRoot->first_child();      # get the builds node
my $currentXCR    = $currentBuilds->first_child();    # get the xcr node

my $newRoot   = $newReport->root;                     # get the root
my $newBuilds = $newRoot->first_child();              # get the builds node
my $newXCR    = $newBuilds->first_child();            # get the xcr node

my @currentXCRarray = $currentBuilds->children('xcr');
my @newXCRarray     = $newBuilds->children('xcr');
my $numberOfxcr     = $newBuilds->children_count();

foreach my $currentXCRmod ( @currentXCRarray ) {

    my $currentID = $currentXCRmod->att("id");

    foreach my $newXCRmod (@newXCRarray) {

        my $newID = $newXCRmod->att("id");

        if ( $newID == $currentID ) {
            last;
        }
        elsif ( $count == $numberOfxcr && $newID != $currentID ) {
            my $insert = $currentBuilds->insert_new_elt($newXCRmod);
            print "XCR does not exist in current report, adding it..\n";
        }

        $count++;
    }
}

print $fh_out $currentReport->sprint();
close $fh_out;

However, this does not insert the node with its children. Instead it inserts what I guess is a reference to the node: <XML::Twig::Elt=HASH(0x326efe0)/>. Is there a way to insert the node properly? I have not found anything about this in the documentation on CPAN.

Sample data, current.xml:

<project>
  <builds>
    <xcr id="13367" buildable="false">
        <artifact name="rb"/>
        <artifact name="syca"/>
    </xcr>
    <xcr id="13826" buildable="false">
        <artifact name="dcs"/>
    </xcr>
  </builds>
</project>

new.xml:

<project>
<builds>
    <xcr id="13367" buildable="false">
        <artifact name="rb"/>
        <artifact name="syca"/>
    </xcr>
    <xcr id="13826" buildable="false">
        <artifact name="dcs"/>
    </xcr>
    <xcr id="10867" buildable="true">
        <artifact name="smth"/>
        <artifact name="top"/>
        <artifact name="tree"/>
    </xcr>
</builds>
</project>

Solution

  • You're right - that's the stringified form of an XML::Twig::Elt object.

    The problem is that insert_new_elt creates a new element, and it expects a tag name as its argument. Passing it an element object stringifies the object (XML::Twig::Elt=HASH(0x326efe0)) and creates a new, empty node with that string as its tag.

    But you don't want a new element - you want to copy an existing one.
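
    For contrast, here is a minimal sketch of what insert_new_elt is meant for: building a new element from a tag name and an optional attribute hash. The inline XML string and the variable names are made up for illustration and are not taken from your reports.

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    # Tiny stand-in for the current report (assumed data, not the real file).
    my $twig = XML::Twig->new;
    $twig->parse('<project><builds><xcr id="13367"/></builds></project>');

    my $builds = $twig->root->first_child('builds');

    # insert_new_elt takes an optional position, a tag name and an optional
    # attribute hash. It always creates a fresh element from those pieces.
    my $new_elt = $builds->insert_new_elt( last_child => 'xcr', { id => '10867' } );

    print $new_elt->tag, ' ', $new_elt->att('id'), "\n";
    ```

    The new element has the tag and attributes you asked for, but no children, which is why this method is not a good fit for moving an existing subtree.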

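    So I would suggest copying the element from the new report and pasting the copy:

    my $copied_elt = $newXCRmod -> copy;
    $copied_elt -> paste ( last_child => $currentBuilds );
    

    This pastes a copy of the element, children included, as the last child of $currentBuilds and leaves the original in place in the new report.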
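
    As a self-contained illustration of copy and paste between two twigs, here is a small sketch. It parses two inline XML strings instead of your files; the strings and variable names are assumptions made for the example.

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    # Stand-ins for the current and new reports (assumed data).
    my $current = XML::Twig->new;
    $current->parse('<project><builds><xcr id="13367"/></builds></project>');

    my $new = XML::Twig->new;
    $new->parse( '<project><builds><xcr id="13367"/>'
               . '<xcr id="10867"><artifact name="top"/></xcr>'
               . '</builds></project>' );

    my $cur_builds = $current->root->first_child('builds');

    # The element that exists only in the new report.
    my $missing = $new->root->first_child('builds')->last_child('xcr');

    # copy makes a deep copy (children included); paste attaches that copy
    # to the other twig. The original stays where it was.
    my $copy = $missing->copy;
    $copy->paste( last_child => $cur_builds );

    print $current->sprint, "\n";
    ```

    Pasting the copy rather than the original means the new report is left unchanged, which matters if you still need to iterate over it.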

    Your loop could be improved too. I would suggest using a twig_handlers callback that records which IDs exist in the file as it is parsed:

    my %seen_id; 
    sub collect_ids {
       my ( $twig, $element ) = @_;
       $seen_id { $element->att('id') } ++; 
    } 
    

    Then register it when creating the twig, so that it is called at parse time:

    my $currentReport = XML::Twig->new(twig_handlers => { 'xcr' => \&collect_ids}, 
                                       pretty_print=>'indented');
    $currentReport->parsefile($path_to_currentReport);
    

    %seen_id then tells you which IDs already exist in the current report, so you can copy only the ones that are missing.
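
    Putting the handler and the copy step together might look roughly like this. It is a sketch under assumptions: the reports are replaced by inline XML strings, and @added is a hypothetical helper array used only to show what was copied.

    ```perl
    use strict;
    use warnings;
    use XML::Twig;

    my %seen_id;

    # Record every xcr id while the current report is being parsed.
    my $current = XML::Twig->new(
        twig_handlers => {
            xcr => sub {
                my ( $twig, $element ) = @_;
                $seen_id{ $element->att('id') }++;
                return 1;    # explicit true value, so other handlers are not skipped
            },
        },
    );
    $current->parse( '<project><builds><xcr id="13367"/><xcr id="13826"/>'
                   . '</builds></project>' );

    my $new = XML::Twig->new;
    $new->parse( '<project><builds><xcr id="13367"/><xcr id="13826"/>'
               . '<xcr id="10867"><artifact name="tree"/></xcr>'
               . '</builds></project>' );

    my $cur_builds = $current->root->first_child('builds');
    my @added;    # hypothetical bookkeeping, only for this example

    for my $xcr ( $new->root->first_child('builds')->children('xcr') ) {
        my $id = $xcr->att('id');
        next if $seen_id{$id};    # already in the current report
        $xcr->copy->paste( last_child => $cur_builds );
        push @added, $id;
    }

    print "added: @added\n";
    ```

    A hash lookup per element replaces the nested loop, so each report is walked only once.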

    Or alternatively (based on your XML sample so far):

    #!/usr/bin/env perl
    
    use strict;
    use warnings 'all';
    
    use Data::Dumper;
    use XML::Twig;
    
    my $current = XML::Twig -> new ( ) -> parsefile ('test1.xml');
    my $new = XML::Twig -> new (  ) -> parsefile ( 'test2.xml'); 
    
    my $cur_builds = $current -> root -> get_xpath('./builds',0);
    
    foreach my $xcr ( $new -> findnodes('//xcr') ) {
       my $id = $xcr -> att('id'); 
       if ( not $current -> findnodes("//xcr[\@id=\"$id\"]") ) {
          print "$id not in current, copying\n"; 
          my $copy = $xcr -> copy; 
          $copy -> paste ( last_child => $cur_builds ); 
       }
    }
    
    $current -> set_pretty_print('indented_a');
    $current -> print;