
Perl: How to consider next XML tag as child tag of previous one?

In the following data file, I want to treat each <Field> tag as a child of the preceding <Register>, and each <Register> as a child of the preceding <Partition>. Basically, I am trying to extract the details of each <Partition> together with its corresponding <Register> and <Field> elements. Since all these tags are siblings rather than nested in a parent-child relationship, how can I get my desired output?

Since the file is very large, I do not want to restructure it into a parent-child hierarchy by hand, as that would require find/replace and manual intervention.












I am using the XML::Twig package, and here is my code snippet:

foreach my $register ( $twig->get_xpath('//Register') ) {    # get each <Register>
    #print $register, "\n";
    my $reg_name        = $register->first_child('Name')->text;
    my $reg_abstract    = $register->first_child('Abstract')->text;
    my $reg_description = $register->first_child('Description')->text;

    # look for <Field> elements that are children of this <Register>
    foreach my $xml_field ( $register->get_xpath('Field') ) {
        my $reg_field_name     = $xml_field->first_child('Name')->text;
        my $reg_field_abstract = $xml_field->first_child('Abstract')->text;
        #print "$reg_field_name \n";
    }
}



  • As per your comment, if you want to rewrite the file with Register and Field elements as children of Partition elements, here is what you could do:

    The simplest solution loads the whole file in memory:

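    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Twig;

    my $test_file= 'test.xml';

    # each Register or Field is handed to child() once it is fully parsed
    XML::Twig->new( twig_handlers => { 'Register|Field' => \&child, },
                    pretty_print => 'indented',
                  )
              ->parsefile( $test_file)
              ->print;   # output the restructured document to STDOUT

    # move the element to the end of the closest preceding Partition
    sub child
      { my( $t, $child)= @_;
        $child->move( last_child => $child->prev_sibling( 'Partition'));
      }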
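
    As an illustration, here is a minimal sketch of the same handler idea run on a small inline document. The sample XML is hypothetical (it only assumes that Partition, Register and Field are all siblings), and the handler names nest_register and nest_field are made up for this example. Unlike the code above, this variant nests each Field under the preceding Register, which is an assumption about the output the question asks for:

    ```perl
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Twig;

    # Hypothetical flat input: Partition, Register and Field are all siblings.
    my $xml = <<'XML';
    <Root>
      <Partition><Name>P1</Name></Partition>
      <Register><Name>R1</Name></Register>
      <Field><Name>F1</Name></Field>
      <Field><Name>F2</Name></Field>
      <Partition><Name>P2</Name></Partition>
      <Register><Name>R2</Name></Register>
      <Field><Name>F3</Name></Field>
    </Root>
    XML

    my $twig = XML::Twig->new(
        twig_handlers => { Register => \&nest_register, Field => \&nest_field },
        pretty_print  => 'indented',
    );
    $twig->parse($xml);

    # sprint returns the serialized document as a string instead of printing it
    my $result = $twig->sprint;
    print $result;

    # move a Register to the end of the closest preceding Partition
    sub nest_register {
        my ( $t, $register ) = @_;
        my $partition = $register->prev_sibling('Partition') or return;
        $register->move( last_child => $partition );
    }

    # move a Field to the end of the last Register already placed
    # in the closest preceding Partition
    sub nest_field {
        my ( $t, $field ) = @_;
        my $partition = $field->prev_sibling('Partition') or return;
        my $register  = $partition->last_child('Register') or return;
        $field->move( last_child => $register );
    }
    ```

    This relies on handlers being called in document order: by the time a Field is handled, the Register before it has already been moved into its Partition. Elements that have no preceding Partition (or Register) are left where they are.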

    Since you mentioned that the file can be very large, below is a slightly more complex version that keeps only 2 Partition elements in memory (including the new children of the first one). Each time a Partition is parsed, it uses flush_up_to to flush the tree up to the previous Partition:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Twig;

    my $test_file= 'test.xml';

    XML::Twig->new( twig_handlers => { 'Partition' => \&parent,
                                       'Register|Field' => \&child,
                                     },
                    pretty_print => 'indented',
                  )
              ->parsefile( $test_file);

    # move the element to the end of the closest preceding Partition
    sub child
      { my( $t, $child)= @_;
        $child->move( last_child => $child->prev_sibling( 'Partition'));
      }

    # a new Partition means the previous one is complete: output it and free it
    sub parent
      { my( $t, $partition)= @_;
        if( my $prev_partition = $partition->prev_sibling( 'Partition'))
          { $t->flush_up_to( $prev_partition); }
      }

    Note that since flush_up_to is used, the rest of the tree is automatically flushed at the end of the parsing.

    If you need to write the XML to a specific file, instead of STDOUT, you can also pass a filehandle to flush_up_to.
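
    A minimal sketch of that filehandle variant follows. The inline sample XML and the output file name partitions_out.xml are hypothetical, chosen only to make the example self-contained. It also assumes, as the XML::Twig documentation describes, that the automatic flush at the end of the parsing goes to the same filehandle:

    ```perl
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use XML::Twig;

    # Hypothetical flat input; with a real file, use parsefile instead of parse.
    my $xml = <<'XML';
    <Root>
      <Partition><Name>P1</Name></Partition>
      <Register><Name>R1</Name></Register>
      <Field><Name>F1</Name></Field>
      <Partition><Name>P2</Name></Partition>
      <Register><Name>R2</Name></Register>
      <Field><Name>F2</Name></Field>
    </Root>
    XML

    my $out_file = 'partitions_out.xml';    # hypothetical output name
    open my $out, '>', $out_file or die "Cannot write $out_file: $!";

    XML::Twig->new( twig_handlers => { 'Partition'      => \&parent,
                                       'Register|Field' => \&child,
                                     },
                    pretty_print => 'indented',
                  )
              ->parse( $xml);

    close $out or die "Cannot close $out_file: $!";

    # move the element to the end of the closest preceding Partition
    sub child
      { my( $t, $child)= @_;
        $child->move( last_child => $child->prev_sibling( 'Partition'));
      }

    # flush everything up to the previous Partition to $out instead of STDOUT
    sub parent
      { my( $t, $partition)= @_;
        if( my $prev_partition = $partition->prev_sibling( 'Partition'))
          { $t->flush_up_to( $prev_partition, $out); }
      }
    ```

    Note that flush_up_to is only called once a second Partition is seen, so with a document that contains a single Partition nothing is flushed automatically; in that case you would need to print or flush the twig yourself after the parsing.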