regex perl pattern-matching

Regex in xml not working as expected


I am a novice here, and I am trying to search and replace a string in an XML file, shown below:

<name>xxx.yyy_zzz</name>
      <constructor_arguments />
      <parameters>
        <parameter>
          <name>Name</name>
          <string>
            <value>yyy</value>
          </string>
        </parameter>
        <parameter>
          <name>abc</name>
          <bool>
            <value>false</value>
          </bool>
        </parameter>
        <parameter>
          <name>abcd</name>
          <bool>
            <value>true</value>
          </bool>
        </parameter>
        <parameter>
          <name>aa</name>
          <integer>
            <value>10</value>
          </integer>
        </parameter>
        <parameter>
          <name>bb</name>
          <integer>
            <value>100</value>
          </integer>
        </parameter>
        <parameter>
          <name>runtime_disabled</name>
          <bool>
            <value>false</value>
          </bool>

I tried the following to change the runtime_disabled value to true, but the value did not change as I expected. Can anyone explain why, and suggest a solution that works?

$data=~ s/(xxx\.i\yyy\_zzz\s*?<\/name>(.+?)runtime_disabled\s*?<\/name>\s*?<bool>\s*?<value>\s*?<value>.*?<\/value>)/$1true$2/g;

Solution

  • Parsing well-formed XML should practically always be done using modules. Much has been said about that over time, see for example this page and this page, among many others. I also hear that truly bad things may come to pass otherwise.

    Here is an example of how to do what you ask using XML::LibXML. Another excellent module is XML::Twig. I completed your sample to make it a well-formed XML file (shown below the code).

    use strict;
    use warnings;
    
    use XML::LibXML;    
    
    my $file = 'file.xml';
    
    my $doc = XML::LibXML->load_xml(location => $file, no_blanks => 1); 
    
    # Form the XPath expression used to find the 'name' nodes
    my $xpath = '//doc/constructor_arguments/parameters/parameter/name';
    
    foreach my $node ($doc->findnodes($xpath)) 
    {
        # Select the particular 'name' tag that is needed
        if ($node->to_literal eq 'runtime_disabled') 
        {   
            # Query the parent node, for <name>'s sibling <bool>
            foreach my $bval ($node->parentNode->findnodes('./bool/value/text()')) 
            {
                my $content = $bval->toString;
                $content =~ s/false/true/;
                $bval->setData($content);
            }
        }   
    }
    
    # Write a new file with the change
    $doc->toFile('changed_' . $file, 1); 
    

    This is meant to be basic; there are quite a few other ways to do the same.
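
    One of those other ways, as a minimal sketch rather than a definitive implementation: the loop and the `eq` test can be folded into a single XPath predicate. The helper name `set_bool_parameter` and the small inline sample are made up for illustration, and the parameter name is assumed to contain no double quotes. Because the expression starts at `//parameter`, it does not depend on which elements enclose `<parameters>`.

    ```perl
    use strict;
    use warnings;

    use XML::LibXML;

    # Hypothetical helper (not from the original answer): set the <bool><value>
    # text of the parameter called $name to $new_value and return the new XML.
    sub set_bool_parameter {
        my ($xml, $name, $new_value) = @_;

        my $doc = XML::LibXML->load_xml(string => $xml);

        # The predicate [name = "..."] keeps only the matching <parameter>.
        # Assumption: $name contains no double quotes.
        my $xpath = qq{//parameter[name = "$name"]/bool/value/text()};

        for my $text ($doc->findnodes($xpath)) {
            $text->setData($new_value);
        }

        return $doc->toString;
    }

    # A small stand-in for the real file, for illustration only
    my $sample = q{<doc>
      <parameters>
        <parameter>
          <name>abc</name>
          <bool><value>false</value></bool>
        </parameter>
        <parameter>
          <name>runtime_disabled</name>
          <bool><value>false</value></bool>
        </parameter>
      </parameters>
    </doc>};

    print set_bool_parameter($sample, 'runtime_disabled', 'true');
    ```

    Only the `runtime_disabled` parameter is touched; other `<bool>` parameters such as `abc` keep their values.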

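    The posted sample, saved as file.xml, was padded as follows. Note that the closing </constructor_arguments> tag here, and the XPath in the code, assume that the self-closing <constructor_arguments /> in the posted text is changed to an opening <constructor_arguments> tag.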

    <doc>
    ... posted text ...
                </parameter>
            </parameters>
        </constructor_arguments>
    </doc>
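
    For comparison, here is a rough sketch of the same change with XML::Twig, the other module mentioned above. It is not taken from the original answer: the helper name `enable_runtime_disabled` and the inline sample are hypothetical, and it assumes each `<parameter>` holds its `<name>` and `<bool>` as direct children, as in the posted XML.

    ```perl
    use strict;
    use warnings;

    use XML::Twig;

    # Hypothetical helper (not from the original answer): return the XML with
    # the <bool><value> of the 'runtime_disabled' parameter set to 'true'.
    sub enable_runtime_disabled {
        my ($xml) = @_;

        my $twig = XML::Twig->new(
            twig_handlers => {
                # Called once for each fully parsed <parameter> element
                parameter => sub {
                    my ($t, $param) = @_;
                    if ($param->first_child_text('name') eq 'runtime_disabled') {
                        my $bool  = $param->first_child('bool');
                        my $value = $bool ? $bool->first_child('value') : undef;
                        $value->set_text('true') if $value;
                    }
                    return 1;
                },
            },
        );

        $twig->parse($xml);
        return $twig->sprint;
    }

    # A small stand-in for the real file, for illustration only
    my $twig_sample = q{<doc>
      <parameters>
        <parameter>
          <name>abc</name>
          <bool><value>false</value></bool>
        </parameter>
        <parameter>
          <name>runtime_disabled</name>
          <bool><value>false</value></bool>
        </parameter>
      </parameters>
    </doc>};

    print enable_runtime_disabled($twig_sample), "\n";
    ```

    The handler approach keeps the selection logic in plain Perl, which may be easier to extend than an XPath string if more conditions are needed later.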