
xmlstarlet append to a node in XML


I have an XML file with the following entries:

....
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
 ....

I would like to inject the following XML node in <analyzer type="index">:

<filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>

The expected result looks like this:

....
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100" multiValued="true">
    <analyzer type="index">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/>
      <filter class="solr.SynonymGraphFilterFactory" expand="true" ignoreCase="true" synonyms="synonyms.txt"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
 ....

To this end, I have tried using xmlstarlet like so:

xmlstarlet ed --inplace -s "//fieldType" -t elem -n "text_general" -i "//filter" -t attr -n "class" -v ""solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"" <file_name_here>

but this does not work; running it badly mangles my XML file. I am quite new to xmlstarlet and am having difficulty finding the correct syntax. I also suspect that the quoting in my attempt is wrong.


Solution

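  • You should be able to do this by first appending an empty filter element to the index analyzer with -s, and then adding the attributes to it one at a time with -i ... -t attr. Each -i targets filter[last()], because the element just appended is now the last filter in the analyzer: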

    xmlstarlet ed --inplace \
      -s '//analyzer[@type="index"]' -t elem -n filter \
      -i '//analyzer[@type="index"]/filter[last()]' -t attr -n class -v solr.NGramFilterFactory \
      -i '//analyzer[@type="index"]/filter[last()]' -t attr -n minGramSize -v 1 \
      -i '//analyzer[@type="index"]/filter[last()]' -t attr -n maxGramSize -v 20 \
      input.xml
    

    Another option is to use XSLT. I think it's a lot easier than building the element attribute by attribute on the command line:

    xmlstarlet tr so.xsl input.xml > output.xml
    

    XSLT 1.0 (so.xsl)

    <xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
      <xsl:output indent="yes"/>
      <xsl:strip-space elements="*"/>
    
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
    
      <xsl:template match="analyzer[@type='index']">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
          <filter class="solr.NGramFilterFactory" minGramSize="1" maxGramSize="20"/>
        </xsl:copy>
      </xsl:template>
      
    </xsl:stylesheet>