ibm-watson watson-explorer

How do I index XML data within a content node?


I'm converting an XML document and want to store its entire contents in a single content node in the converter.

<xsl:template match="/">
  <vce>
    <document>
      <content name="xml">
        <xsl:copy-of select="." />
      </content>
    </document>
  </vce>
</xsl:template>

This gives me a content node named "xml" that contains my entire XML document. However, the node's contents are removed when the normalization converter runs. Is there something special I need to do to index XML inside a content node?


Solution

  • I was able to use the 'vse-converter-xml-to-vxml' converter as a reference to create a template that writes the XML out as plain text, which can then be indexed:

    <xsl:template match="/">
    
      <vce>
        <document>
          <content name="xml">
            <xsl:apply-templates select="*" mode="xml-to-plain-text" />
          </content>
        </document>
      </vce>
    </xsl:template>
    
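    <!-- Write each element out as literal text rather than as markup -->
    <xsl:template match="*" mode="xml-to-plain-text">
      <!-- Opening tag: "<name " (attributes are not written) -->
      <xsl:text><![CDATA[<]]></xsl:text>
      <xsl:value-of select="name()" />
      <xsl:text> </xsl:text>
      <xsl:choose>
        <!-- Element has content: close the opening tag, recurse, then write the closing tag -->
        <xsl:when test="text()|*|comment()">
          <xsl:text>></xsl:text>
          <xsl:apply-templates select="text()|*|comment()" mode="xml-to-plain-text" />
          <xsl:text><![CDATA[</]]></xsl:text>
          <xsl:value-of select="name()" />
          <xsl:text>></xsl:text>
        </xsl:when>
        <!-- Empty element: write it as self-closing -->
        <xsl:otherwise>
          <xsl:text>/></xsl:text>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:template>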
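
    Note that the template above writes element names and text only: attributes are never output, and comments are selected but fall through to XSLT's built-in rule, which emits nothing for them. If those should also end up in the indexed text, a variant along the following lines might work. This is a minimal sketch, not copied from 'vse-converter-xml-to-vxml'; the `@*` and `comment()` templates are additions for illustration, and the `xsl:stylesheet` wrapper is only there so the sketch can be tried with any XSLT 1.0 processor. Inside a converter you would presumably keep just the templates.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <xsl:stylesheet version="1.0"
                    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

      <xsl:template match="/">
        <vce>
          <document>
            <content name="xml">
              <xsl:apply-templates select="*" mode="xml-to-plain-text" />
            </content>
          </document>
        </vce>
      </xsl:template>

      <!-- Elements: write the tag as text, now including its attributes -->
      <xsl:template match="*" mode="xml-to-plain-text">
        <xsl:text>&lt;</xsl:text>
        <xsl:value-of select="name()" />
        <xsl:apply-templates select="@*" mode="xml-to-plain-text" />
        <xsl:choose>
          <xsl:when test="text()|*|comment()">
            <xsl:text>&gt;</xsl:text>
            <xsl:apply-templates select="text()|*|comment()" mode="xml-to-plain-text" />
            <xsl:text>&lt;/</xsl:text>
            <xsl:value-of select="name()" />
            <xsl:text>&gt;</xsl:text>
          </xsl:when>
          <xsl:otherwise>
            <xsl:text>/&gt;</xsl:text>
          </xsl:otherwise>
        </xsl:choose>
      </xsl:template>

      <!-- Attributes (added for illustration): write ' name="value"' -->
      <xsl:template match="@*" mode="xml-to-plain-text">
        <xsl:text> </xsl:text>
        <xsl:value-of select="name()" />
        <xsl:text>="</xsl:text>
        <xsl:value-of select="." />
        <xsl:text>"</xsl:text>
      </xsl:template>

      <!-- Comments (added for illustration): write them back as text -->
      <xsl:template match="comment()" mode="xml-to-plain-text">
        <xsl:text>&lt;!--</xsl:text>
        <xsl:value-of select="." />
        <xsl:text>--&gt;</xsl:text>
      </xsl:template>

      <!-- Text nodes need no template: the built-in rule copies them in any mode -->
    </xsl:stylesheet>
    ```

    For an input such as `<a id="1"><b/>hi</a>`, the content node should then hold the text `<a id="1"><b/>hi</a>`. Two limitations of this sketch: quotes inside attribute values are written as-is, and namespace declarations are not reproduced, so the text is meant for indexing rather than for parsing back into XML.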