Problem with self-closing DIV tags with XRechnung Visualizer and Saxon-HE for .NET


I want to implement an XRechnung visualizer in .NET/C# using itplr-kosit's xrechnung-visualization to transform XRechnung to HTML. As the XSLT processor I use Saxonica's Saxon-HE. I'm now struggling with invalid HTML output in the form of self-closing DIV tags.

The transformation code is as follows:

public static string TransformXml(string xmlData, string xslData)
{
    var xsltProcessor = new Saxon.Api.Processor();

    var documentBuilder = xsltProcessor.NewDocumentBuilder();
    documentBuilder.BaseUri = new Uri("file://");
    var xdmNode = documentBuilder.Build(new StringReader(xmlData));

    var xsltCompiler = xsltProcessor.NewXsltCompiler();
    var xsltExecutable = xsltCompiler.Compile(new StringReader(xslData));
    var xsltTransformer = xsltExecutable.Load();
    xsltTransformer.InitialContextNode = xdmNode;
    
    var results = new Saxon.Api.XdmDestination();
    xsltTransformer.Run(results);

    return results.XdmNode.OuterXml;
}

    

The calls:

var xmlData = File.ReadAllText(Path.Combine(Directory.GetCurrentDirectory(), "xrechnung.xml"));
var xslDataToXR = File.ReadAllText(Path.Combine(Directory.GetCurrentDirectory(), "cii-xr.xsl"));
var xslDataToHTML = File.ReadAllText(Path.Combine(Directory.GetCurrentDirectory(), "xrechnung-html.xsl"));

var xrXMLData = TransformXml(xmlData, xslDataToXR);
var htmlData = TransformXml(xrXMLData, xslDataToHTML);

File.WriteAllText(Path.Combine(Directory.GetCurrentDirectory(), "result.html"), htmlData);

This works, except that in the resulting HTML every field that has no value is rendered as a self-closing DIV tag.

For example, the following snippet out of the xrechnung-html.xsl...

<div class="boxzeile">
  <div class="boxdaten legende">Postfach:</div>
  <div id="BT-51" title="BT-51" class="boxdaten wert"><xsl:value-of select="xr:BUYER_POSTAL_ADDRESS/xr:Buyer_address_line_2"/></div>
</div>

... is rendered as the following HTML, because the XML doesn't provide a value for Buyer_address_line_2:

<div class="boxzeile">
  <div class="boxdaten legende">Postfach:</div>
  <div id="BT-51" title="BT-51" class="boxdaten wert"/>
</div>

The browser interprets the self-closing DIV as an opening tag, which breaks the whole view.

Any ideas?


Solution

  • XdmNode.OuterXml always produces XML syntax, where an empty element may be written as <div/>. If you let Saxon do the serialization instead, that is, you don't use an XdmDestination but write directly to a file, stream, or string writer through a Serializer, then I am sure it honours the HTML serialization rules. With XML and XSLT I would generally recommend leaving input parsing and output serialization to the XML parser and XSLT processor as much as possible, rather than reading strings from files or writing strings to files with the File APIs.
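
    A minimal sketch of that idea for the final HTML step, assuming the Saxon 9.x/10 .NET API already used in the question plus Processor.NewSerializer and Serializer.SetOutputWriter. The names TransformToHtml and baseUri are hypothetical, and I have not run this against the KoSIT stylesheets:

    ```csharp
    using System;
    using System.IO;
    using Saxon.Api;

    public static class XRechnungHtml
    {
        // Hypothetical helper: same steps as TransformXml in the question,
        // but the result goes to a Saxon Serializer instead of an XdmDestination.
        public static string TransformToHtml(string xmlData, string xslData, Uri baseUri)
        {
            var processor = new Processor();

            // Building from a TextReader needs a base URI; the caller supplies it (assumption).
            var documentBuilder = processor.NewDocumentBuilder();
            documentBuilder.BaseUri = baseUri;
            XdmNode source = documentBuilder.Build(new StringReader(xmlData));

            var compiler = processor.NewXsltCompiler();
            var transformer = compiler.Compile(new StringReader(xslData)).Load();
            transformer.InitialContextNode = source;

            using (var writer = new StringWriter())
            {
                Serializer serializer = processor.NewSerializer();
                serializer.SetOutputWriter(writer);
                // Ask for the html output method explicitly, in case the stylesheet
                // does not already declare it via xsl:output.
                serializer.SetOutputProperty(Serializer.METHOD, "html");

                // Saxon serializes the result tree itself, so an empty div
                // should be written as <div ...></div> rather than <div .../>.
                transformer.Run(serializer);
                return writer.ToString();
            }
        }
    }
    ```

    Only the second call (xrechnung-html.xsl) needs to go through the serializer; the intermediate result of cii-xr.xsl is XML and can stay as it is.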

    As you seem to want to chain two transformations I guess using

      var xslt1 = xsltExecutable1.Load30();
      var xslt2 = xsltExecutable2.Load30();
      
      using (var inputStream  = File.OpenRead(Path.Combine(Directory.GetCurrentDirectory(), "xrechnung.xml"))) {
        // File.Create truncates an existing result.html; File.OpenWrite would leave stale trailing bytes
        using (var resultStream = File.Create(Path.Combine(Directory.GetCurrentDirectory(), "result.html"))) {
          xslt1.Transform(inputStream , xslt2.AsDocumentDestination(xslt2.NewSerializer(resultStream)));
        }
      }
    

    is a viable approach.

    Of course, chaining two stylesheets is also possible directly in XSLT 3:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">
    
      <xsl:param name="step1-uri" as="xs:string">https://github.com/itplr-kosit/xrechnung-visualization/raw/master/src/xsl/ubl-invoice-xr.xsl</xsl:param>
      <xsl:param name="step2-uri" as="xs:string">https://github.com/itplr-kosit/xrechnung-visualization/raw/master/src/xsl/xrechnung-html.xsl</xsl:param>
    
      <xsl:output method="html" indent="yes" html-version="5"/>
    
      <xsl:template match="/">
          <xsl:sequence
             select="transform(map {
                       'source-node' : .,
                       'stylesheet-location' : $step1-uri
                     })?output ! transform(map { 
                                   'source-node' : ., 
                                   'stylesheet-location' : $step2-uri 
                                 })?output"/>
      </xsl:template>
      
    </xsl:stylesheet>
    

    In general one could use fold-left to chain a sequence of stylesheets:

    <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:xs="http://www.w3.org/2001/XMLSchema"
        exclude-result-prefixes="#all"
        version="3.0">
    
      <xsl:param name="step1-uri" as="xs:string">https://github.com/itplr-kosit/xrechnung-visualization/raw/master/src/xsl/ubl-invoice-xr.xsl</xsl:param>
      <xsl:param name="step2-uri" as="xs:string">https://github.com/itplr-kosit/xrechnung-visualization/raw/master/src/xsl/xrechnung-html.xsl</xsl:param>
    
      <xsl:param name="xslt-locations" as="xs:string*" select="$step1-uri, $step2-uri"/>
    
      <xsl:output method="html" indent="yes" html-version="5"/>
    
      <xsl:template match="/">
          <xsl:sequence
             select="fold-left(
                       $xslt-locations,
                       .,
                       function($doc, $xslt-location) { 
                         transform(map { 'source-node' : $doc, 'stylesheet-location' : $xslt-location })?output
                       }
                    )"/>
      </xsl:template>
      
    </xsl:stylesheet>