
XSLT - Convert images (and pdf) to base64


I use Apache FOP 2.8 to transform an Apache FOP Intermediate Format (IF) file into an HTML file with a self-written XSLT stylesheet.

The only external library I currently have installed is Saxon 12 HE.

Problem #1 (image to base64)

The source IF document contains image XML elements that look like this:

<image xlink:href="files\Logo.png"/>

It would be easy to convert this to HTML and get an output like

<img src="files\Logo.png"/>

when using a template like:

<xsl:template match="image">
    <xsl:variable name="file-path" select="@xlink:href"/>
    <img src="{$file-path}"/>
</xsl:template>

The problem is that the generated HTML file is not standalone: the files directory containing Logo.png has to sit next to the HTML file so that the image path files\Logo.png resolves.

What I want is a standalone HTML file.

So is there a way to convert Logo.png to Base64, ideally with a simple function call like:

<xsl:template match="image">
    <xsl:variable name="file-path" select="@xlink:href"/>
    <img src="{to-base64($file-path)}"/>
</xsl:template>

to create an output like:

<img src="...."/>

Problem #2 (pdf to base64)

Another tricky part is that in the intermediate format the xlink:href can also point to a .pdf file:

<image xlink:href="files\Table_1234.pdf"/>

It would be great if this could also be transformed into a base64 image in the same way as above.

Maybe there is another way to make the HTML document standalone, but converting to base64 is the only idea I have so far.
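Regarding the PDF case, a data URI needs a MIME type in front of the base64 payload, and most browsers do not render application/pdf inside an img element, so a PDF would probably have to go into an object or embed element or be rasterized first (for example with a library such as Apache PDFBox, not shown here). The following is only a sketch under those assumptions: the class DataUriPrefix, its method names and the list of extensions are hypothetical and not part of Saxon or Apache FOP.

```java
import java.util.Locale;

/** Hypothetical helper; the extension-to-MIME mapping is an illustrative subset. */
final class DataUriPrefix {
    private DataUriPrefix() {}

    /** Guesses a MIME type from the file extension of an href such as files\Logo.png. */
    static String mimeTypeFor(String href) {
        int dot = href.lastIndexOf('.');
        String ext = dot < 0 ? "" : href.substring(dot + 1).toLowerCase(Locale.ROOT);
        switch (ext) {
            case "png":  return "image/png";
            case "jpg":
            case "jpeg": return "image/jpeg";
            case "gif":  return "image/gif";
            case "svg":  return "image/svg+xml";
            case "pdf":  return "application/pdf";
            default:     return "application/octet-stream";
        }
    }

    /** Builds a data URI from a MIME type and an already base64-encoded payload. */
    static String dataUri(String mimeType, String base64) {
        return "data:" + mimeType + ";base64," + base64;
    }

    /** Only image/* types are candidates for an img element. */
    static boolean fitsImgElement(String mimeType) {
        return mimeType.startsWith("image/");
    }
}
```

With this, files\Logo.png maps to image/png and fits an img element, while files\Table_1234.pdf maps to application/pdf and does not, so the stylesheet could choose the output element based on that result.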

Approach 1 (Saxon Java extension function)

I tried creating a Java extension function for Saxon 12 HE, following this documentation.

So I implemented an ExtensionFunctionDefinition:

import net.sf.saxon.expr.XPathContext;
import net.sf.saxon.lib.ExtensionFunctionCall;
import net.sf.saxon.lib.ExtensionFunctionDefinition;
import net.sf.saxon.om.Sequence;
import net.sf.saxon.om.StructuredQName;
import net.sf.saxon.trans.XPathException;
import net.sf.saxon.value.SequenceType;
import net.sf.saxon.value.StringValue;

public class ImageToBase64 extends ExtensionFunctionDefinition {
    @Override
    public StructuredQName getFunctionQName() {
        return new StructuredQName("ext", "http://example.com/saxon-extension", "imageToBase64");
    }

    @Override
    public SequenceType[] getArgumentTypes() {
        return new SequenceType[]{SequenceType.SINGLE_STRING};
    }

    @Override
    public SequenceType getResultType(SequenceType[] suppliedArgumentTypes) {
        return SequenceType.SINGLE_STRING;
    }

    @Override
    public ExtensionFunctionCall makeCallExpression() {
        return new ExtensionFunctionCall() {
            @Override
            public Sequence call(XPathContext context, Sequence[] arguments) throws XPathException {
                var filePath = ((StringValue)arguments[0]).getStringValue();
                // open file and convert to base64 string
                var resultBase64 = "12345";
                return StringValue.makeStringValue(resultBase64);
            }
        };
    }
}

The documentation says that "the classes that implement these extension functions must be registered with the Configuration" and that this can be "achieved by subclassing net.sf.saxon.Transform or net.sf.saxon.Query, overriding the method applyLocalOptions() so that it makes the appropriate calls on config.registerExtensionFunction();", so I also added a class that extends net.sf.saxon.Transform:

import net.sf.saxon.Transform;
import net.sf.saxon.trans.CommandLineOptions;

public class Configuration extends Transform {
    @Override
    protected void applyLocalOptions(CommandLineOptions options, net.sf.saxon.Configuration config) {
        config.registerExtensionFunction(new ImageToBase64());
        super.applyLocalOptions(options, config);
    }
}

When I built the artifact to get the jar file (I use IntelliJ), I only included the "compile output", so the resulting jar is only about 3 KB.

Then I dropped the jar into the lib folder of Apache FOP, next to saxon-he-12.2.jar, and added xmlns:ext="http://example.com/saxon-extension" to the xsl:stylesheet element.

But when I now call

<xsl:value-of select="ext:imageToBase64('my/file/path')"/>

I get the error: net.sf.saxon.trans.XPathException: Cannot find a 1-argument function named Q{http://example.com/saxon-extension}imageToBase64()
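One possible explanation, assuming FOP obtains its XSLT processor through the JAXP TransformerFactory lookup rather than through the net.sf.saxon.Transform command line, is that applyLocalOptions() never runs and the function is therefore never registered. A small diagnostic like the following sketch could show which factory JAXP actually resolves on a given classpath. The class WhichTransformerFactory is hypothetical; only the standard javax.xml.transform API is used.

```java
import javax.xml.transform.TransformerFactory;

/** Hypothetical diagnostic, not part of Apache FOP or Saxon. */
final class WhichTransformerFactory {
    private WhichTransformerFactory() {}

    /** Returns the class name of the factory that JAXP resolves right now. */
    static String activeFactoryClassName() {
        return TransformerFactory.newInstance().getClass().getName();
    }

    public static void main(String[] args) {
        // The system property, if set, takes part in the JAXP factory lookup.
        System.out.println("javax.xml.transform.TransformerFactory property: "
                + System.getProperty("javax.xml.transform.TransformerFactory"));
        System.out.println("resolved factory class: " + activeFactoryClassName());
    }
}
```

Run with the same classpath and JVM options as FOP, this prints the factory class that a plain TransformerFactory.newInstance() call would return.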


Solution

  • I got this to work with the help of @MartinHonnen, who told me to create my own extension function.

    So I created a new Java project (it's important to use Java 8) and added two classes:

    package ExtensionsPackage;
    
    import net.sf.saxon.expr.XPathContext;
    import net.sf.saxon.lib.ExtensionFunctionCall;
    import net.sf.saxon.lib.ExtensionFunctionDefinition;
    import net.sf.saxon.om.Sequence;
    import net.sf.saxon.om.StructuredQName;
    import net.sf.saxon.trans.XPathException;
    import net.sf.saxon.value.SequenceType;
    import net.sf.saxon.value.StringValue;
    
    public class ImageToBase64 extends ExtensionFunctionDefinition {
        @Override
        public StructuredQName getFunctionQName() {
            return new StructuredQName("ext", "http://example.com/saxon-extension", "imageToBase64");
        }
    
        @Override
        public SequenceType[] getArgumentTypes() {
            return new SequenceType[]{SequenceType.SINGLE_STRING};
        }
    
        @Override
        public SequenceType getResultType(SequenceType[] suppliedArgumentTypes) {
            return SequenceType.SINGLE_STRING;
        }
    
        @Override
        public ExtensionFunctionCall makeCallExpression() {
            return new ExtensionFunctionCall() {
                @Override
                public Sequence call(XPathContext context, Sequence[] arguments) throws XPathException {
                    String filePath = ((StringValue)arguments[0]).getStringValue();
                    // open file and convert to base64 string
                    String resultBase64 = "12345";
                    return StringValue.makeStringValue(resultBase64);
                }
            };
        }
    }
    

    and, following this Stack Overflow entry, a second class MyTransformerFactory that registers the function:

    package ExtensionsPackage;
    
    import net.sf.saxon.Configuration;
    import net.sf.saxon.TransformerFactoryImpl;
    import net.sf.saxon.lib.ExtensionFunctionDefinition;
    
    public class MyTransformerFactory extends TransformerFactoryImpl {
        public MyTransformerFactory() {
            super();
            ExtensionFunctionDefinition imageToBase64Function = new ImageToBase64();
            this.getProcessor().registerExtensionFunction(imageToBase64Function);
        }
    
        public MyTransformerFactory(Configuration config) {
            super(config);
            ExtensionFunctionDefinition imageToBase64Function = new ImageToBase64();
            this.getProcessor().registerExtensionFunction(imageToBase64Function);
        }
    }
    

    Now build a jar file and put it into the lib folder of Apache FOP.

    Then add the line set CUSTOMOPTS=-Djavax.xml.transform.TransformerFactory=ExtensionsPackage.MyTransformerFactory to fop.bat and add %CUSTOMOPTS% to the java command in the :runFop section.

    Add the namespace to your stylesheet:

    <xsl:stylesheet version="1.0" 
        xmlns:ext="http://example.com/saxon-extension">
    

    and use it like:

    <xsl:value-of select="ext:imageToBase64('my/file/path')"/>
    

    If fop.bat is now run from the console, xsl:value-of returns 12345.
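The placeholder "12345" still has to be replaced by the real file content. A minimal sketch of that step, using only the Java 8 standard library, could look like this. The class FileBase64, its method names and the choice of base directory are assumptions made for illustration, not part of Saxon or Apache FOP.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper that could back the body of call(). */
final class FileBase64 {
    private FileBase64() {}

    /** Encodes raw bytes as a single-line base64 string. */
    static String encode(byte[] content) {
        return Base64.getEncoder().encodeToString(content);
    }

    /**
     * Resolves an href such as files\Logo.png against baseDir and
     * returns the file content as base64.
     */
    static String encodeFile(Path baseDir, String href) {
        // The IF file uses Windows separators; '/' is accepted on every platform.
        Path file = baseDir.resolve(href.replace('\\', '/')).normalize();
        try {
            return encode(Files.readAllBytes(file));
        } catch (IOException e) {
            throw new UncheckedIOException("Cannot read " + file, e);
        }
    }
}
```

Inside call(), the stub line could then become String resultBase64 = FileBase64.encodeFile(java.nio.file.Paths.get("."), filePath); which assumes the hrefs are relative to the working directory. In the stylesheet, the result could be used as <img src="data:image/png;base64,{ext:imageToBase64(@xlink:href)}"/>, where image/png is hard-coded only for the sake of the example.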