I'm at the proof-of-concept phase of building some DocBook → PDF transformation into a web application. The basic requirements are:
The TLDR is: How do I package the DocBook XSLT stylesheets in a JAR and use them from there, without exploding the JAR into files on the filesystem?
As recently discussed on the docbook-apps mailing list, I can get quite a bit of the way by starting with the stylesheets in src/main/resources/xsl (with some customisations at that level, and then the DocBook stylesheets in src/main/resources/xsl/docbook-xsl-1.79.2), a catalog that starts like this:
<?xml version="1.0" encoding="utf-8"?>
<catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
<uri name="file:/xsl/juno-driver.xsl"
uri="classpath:/xsl/juno-driver.xsl" />
<uri name="file:/xsl/header-footer.xsl"
uri="classpath:/xsl/header-footer.xsl" />
<uri name="file:/xsl/table.xsl"
uri="classpath:/xsl/table.xsl" />
<uri name="file:/xsl/titlepage.xsl"
uri="classpath:/xsl/titlepage.xsl" />
<uri name="file:/xsl/docbook-xsl-1.79.2/fo/docbook.xsl"
uri="classpath:/xsl/docbook-xsl-1.79.2/fo/docbook.xsl" />
<uri name="file:/xsl/docbook-xsl-1.79.2/VERSION.xsl"
uri="classpath:/xsl/docbook-xsl-1.79.2/VERSION.xsl" />
<uri name="file:/xsl/docbook-xsl-1.79.2/fo/param.xsl"
uri="classpath:/xsl/docbook-xsl-1.79.2/fo/param.xsl" />
(and goes on to map every .xsl, .xml, .ent, and .dtd file to its classpath: URI equivalent), and some code like this:
DOMResult result = new DOMResult();
TransformerFactory factory = TransformerFactory.newInstance();
InputStream is = XmlTest.class.getResourceAsStream("/xsl/juno-driver.xsl");
Source source = new StreamSource(is, "file:/xsl/juno-driver.xsl");
Transformer transformer = factory.newTransformer(source);
transformer.transform(new DOMSource(document), result);
return (Document) result.getNode();
This almost gets us there, but fails:
Error at char 9 in expression in xsl:param/@select on line 18 column 57 of l10n.xsl:
FODC0002 I/O error reported by XML parser processing
file:///xsl/docbook-xsl-1.79.2/common/l10n.xsl. Caused by java.io.FileNotFoundException:
/xsl/docbook-xsl-1.79.2/common/l10n.xsl (No such file or directory)
at parameter local.l10n.xml on line 18 column 57 of l10n.xsl:
invoked by global parameter local.l10n.xml at file:///xsl/docbook-xsl-1.79.2/common/l10n.xsl#18
That line involves a call to document(''):
<xsl:param name="local.l10n.xml" select="document('')"/>
It looks like it's insisting on loading itself from a file, and then (obviously) can't find it at that URI. How do we tell whoever is resolving calls to the document() function to use the classpath?
I have pushed a minimal example of the problem to GitHub: you can clone the repo and run mvn clean test to reproduce.
I'd also settle for advice on any other approach to getting this done that meets the list of constraints at the top of the post!
I think there are multiple ways to do this. One way would be to add support for accessing classpath resources by URL. That way you could point to the stylesheets in your classpath with a URL, without needing a catalog.
You could do it, for example, by registering the class below as a URLStreamHandlerProvider implementation. The implementation is adapted from this answer, but changed to support the optional leading slash in the URL path and to use the cp: scheme name instead of the more conventional classpath:. The reason for using cp: is that Saxon-HE (at least version 12.3) appears to have a workaround specific to classpath: URLs in place, which causes the leading slash of the path to be dropped when it resolves relative classpath: URLs.
In Java 9 and above you can register the provider by putting the fully qualified name of the class in the configuration file META-INF/services/java.net.spi.URLStreamHandlerProvider.
With this in place, you should be able to point to your stylesheets with a URL like cp:/xsl/docbook-xsl-1.79.2/html/docbook.xsl and have it work without a catalog, including relative imports, as long as your XSLT processor uses (or at least falls back to) this method of dereferencing URLs. Based on a quick test, this approach seems to work with at least the Xalan-Java and Saxon-HE XSLT processors. (I think the default XSLT processor included with Java might have some issues when using the docbook-xsl stylesheets.)
package com.stackoverflow.q76848364;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.spi.URLStreamHandlerProvider;
/**
* URL stream handler for "cp:/" URLs for accessing resources in the classpath.
* Supports a leading slash in the path so that the scheme is treated as a
* hierarchical scheme for resolving relative URL references.
*
* <p>
* Register this provider by putting the fully qualified name of this class in
* the configuration file
* META-INF/services/java.net.spi.URLStreamHandlerProvider.
*/
public class ClasspathURLStreamHandlerProvider extends URLStreamHandlerProvider {
private static final String PROTOCOL = "cp";
@Override
public URLStreamHandler createURLStreamHandler(String protocol) {
if (PROTOCOL.equals(protocol)) {
return new URLStreamHandler() {
@Override
protected URLConnection openConnection(URL url) throws IOException {
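String urlPath = url.getPath();
// ClassLoader.getResource() expects a path without a leading slash.
String resourcePath = urlPath.startsWith("/") ? urlPath.substring(1) : urlPath;
URL resource = ClassLoader.getSystemClassLoader().getResource(resourcePath);
if (resource == null) {
// Report a missing resource as an I/O error instead of a NullPointerException.
throw new IOException("Classpath resource not found: " + resourcePath);
}
return resource.openConnection();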
}
};
}
return null;
}
}
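To show how the provider above might be used once it is registered, here is a minimal sketch. The class name CpStylesheetLoader and its two methods are hypothetical helpers, not part of the original example, and the sketch assumes an XSLT processor such as Saxon-HE or Xalan-Java is on the classpath. It first checks that the cp: scheme is known to the JVM, then hands the processor only a system ID so that it dereferences the URL itself.

```java
import java.net.MalformedURLException;
import java.net.URL;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
final class CpStylesheetLoader {

    /** Returns true if the JVM has a stream handler for the given URL scheme. */
    static boolean isSchemeRegistered(String scheme) {
        try {
            new URL(scheme + ":/probe");
            return true;
        } catch (MalformedURLException e) {
            // Thrown when no handler is known for the scheme.
            return false;
        }
    }

    /** Compiles the stylesheet identified by a cp: URL. */
    static Transformer newTransformer(String stylesheetUrl)
            throws TransformerConfigurationException {
        if (!isSchemeRegistered("cp")) {
            throw new IllegalStateException(
                    "No URL stream handler registered for the cp: scheme");
        }
        TransformerFactory factory = TransformerFactory.newInstance();
        // Only a system ID is passed, so the processor opens the URL itself and
        // uses it as the base URI for relative imports and includes.
        return factory.newTransformer(new StreamSource(stylesheetUrl));
    }
}
```

With the driver from the question, a call would presumably look like CpStylesheetLoader.newTransformer("cp:/xsl/juno-driver.xsl"). Failing early when the scheme is not registered tends to give a clearer error than a MalformedURLException surfacing from inside the XSLT processor.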
When working with relative URI references in Java, note that there is a bug in the java.net.URI.resolve() method that affects resolution when the relative URI reference is empty (bug JDK-8218962 in the Java bug database). The docbook-xsl stylesheets rely on this working correctly, so there will be problems with anything that relies on the java.net.URI class for this functionality. Since both Xalan-Java and Saxon-HE seem to work OK, they must be using something else.
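The difference can be observed with a small sketch like the one below. The class EmptyReferenceDemo and its parse-only handler are assumptions made for illustration: the handler only lets cp: URLs be parsed without registering anything and never opens a connection. On a JDK affected by the bug, I would expect the java.net.URI variant to drop the last path segment for an empty reference, while java.net.URL returns the base unchanged, which is what document('') needs.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;

// Hypothetical demo class, for illustration only.
final class EmptyReferenceDemo {

    /** Stand-in handler: allows parsing cp: URLs, but never opens them. */
    private static final URLStreamHandler PARSE_ONLY = new URLStreamHandler() {
        @Override
        protected URLConnection openConnection(URL url) throws IOException {
            throw new IOException("parse-only handler, cannot open " + url);
        }
    };

    /** Resolves a relative reference using java.net.URL. */
    static String resolveWithUrl(String base, String relative) {
        try {
            URL baseUrl = new URL(null, base, PARSE_ONLY);
            // The relative URL inherits the handler from its context URL.
            return new URL(baseUrl, relative).toString();
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException(e);
        }
    }

    /** Resolves a relative reference using java.net.URI. */
    static String resolveWithUri(String base, String relative) {
        return URI.create(base).resolve(relative).toString();
    }

    public static void main(String[] args) {
        String l10n = "cp:/xsl/docbook-xsl-1.79.2/common/l10n.xsl";
        // Empty reference, as used by document('').
        System.out.println("URL: " + resolveWithUrl(l10n, ""));
        System.out.println("URI: " + resolveWithUri(l10n, ""));
        // Ordinary relative reference, as used by xsl:import.
        String fo = "cp:/xsl/docbook-xsl-1.79.2/fo/docbook.xsl";
        System.out.println("URL: " + resolveWithUrl(fo, "../common/l10n.xsl"));
        System.out.println("URI: " + resolveWithUri(fo, "../common/l10n.xsl"));
    }
}
```

For non-empty relative references both classes should agree; only the empty reference separates them.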
I created a pull request demonstrating this solution against the provided minimal example. (The original example was set to target Java 8. Since the method of registering URLStreamHandler implementations is different between Java 8 and Java 9+, I changed the compile target to Java 9 instead to demonstrate the newer approach.)
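If staying on Java 8 is a requirement, one registration mechanism that exists there is URL.setURLStreamHandlerFactory(). The sketch below is an assumption about how that could look, not what the pull request does; the class name CpHandlerFactory is hypothetical. Note that the JVM accepts only one such factory, so this may conflict with an environment (for example a servlet container) that has already installed its own.

```java
import java.io.FileNotFoundException;
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical Java 8 compatible alternative to the service provider.
final class CpHandlerFactory implements URLStreamHandlerFactory {

    private static boolean installed;

    /** Installs the factory once; the JVM rejects a second factory. */
    static synchronized void install() {
        if (!installed) {
            URL.setURLStreamHandlerFactory(new CpHandlerFactory());
            installed = true;
        }
    }

    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
        if (!"cp".equals(protocol)) {
            return null; // let the JVM fall back to its built-in handlers
        }
        return new URLStreamHandler() {
            @Override
            protected URLConnection openConnection(URL url) throws IOException {
                String path = url.getPath();
                // ClassLoader.getResource() expects no leading slash.
                String resourcePath = path.startsWith("/") ? path.substring(1) : path;
                URL resource = ClassLoader.getSystemClassLoader().getResource(resourcePath);
                if (resource == null) {
                    throw new FileNotFoundException("Classpath resource not found: " + resourcePath);
                }
                return resource.openConnection();
            }
        };
    }
}
```

Calling CpHandlerFactory.install() once at application start-up, before any stylesheet is loaded, should make cp: URLs resolvable in the same way as with the provider-based registration.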