With the help of an XSL script I extract the URL of a file from XML. The URL ends with: api/v1/objects/uuid/b79de4e5-8d1f-4840-b85f-e052db92a52f/file/id/1001974122/file_version/name/small/disposition/inline
When I enter this URL in a web browser, it is transformed into a URL that ends with a file extension: eas/partitions-inline/48/1001/1001974000/1001974122/9a4191c7ce7414650d36ac9bc1c2b012261013ad/image/png/8223@33a8cae1-a9fa-4655-8c3d-b71241bbc99b_1001974122_small.png
Is there a way to do this transformation with XSL, without a browser?
I need the URL with the file extension in my output XML in order to run the harvester over it.
The question is vague about how the URL gets transformed (and about the XML tooling used), but let's assume that the original URL answers with a 3xx redirect and that the intent is to output the redirect target. For instance:
$ curl --silent --head http://stackoverflow.com | grep Location
Location: https://stackoverflow.com/
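One caveat with the grep above: header names are case-insensitive, and servers speaking HTTP/2 send them in lowercase, so a case-sensitive match on Location can come back empty. Below is a minimal sketch of a more tolerant variant. The function names extract_location and final_url are made up for this illustration; the second one relies on curl's --location option and its %{url_effective} write-out variable to follow the whole redirect chain rather than a single hop.

```shell
#!/bin/sh
# Hypothetical helper: print the value of the first Location header on stdin.
# The header name is matched case-insensitively; carriage returns are dropped.
extract_location() {
    tr -d '\r' | awk '
        tolower($0) ~ /^location:/ {
            sub(/^[^:]*:[ \t]*/, "")   # strip the header name and leading blanks
            print
            exit
        }'
}

# Hypothetical helper: print the URL curl ends up at after following
# every redirect (HEAD requests only, response headers discarded).
final_url() {
    curl --silent --head --location --output /dev/null \
         --write-out '%{url_effective}\n' "$1"
}
```

Usage would look like curl --silent --head http://stackoverflow.com | extract_location for a single hop, or final_url http://stackoverflow.com for the end of the chain.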
To do the same thing while transforming XML, the XSLT processor needs an HTTP client. EXPath, a collection of XPath extension specifications with implementations, provides an HTTP Client module.
To install EXPath quickly, use the installer available on its download page. It comes with the Saxon XSLT processor. At the time of writing, the installer is expath-repo-installer-0.13.1.jar. Run it like this:
java -jar expath-repo-installer-0.13.1.jar
Once it is installed, download the HTTP client module for Saxon, expath-http-client-saxon-0.12.0.zip, and extract expath-http-client-saxon-0.12.0.xar from it. Then install it into an EXPath repository:
mkdir repo
bin/xrepo --repo repo install /path/to/expath-http-client-saxon-0.12.0.xar
Then you can use bin/saxon.
data.xml
<?xml version="1.0" encoding="utf-8"?>
<data>
  <datum><url>http://python.org</url></datum>
  <datum><url>http://stackoverflow.com</url></datum>
</data>
test.xslt
<?xml version="1.0" encoding="utf-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:http="http://expath.org/ns/http-client"
    exclude-result-prefixes="#all"
    version="2.0">
  <!-- makes http:send-request() available (provided by the installed module) -->
  <xsl:import href="http://expath.org/ns/http-client.xsl"/>
  <xsl:output method="xml" encoding="utf-8" indent="yes"/>
  <xsl:template match="/">
    <result>
      <xsl:for-each select="data/datum">
        <!-- the request element: HEAD only, and do not follow the redirect -->
        <xsl:variable name="request" as="element(http:request)">
          <http:request method="head" follow-redirect="false">
            <xsl:attribute name="href">
              <xsl:value-of select="url"/>
            </xsl:attribute>
          </http:request>
        </xsl:variable>
        <!-- sending the request -->
        <xsl:variable name="response" select="http:send-request($request)"/>
        <!-- output; the response elements are in the http: namespace -->
        <url>
          <orig><xsl:value-of select="url"/></orig>
          <location>
            <xsl:value-of
                select="$response[1]/http:header[@name='location']/@value"/>
          </location>
        </url>
      </xsl:for-each>
    </result>
  </xsl:template>
</xsl:stylesheet>
See the module's spec for more details about how to control the HTTP client.
Then bin/saxon --repo repo data.xml test.xslt produces:
<?xml version="1.0" encoding="utf-8"?>
<result>
  <url>
    <orig>http://python.org</orig>
    <location>https://python.org/</location>
  </url>
  <url>
    <orig>http://stackoverflow.com</orig>
    <location>https://stackoverflow.com/</location>
  </url>
</result>
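To eyeball the resolved URLs before handing the XML to a harvester, something like the following could be used. It is only a rough sketch: list_locations is a name invented here, and the sed expression assumes each location element sits on a single line, as in the indented output above. It is not a general XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the text of every <location> element on stdin.
# Assumes one <location>...</location> pair per line (true for the
# indented output shown above); this is not a real XML parser.
list_locations() {
    sed -n 's:.*<location>\(.*\)</location>.*:\1:p'
}
```

For example, assuming the transformation writes its result to standard output: bin/saxon --repo repo data.xml test.xslt | list_locations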