I've been having a terrible time finding any examples of XSLT processing with the Python libxml2 library. I have a set of legacy documents with a default namespace, and I've been trying to convert them into something I can import into a TinkerPop-compliant database. I can't figure out how to convince libxslt to find anything in the data.
As you can see from my examples, I can't seem to get anything from an inner template to render at all. It does seem to find the topmost (cmap) template, as it spits out the <graphml> boilerplate. I am fairly new to XSLT, so this may just be a shortcoming on my part, but nobody on SO or Google seems to have any examples of this working.
I've thought about just ripping the offending default namespace out with a regexp, but parsing XML with a regexp is usually a bad plan, and it just seems like the wrong idea.
I have the following XML:
<?xml version="1.0" encoding="UTF-8"?>
<cmap xmlns="http://cmap.ihmc.us/xml/cmap/">
<map width="1940" height="3701">
<concept-list>
<concept id="1JNW5YSZP-14KK308-5VS2" label="Solving Linear
Systems by
Elimination
[MAT.ALG.510]"/>
<concept id="1JNW55K3S-27XNMQ0-5T80" label="Using
Inequalities
[MAT.ALG.423]"/>
</concept-list>
</map>
</cmap>
There's much more, but this is a sample of it. Using the xpathRegisterNS() command, I was able to register the default namespace and find my map, concept-map, etc. with it. I have not had the same luck when trying to process this with libxslt.
<?xml version="1.0" encoding="UTF-8" ?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:c="http://cmap.ihmc.us/xml/cmap/">
<xsl:output method="xml" indent="yes"/>
<xsl:template match="c:cmap">
<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
<xsl:apply-templates select="c:concept"/>
</graphml>
</xsl:template>
<xsl:template match="c:concept">
<node> Found a node </node>
</xsl:template>
</xsl:stylesheet>
And the python experiment is just:
import libxml2
import libxslt
# Parse the stylesheet and compile it
styledoc = libxml2.parseFile("cxltographml.xsl")
style = libxslt.parseStylesheetDoc(styledoc)
# Parse the source document and apply the stylesheet with no parameters
doc = libxml2.parseFile("algebra.cxl")
result = style.applyStylesheet(doc, None)
print(style.saveResultToString(result))
You've got the right technique regarding namespaces in the XSLT: you must map the URI to a prefix, because the default namespace doesn't apply to XPath expressions or template match patterns. The problem is that in your c:cmap template you're doing
<xsl:apply-templates select="c:concept"/>
But the cmap element doesn't have any direct children named concept. Try
<xsl:apply-templates select="c:map/c:concept-list/c:concept"/>
or, more generally (but potentially less efficiently),
<xsl:apply-templates select=".//c:concept"/>
to find all descendant concept elements rather than just immediate children.
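If you want to check the three paths quickly without going through libxslt, here is a minimal sketch using the standard library's xml.etree.ElementTree in its place. That is a swap made purely for illustration: ElementTree supports only a subset of XPath, but these paths fall inside it. The names CXL, NS and count_matches are made up for this example, and the labels in the sample are shortened.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample document from the question (labels shortened).
CXL = """<cmap xmlns="http://cmap.ihmc.us/xml/cmap/">
  <map width="1940" height="3701">
    <concept-list>
      <concept id="1JNW5YSZP-14KK308-5VS2" label="Solving Linear Systems"/>
      <concept id="1JNW55K3S-27XNMQ0-5T80" label="Using Inequalities"/>
    </concept-list>
  </map>
</cmap>
"""

# Prefix-to-URI mapping, the same idea as xmlns:c in the stylesheet.
NS = {"c": "http://cmap.ihmc.us/xml/cmap/"}


def count_matches(context, path):
    """Count the elements that `path` selects relative to `context`."""
    return len(context.findall(path, NS))


root = ET.fromstring(CXL)  # the <cmap> element, i.e. the template's context node

direct = count_matches(root, "c:concept")                       # children of cmap only
full = count_matches(root, "c:map/c:concept-list/c:concept")    # explicit path
descendants = count_matches(root, ".//c:concept")               # any depth
unprefixed = count_matches(root, ".//concept")                  # no prefix, no namespace

print(direct, full, descendants, unprefixed)
```

This prints `0 2 2 0`: the child-only path and the unprefixed path select nothing, while the explicit path and the descendant path both find the two concept elements.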
Also, in the c:concept template you will need to add xmlns="http://graphml.graphdrawing.org/xmlns" to the <node> element, otherwise it will be output in no namespace (with xmlns="").