'Prefix must resolve to a namespace: p' when using com.jcabi.jcabi-xml


I am using Jython, Apache POI and jcabi-xml to parse a pptx document.

The goal is to find any animation definitions inside the document.

Here is the snippet.

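import os
import sys
from java.io import FileInputStream
from org.apache.poi.xslf.usermodel import XMLSlideShow
from com.jcabi.xml import XMLDocument

# assumed: the path to the .pptx file is the script argument (see pom.xml below)
fn = sys.argv[1]
ffn = os.path.abspath(fn)
fis = FileInputStream(ffn)
ppt = XMLSlideShow(fis)

p = ppt.getSlides()[0]  # first slide

xml = p.getXmlObject()
doc = XMLDocument(str(xml))

print doc.xpath("/*/p:anim")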

The last line (the print statement) throws an exception:

java.lang.IllegalArgumentException: java.lang.IllegalArgumentException: Invalid XPath query '/*/p:anim' at com.sun.org.apache.xpath.internal.jaxp.XPathFactoryImpl: javax.xml.transform.TransformerException: Prefix must resolve to a namespace: p

I added the following line before the print statement, but the error is still there:

doc.registerNs('p', 'http://schemas.openxmlformats.org/presentationml/2006/main')

My question:

How can I get rid of the 'Prefix must resolve to a namespace' error?

More information:

Here is a sample fragment of the XML document:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<xml-fragment xmlns:a="http://schemas.openxmlformats.org/drawingml/2006/main"
              xmlns:p="http://schemas.openxmlformats.org/presentationml/2006/main"
              xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
...
<p:set>
  <p:cBhvr>
    <p:cTn dur="1" fill="hold" id="6">
      <p:stCondLst>
        <p:cond delay="0"/>
      </p:stCondLst>
    </p:cTn>
    <p:tgtEl>
      <p:spTgt spid="2"/>
    </p:tgtEl>
    <p:attrNameLst>
      <p:attrName>style.visibility</p:attrName>
    </p:attrNameLst>
  </p:cBhvr>
  <p:to>
    <p:strVal val="visible"/>
  </p:to>
</p:set>
<p:anim calcmode="lin" valueType="num">
  <p:cBhvr additive="base">
    <p:cTn dur="500" fill="hold" id="7"/>
    <p:tgtEl>
      <p:spTgt spid="2"/>
    </p:tgtEl>
    <p:attrNameLst>
      <p:attrName>ppt_x</p:attrName>
    </p:attrNameLst>
  </p:cBhvr>
  <p:tavLst>
    <p:tav tm="0">
      <p:val>
        <p:strVal val="#ppt_x"/>
      </p:val>
    </p:tav>
    <p:tav tm="100000">
      <p:val>
        <p:strVal val="#ppt_x"/>
      </p:val>
    </p:tav>
  </p:tavLst>
</p:anim>

Here is the pom.xml I use to manage the dependencies with Maven:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.mycompany.utils</groupId>
  <artifactId>ppt</artifactId>
  <packaging>jar</packaging>
  <version>1.0-SNAPSHOT</version>
  <name>ppt-viewer</name>
  <url>http://maven.apache.org</url>
  <properties>
    <target.file></target.file>
  </properties>
  <build>
    <plugins>
      <plugin>
        <groupId>org.codehaus.mojo</groupId>
        <artifactId>exec-maven-plugin</artifactId>
        <version>1.6.0</version>
        <executions>
          <execution><goals><goal>exec</goal></goals></execution>
        </executions>
        <configuration>
          <executable>jython</executable>
          <arguments>
            <argument>-J-cp</argument>
            <classpath/>
            <argument>use_poi.jy</argument>
            <argument>${target.file}</argument>
          </arguments>
        </configuration>
      </plugin>
    </plugins>
  </build>
  <dependencies>
    <dependency>
      <groupId>com.jcabi</groupId>
      <artifactId>jcabi-xml</artifactId>
      <version>0.21.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-scratchpad</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml-schemas</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-ooxml</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi-excelant</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>poi</artifactId>
      <version>4.0.0</version>
    </dependency>
    <dependency>
      <groupId>org.apache.xmlbeans</groupId>
      <artifactId>xmlbeans</artifactId>
      <version>3.0.1</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-compress</artifactId>
      <version>1.18</version>
    </dependency>
    <dependency>
      <groupId>org.apache.poi</groupId>
      <artifactId>ooxml-schemas</artifactId>
      <version>1.4</version>
    </dependency>
    <dependency>
      <groupId>org.apache.commons</groupId>
      <artifactId>commons-collections4</artifactId>
      <version>4.2</version>
    </dependency>
  </dependencies>
</project>

Solution

  • You need to change

    doc.registerNs('p', 'http://schemas.openxmlformats.org/presentationml/2006/main')

    to

    xml = doc.registerNs('p', 'http://schemas.openxmlformats.org/presentationml/2006/main')

    (in Java: XML xml = doc.registerNs(...);) and run your XPath queries on the returned xml object, which has the namespace registered. XMLDocument is immutable, so calling registerNs() on its own does nothing to the doc object; it returns a new XML object with the registered namespace.

    Cheers!
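
The pattern behind this answer (a call that returns a new object instead of changing the one it is called on) can be mimicked in plain Python with the standard library. This is only a rough sketch and not jcabi-xml code: FrozenDoc, register_ns and find are hypothetical names, the sample XML is a made-up, cut-down stand-in for the slide, and xml.etree.ElementTree replaces the Java XPath engine. The query is written as .//p:anim because ElementTree does not accept absolute paths on an element.

```python
import xml.etree.ElementTree as ET

P_NS = "http://schemas.openxmlformats.org/presentationml/2006/main"


class FrozenDoc(object):
    """Hypothetical immutable wrapper; it imitates the pattern, not jcabi-xml."""

    def __init__(self, text, namespaces=None):
        self._text = text
        self._root = ET.fromstring(text)
        self._namespaces = dict(namespaces or {})

    def register_ns(self, prefix, uri):
        # Build a NEW object carrying the extra binding; self is left untouched.
        merged = dict(self._namespaces)
        merged[prefix] = uri
        return FrozenDoc(self._text, merged)

    def find(self, query):
        # ElementTree raises SyntaxError when a prefix in the query has no binding.
        return self._root.findall(query, self._namespaces)


# Made-up, cut-down stand-in for the slide XML in the question.
SAMPLE = (
    '<xml-fragment xmlns:p="%s">'
    '<p:par><p:anim calcmode="lin" valueType="num"/></p:par>'
    '</xml-fragment>' % P_NS
)

doc = FrozenDoc(SAMPLE)

# Result discarded: doc itself still has no binding for "p".
doc.register_ns("p", P_NS)
try:
    doc.find(".//p:anim")
    error_on_doc = None
except SyntaxError as exc:
    error_on_doc = str(exc)

# Keep the returned object and query that one instead.
bound = doc.register_ns("p", P_NS)
anims = bound.find(".//p:anim")
```

Here error_on_doc ends up holding the "prefix not found" message, while anims contains the single p:anim element. Note that the xmlns:p declaration inside the document does not help in either case: the prefix used in the query has to be bound on the object that runs the query.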