JAXB XML unmarshal with / in attribute value


I'm having an issue unmarshalling an XML file when an attribute value contains special characters such as "/", as in this example:

<field name = "test" value = "test&/"/>

I'm using the libraries woodstox-core (v5.0.3) and stax2-api (v3.1.4).

The attribute value is defined in the XSD as xs:normalizedString, which I think allows the character "/":

<xs:element name="field" maxOccurs="unbounded">
    <xs:complexType>
        <xs:attribute name="name" type="xs:token" use="required" />
        <xs:attribute name="value" type="xs:normalizedString" use="required" />
    </xs:complexType>
</xs:element>

But the unmarshal call throws an exception:

XMLStreamReader xsr = null;
try {
    // Create the XML stream reader
    XMLInputFactory xif = XMLInputFactory.newFactory();
    xsr = xif.createXMLStreamReader(inputStream, "UTF-8");

    // Unmarshal the XML with JAXB, with XML schema validation enabled
    JAXBContext jc = JAXBContext.newInstance(Root.class);
    Unmarshaller unmarshaller = jc.createUnmarshaller();
    unmarshaller.setSchema(this.xmlSchema);
    Root rootIndex = (Root) unmarshaller.unmarshal(xsr);
    [...]
}

Here is the exception:

Caused by: com.ctc.wstx.exc.WstxUnexpectedCharException: Unexpected character '/' (code 47) (expected a name start character)
 at [row,col {unknown-source}]: [17,74]
    at com.ctc.wstx.sr.StreamScanner.throwUnexpectedChar(StreamScanner.java:653) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.StreamScanner.parseFullName(StreamScanner.java:1933) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.StreamScanner.parseEntityName(StreamScanner.java:2058) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.StreamScanner.fullyResolveEntity(StreamScanner.java:1525) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.BasicStreamReader.parseAttrValue(BasicStreamReader.java:2017) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.BasicStreamReader.handleNsAttrs(BasicStreamReader.java:3145) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.BasicStreamReader.handleStartElem(BasicStreamReader.java:3043) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.BasicStreamReader.nextFromTree(BasicStreamReader.java:2919) [woodstox-core-5.0.3.jar:5.0.3]
    at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1123) [woodstox-core-5.0.3.jar:5.0.3]
    at com.sun.xml.bind.v2.runtime.unmarshaller.StAXStreamConnector.bridge(StAXStreamConnector.java:197) [jaxb-impl-2.2.3-1.jar:2.2.3]
    at com.sun.xml.bind.v2.runtime.unmarshaller.UnmarshallerImpl.unmarshal0(UnmarshallerImpl.java:366) [jaxb-impl-2.2.3-1.jar:2.2.3]
    ... 16 more

Is there anything else I need to define to accept those characters (apart from UTF-8), or are they simply not allowed?

Many thanks in advance!


Solution

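  • The issue here was not really the / character, but the & before it. / is fine by itself, but & needs to be escaped. The parser treats & as the start of an entity reference and expects a name to follow, which is why the error complains about the / ("expected a name start character"). I was too focused on the / because of the error message.

    Escaping the & like this fixed the issue: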

    <field name = "test" value = "test&amp;/"/>
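
    To check this outside of JAXB, here is a minimal sketch that runs both variants through a plain StAX reader. It assumes whatever StAX implementation XMLInputFactory.newFactory() picks up (the JDK default or Woodstox); the class name AmpersandDemo and the helpers isWellFormed and escapeAttribute are made up for illustration and are not part of any library.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

public class AmpersandDemo {

    /** Hypothetical helper: true if the StAX reader gets through the whole document. */
    public static boolean isWellFormed(String xml) {
        XMLInputFactory xif = XMLInputFactory.newFactory();
        try {
            XMLStreamReader xsr = xif.createXMLStreamReader(new StringReader(xml));
            // Pull every event; a malformed entity reference fails here
            while (xsr.hasNext()) {
                xsr.next();
            }
            xsr.close();
            return true;
        } catch (XMLStreamException e) {
            return false;
        }
    }

    /** Hypothetical helper: escapes the characters that are special in a double-quoted attribute value. */
    public static String escapeAttribute(String raw) {
        // '&' must be replaced first, otherwise the other replacements get escaped twice
        return raw.replace("&", "&amp;")
                  .replace("<", "&lt;")
                  .replace("\"", "&quot;");
    }

    public static void main(String[] args) {
        String raw = "test&/";
        String bad = "<field name=\"test\" value=\"" + raw + "\"/>";
        String good = "<field name=\"test\" value=\"" + escapeAttribute(raw) + "\"/>";

        System.out.println("unescaped well-formed: " + isWellFormed(bad));
        System.out.println("escaped well-formed:   " + isWellFormed(good));
    }
}
```

    The unescaped variant should be rejected and the escaped one accepted. If the XML is generated by code, it is usually safer to let an XML writer (for example XMLStreamWriter or the JAXB Marshaller) do the escaping than to build the string by hand.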