Tags: java, xml, xml-parsing, jackson-databind, jackson-dataformat-xml

JacksonXML - Deserialize XML Comments <!-- -->


The closest existing question I can find has no conclusive answer.

I am trying to parse XML comments and map them to certain fields in Java.

Below is a simple POC I put together in an attempt to get this to work.

My current XML model looks something like this:

<Model>
    <!--
    ** some comment here
    -->
    <Value1>foo</Value1>
    <!--
    ** another comment here
    -->
    <Value2>bar</Value2>
</Model>

I have defined my Java model like so:

@JacksonXmlRootElement(localName = "Model")
@JsonDeserialize(using = CustomDeserializer.class)
final class ModelFile {

    @JacksonXmlProperty(localName = "value1")
    private String value1;

    @JacksonXmlProperty(localName = "value2")
    private String value2;

    private List<String> comments;

    // Getters and Setters
}

And my CustomDeserializer like so (the else-if branch does not work; this is the part I'm trying to get working):

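import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.core.JsonToken;
import com.fasterxml.jackson.databind.DeserializationContext;
import com.fasterxml.jackson.databind.JsonDeserializer;

public class CustomDeserializer extends JsonDeserializer<ModelFile> {

    @Override
    public ModelFile deserialize(JsonParser p, DeserializationContext ctxt) throws IOException {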
        List<String> comments = new ArrayList<>();
        ModelFile sample = new ModelFile();
        while (p.nextToken() != JsonToken.END_OBJECT) {
            if (p.getCurrentToken() == JsonToken.FIELD_NAME) {
                String fieldName = p.getCurrentName();
                p.nextToken();
                if (fieldName.equals("value1")) {
                    sample.setValue1(p.getText());
                } else if (fieldName.equals("value2")){
                    sample.setValue2(p.getText());
                }
            } else if (p.getText().equals("<!--")) {
                p.nextToken();
                comments.add(p.getText());
                p.nextToken();
            }
        }
        sample.setComments(comments);
        return sample;
    }
}

And finally a sample test to check its functionality:

@Test
public void testDeserializeWithCommentsAsList() throws Exception {
    String xml =
            "<model>\n" +
            "  <!--\n" +
            "  this is a comment \n" +
            "  -->\n" +
            "  <value1>foo</value1>\n" +
            "  <!--\n" +
            "  Another comment \n" +
            "  -->\n" +
            "  <value2>bar</value2>\n" +
            "</model>";
    XmlMapper xmlMapper = new XmlMapper();

    // ModelFile is annotated with @JsonDeserialize, so CustomDeserializer is used here
    ModelFile modelFile = xmlMapper.readValue(xml, ModelFile.class);
    List<String> comments = modelFile.getComments();
    assertNotNull(comments);
    assertEquals(2, comments.size());
    assertEquals("this is a comment", comments.get(0));
    assertEquals("Another comment", comments.get(1));
}

The problem, as I understand it, is that the CustomDeserializer recognizes both value1 and value2 as valid fields, which lets me map them directly. XML comments, however, are not recognized at all.

I say XML Comments specifically because if I change this to a more standard comment type such as this:

    "<model>\n" +
    "  // this is a comment \n" +
    "  <value1>foo</value1>\n" +
    "  // another comment \n" +
    "  <value2>bar</value2>\n" +
    "</model>";

Then the JsonParser in the CustomDeserializer will pick up the comments. Looking further into JsonParser, I see features that can be enabled, namely ALLOW_COMMENTS, which recognizes typical Java/C++ comments, and ALLOW_YAML_COMMENTS, which looks specifically for # type comments.

https://fasterxml.github.io/jackson-core/javadoc/2.8/com/fasterxml/jackson/core/JsonParser.Feature.html#ALLOW_COMMENTS

My guess is that, since JacksonXML is built on top of Jackson's JSON support and comments are typically not supported in JSON, this is not really viable? Considering that it could technically work for // type comments, I feel like I'm really close to getting this to work.
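One way to see why the two styles behave differently, independent of Jackson, is to look at the raw StAX events for such a document. The sketch below uses only the JDK; the class name XmlEventKinds and the describe method are made up for illustration. It shows that // text inside an element is ordinary character data, whereas <!-- --> produces a separate COMMENT event. It does not show what Jackson does with those events internally.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper for illustration only; not part of Jackson.
final class XmlEventKinds {

    // Lists non-blank text and comment events in document order.
    static List<String> describe(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so each text run is reported once.
        factory.setProperty(XMLInputFactory.IS_COALESCING, true);
        List<String> events = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    int event = reader.next();
                    if (event == XMLStreamConstants.COMMENT) {
                        events.add("COMMENT: " + reader.getText().trim());
                    } else if (event == XMLStreamConstants.CHARACTERS && !reader.isWhiteSpace()) {
                        events.add("CHARACTERS: " + reader.getText().trim());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml =
                "<model>\n" +
                "  // this is a comment \n" +
                "  <value1>foo</value1>\n" +
                "  <!-- another comment -->\n" +
                "  <value2>bar</value2>\n" +
                "</model>";
        describe(xml).forEach(System.out::println);
    }
}
```

For the sample in main, the // line is reported as CHARACTERS and only the <!-- --> block is reported as COMMENT, so a consumer that looks only at elements and text never sees the latter.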

So the question is: how do I handle <!-- --> type comments in XML while using jackson-dataformat-xml to deserialize it into a POJO?

Any ideas or suggestions are welcome.


Solution

  • Ultimately, it seems this cannot be done. Adding @JsonDeserialize to the root object and inspecting the JsonParser it receives showed that by the time JacksonXML processes the XML, all traces of the comments have already been removed (specifically, again, <!-- --> type comments).

    The alternative I went with, suggested by user LMC, was to read the XML as a stream and look for comment events. Each comment is pushed into a List, which is eventually returned.

    This is not ideal, since the XML is read twice (once for the Jackson mapping and once more to collect the comments). For my scenario we decided to go this way anyway: the performance difference between reading it twice and reading it once is negligible, and reading the XML and populating my business object manually without Jackson would take a lot of time to implement.
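    A minimal sketch of that second pass, assuming the JDK's built-in StAX API (javax.xml.stream) is used for it. The class name XmlCommentCollector and the method collectComments are hypothetical names chosen for illustration, not the code actually used here. Trimming each comment is also an assumption, made so the values match the expectations in the test from the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical helper: collects <!-- --> comments in document order.
final class XmlCommentCollector {

    static List<String> collectComments(String xml) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Defensive settings: DTDs and external entities are not needed to read comments.
        factory.setProperty(XMLInputFactory.SUPPORT_DTD, false);
        factory.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);

        List<String> comments = new ArrayList<>();
        try {
            XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    // Only COMMENT events matter; elements and text are skipped.
                    if (reader.next() == XMLStreamConstants.COMMENT) {
                        comments.add(reader.getText().trim());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            throw new IllegalArgumentException("Could not read XML", e);
        }
        return comments;
    }

    public static void main(String[] args) {
        String xml =
                "<model>\n" +
                "  <!--\n" +
                "  this is a comment \n" +
                "  -->\n" +
                "  <value1>foo</value1>\n" +
                "  <!--\n" +
                "  Another comment \n" +
                "  -->\n" +
                "  <value2>bar</value2>\n" +
                "</model>";
        // prints [this is a comment, Another comment]
        System.out.println(collectComments(xml));
    }
}
```

    The returned list could then be set on the object produced by the regular Jackson mapping (for example via setComments), which keeps the two passes independent at the cost of parsing the XML twice.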