Tags: spring, spring-boot, jaxb, spring-batch, stax

Processing non-root XML elements with Spring Batch StaxEventItemReader


I am trying to read non-root elements from an XML file using Spring Batch.

The batch configuration I am using contains:

  • a StaxEventItemReader configured to read <dependency> elements
  • a Jaxb2Marshaller bound to JAXB-generated classes

How do I configure either StAX or JAXB to parse non-root elements as single Spring Batch items?

For example, let's say I need to process <dependency> elements from a Maven POM:

<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 https://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>

  <groupId>...</groupId>
  <artifactId>...</artifactId>
  <version>...</version>
  <packaging>...</packaging>

  <dependencies>
    <dependency>...</dependency>
    <dependency>...</dependency>
    <dependency>...</dependency>
    ...
  </dependencies>
</project>

I am using the following configuration (only the relevant parts are shown):

@Configuration
@EnableBatchProcessing
public class BatchConfiguration {
    @Bean
    public ItemReader<Dependency> reader(Jaxb2Marshaller marshaller) {
        return new StaxEventItemReaderBuilder<Dependency>().name("itemReader")
                .resource(inputFile)
                .addFragmentRootElements("dependency")
                .unmarshaller(marshaller)
                .build();
    }

    @Bean
    public Jaxb2Marshaller marshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setPackagesToScan("org.apache.maven.pom._4_0");
        return marshaller;
    }
}

But I am getting the following error:

javax.xml.bind.UnmarshalException: unexpected element (uri:"http://maven.apache.org/POM/4.0.0", local:"dependency"). Expected elements are <{http://maven.apache.org/POM/4.0.0}project>

What am I missing?


Solution

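I found a solution. The error message shows that JAXB only recognizes <project> as a root element, so it rejects each <dependency> fragment that the reader hands over. Calling Jaxb2Marshaller.setMappedClass enables partial unmarshalling, which binds each fragment to the given class instead of looking up its element name among the known root elements: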

    @Bean
    public Jaxb2Marshaller marshaller() {
        Jaxb2Marshaller marshaller = new Jaxb2Marshaller();
        marshaller.setPackagesToScan("org.apache.maven.pom._4_0");
        marshaller.setMappedClass(Dependency.class); // added: unmarshal each fragment as a Dependency (partial unmarshalling)
        return marshaller;
    }
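
To make the fragment step easier to picture, here is a minimal sketch of my own that uses only the JDK's StAX API. It is not how StaxEventItemReader is implemented. It walks a POM-like document, matches <dependency> elements by local name (as the "dependency" fragment root name in the configuration suggests), and collects the text of each nested <artifactId>. The class name FragmentSketch, the method name, and the sample values are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical illustration class, not part of Spring Batch or the original post.
class FragmentSketch {

    // Collects the <artifactId> text found inside every <dependency> element.
    // Elements are matched on local name only, so the POM namespace is ignored.
    // Simplification: an <artifactId> nested deeper (e.g. under <exclusions>)
    // would be collected as well.
    static List<String> dependencyArtifactIds(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> ids = new ArrayList<>();
        boolean inDependency = false;
        try {
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if ("dependency".equals(name)) {
                        inDependency = true; // start of one fragment
                    } else if (inDependency && "artifactId".equals(name)) {
                        // getElementText() consumes the text and the end tag.
                        ids.add(reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && "dependency".equals(reader.getLocalName())) {
                    inDependency = false; // end of the fragment
                }
            }
        } finally {
            reader.close();
        }
        return ids;
    }

    public static void main(String[] args) throws XMLStreamException {
        // Made-up sample document; the values are placeholders.
        String xml = "<project xmlns=\"http://maven.apache.org/POM/4.0.0\">"
                + "<artifactId>app</artifactId>"
                + "<dependencies>"
                + "<dependency><groupId>g</groupId><artifactId>a1</artifactId></dependency>"
                + "<dependency><groupId>g</groupId><artifactId>a2</artifactId></dependency>"
                + "</dependencies>"
                + "</project>";
        System.out.println(dependencyArtifactIds(xml));
    }
}
```

The project's own <artifactId> is skipped because it sits outside any <dependency>. In the real job the reader passes each such fragment to the unmarshaller, which is where setMappedClass comes into play.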