
Parsing multiple XMLs in ZipInputStream


I have an export feature that packs two XML files into a ZIP file for download. In my unit test I want to assert that the export is correct, so I read the ZIP file back and try to parse the XML. I am using a ZipInputStream, and the following approach works:

InputStream result = testee.openStream(); // the ZIP and its XML entries are created here
try (var zis = new ZipInputStream(result)) {
    var firstEntry = zis.getNextEntry();
    assertNotNull(firstEntry);
    var nodesOne = getNodesByName(zis, "nodeOne");
    assertEquals(EXPECTED_NODE_ONE, nodesOne.getLength());
    
    var secondEntry = zis.getNextEntry();
    assertNotNull(secondEntry);
    var nodesTwo = getNodesByName(zis, "nodeTwo");
    assertEquals(EXPECTED_NODE_TWO, nodesTwo.getLength());
}

NodeList getNodesByName(ZipInputStream zis, String nodeName) throws Exception {
    var xmlString = IOUtils.toString(zis);
    var is = new InputSource(new StringReader(xmlString));
    var doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(is);
    return doc.getElementsByTagName(nodeName);
}

But a more elegant way would be to replace the first two lines of getNodesByName with this:
var is = new InputSource(new InputStreamReader(zis));

This works for the first entry, but the second entry throws an exception. Apparently the InputStreamReader either closes the ZipInputStream or reads it in full instead of stopping at the end of the entry. Is there a way to use streams so that only the current entry is read?


Solution

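  • I'm guessing that DocumentBuilder.parse(...) closes the stream it reads from, which closes the ZipInputStream along with it. (IOUtils.toString(zis) leaves the stream open, which would explain why your first version works.)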

    You could try wrapping the stream so that close() does nothing:

    NodeList getNodesByName(ZipInputStream zis, String nodeName) throws Exception {
        var streamWrapper = new java.io.FilterInputStream(zis) {
            @Override
            public void close() {
                // don't close the underlying stream
            }
        };
        var inputSource = new InputSource(streamWrapper);
        var doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(inputSource);
        return doc.getElementsByTagName(nodeName);
    }
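
To check the idea without the real export, here is a minimal self-contained sketch. It builds a ZIP with two XML entries in memory and parses both through the same non-closing wrapper. The class name ZipXmlDemo, the helpers buildZip, countNodes and countBoth, and the sample XML are all made up for illustration; only the FilterInputStream trick comes from the answer above.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FilterInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

public class ZipXmlDemo {

    // Builds an in-memory ZIP with two small XML entries (hypothetical test data).
    static byte[] buildZip() throws Exception {
        var bytes = new ByteArrayOutputStream();
        try (var zos = new ZipOutputStream(bytes)) {
            zos.putNextEntry(new ZipEntry("one.xml"));
            zos.write("<root><nodeOne/><nodeOne/></root>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("two.xml"));
            zos.write("<root><nodeTwo/><nodeTwo/><nodeTwo/></root>".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return bytes.toByteArray();
    }

    // Parses whatever the stream yields until end of the current entry and
    // counts the elements named nodeName. The wrapper ignores close() calls.
    static int countNodes(InputStream entryStream, String nodeName) throws Exception {
        var nonClosing = new FilterInputStream(entryStream) {
            @Override
            public void close() {
                // intentionally empty: the caller owns the ZipInputStream
            }
        };
        var doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(nonClosing));
        return doc.getElementsByTagName(nodeName).getLength();
    }

    // Reads both entries from one ZipInputStream, one parse per entry.
    static int[] countBoth() {
        try (var zis = new ZipInputStream(new ByteArrayInputStream(buildZip()))) {
            zis.getNextEntry();
            int first = countNodes(zis, "nodeOne");
            zis.getNextEntry();
            int second = countNodes(zis, "nodeTwo");
            return new int[] {first, second};
        } catch (Exception e) {
            throw new IllegalStateException("could not read the ZIP", e);
        }
    }

    public static void main(String[] args) {
        int[] counts = countBoth();
        System.out.println(counts[0] + " " + counts[1]);
    }
}
```

Passing the wrapped stream as a byte stream (rather than through an InputStreamReader) also lets the parser pick up the encoding from the XML declaration. ZipInputStream.read reports end-of-stream at the end of the current entry, so the parser never reads into the next one.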