java xml nio stax

StAX parsing from Java NIO channel


I am attempting to receive a stream of XML events over a Java NIO channel. I am new to both NIO and StAX parsing, so I could very easily be overlooking something :)

My search has led me to several SAX and StAX implementations, but they all seem to operate on InputStreams and InputSources, not NIO channels. My closest attempts have been to create a PipedInputStream (method 1) or to get an InputStream from the channel (methods 2 and 3):

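// method 1: pipe; bytes read from the channel are written to 'writer'
PipedOutputStream out = new PipedOutputStream();
InputStream in = new PipedInputStream(out);
PrintWriter writer = new PrintWriter(out);

// method 2: stream view of the channel's socket
InputStream in = channel.socket().getInputStream();

// method 3: stream adapter around the channel
InputStream in = Channels.newInputStream(channel);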

followed by:

XMLStreamReader xmlStreamReader = XMLInputFactory.newInstance()
        .createXMLStreamReader(in);
//...

When the above code is used with method 1, it blocks on the createXMLStreamReader line. When methods 2/3 are used, they immediately throw IllegalBlockingModeException (I do understand why). Maybe a new approach is needed?
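
The exception from methods 2/3 can be reproduced without a socket. This is a minimal sketch, assuming a java.nio.channels.Pipe stands in for the socket channel; the class and method names are made up for illustration. The stream returned by Channels.newInputStream refuses to read while the underlying selectable channel is in non-blocking mode:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.IllegalBlockingModeException;
import java.nio.channels.Pipe;

// Hypothetical demo class, not part of the original code.
final class BlockingModeDemo {

    /** Returns true if reading through the stream adapter throws IllegalBlockingModeException. */
    static boolean readFailsWhenNonBlocking() {
        try {
            Pipe pipe = Pipe.open();
            try {
                // Same state as a channel registered with a Selector.
                pipe.source().configureBlocking(false);
                InputStream in = Channels.newInputStream(pipe.source());
                in.read(); // the adapter checks the blocking mode before reading
                return false;
            } catch (IllegalBlockingModeException e) {
                return true;
            } finally {
                pipe.source().close();
                pipe.sink().close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("IllegalBlockingModeException thrown: " + readFailsWhenNonBlocking());
    }
}
```

Any parser that pulls from such a stream hits the same check on its first read, which would explain why the failure appears as soon as the reader is created.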

My goal is to have a non-blocking server select => accept character data from a client => parse it to XML events using a specific encoding => forward that event object to another thread for processing => and return to the selecting.

So am I overlooking something, or is there a better approach that can be used? If so what?

Thanks!


Solution

  • Are you sure you need to use NIO? It may not offer the benefits you expect:

    Paul Tyma: Kill the myth please. NIO is not faster than IO

    Paul Tyma: Writing Java Multithreaded Servers - whats old is new

    A stack trace showing where inside createXMLStreamReader() it blocks would help, but it is probably behaving as designed. If the parser was written against InputStreams, which always either (1) deliver the requested data, (2) signal end of stream, or (3) block, then it cannot simply return after reading an arbitrary amount of incomplete input. Supporting that needs a more complicated, stateful design and would mean deep reworking of the parser.
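
If plain blocking I/O is acceptable, the pipeline from the question can be sketched with one worker thread per connection. This is only an illustration under assumptions: BlockingStaxWorker, pump, demo and the END marker are hypothetical names, events are reduced to element names, and a ByteArrayInputStream stands in for the stream a blocking Socket would provide.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical sketch: one blocking StAX reader per connection.
final class BlockingStaxWorker {

    // Marker placed on the queue when the stream ends (assumption for this sketch).
    static final String END = "<<end-of-stream>>";

    /** Parses 'in' with the given encoding and forwards element names to 'events'. */
    static void pump(InputStream in, String encoding, BlockingQueue<String> events) {
        try {
            // May block until the first bytes arrive; that is fine on a dedicated thread.
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(in, encoding);
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT) {
                        events.add(reader.getLocalName());
                    }
                }
            } finally {
                reader.close();
            }
        } catch (XMLStreamException e) {
            events.add("error: " + e.getMessage());
        } finally {
            events.add(END);
        }
    }

    /** Runs the worker on its own thread and collects what the consumer side sees. */
    static List<String> demo() {
        BlockingQueue<String> events = new LinkedBlockingQueue<>();
        byte[] xml = "<stream><msg/><msg/></stream>".getBytes(StandardCharsets.UTF_8);
        Thread worker = new Thread(() -> pump(new ByteArrayInputStream(xml), "UTF-8", events));
        worker.start();

        List<String> seen = new ArrayList<>();
        try {
            for (String e = events.take(); !e.equals(END); e = events.take()) {
                seen.add(e);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a server, pump would be handed socket.getInputStream() from a Socket returned by ServerSocket.accept(), and the processing thread would take events from the queue. The parser is then free to block mid-document without stalling other connections.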