How can I improve performance for an MQ based batch application?

I have an application where messages keep coming at a rate of 70K XMLs per hour. We consume these XML messages and store it into an intermediate queue. The intermediate queue is created because we need to meet SLA of consuming all the messages with 24 hours. We are able to consume and load the XMLS into the internal queue within 24 hours. After loading it to internal queue, we process the XMLS (parse, apply very few transformation, perform very few validations) and store the data to a heavily normalized data model. I know that the datamodel can have a huge impact on performance, unfortunately, we have no control over the datamodel. Currently, we take 3.5 minutes to process 2K messages, which is unacceptable. We want to bring it down to 1 minute for 2K messages. Here is what we have done so far:

1) Applied indexes wherever applicable.
2) Use XMLBeans for parsing the XMLs (size of each XML is not very huge)
3) Removed all unnecessary validations, transformatios, etc.

The application runs on:
Operating system: RHEL 5.4 64 bit
Platform: JDK 1.6.0_17, 64 bit
Database: Oracle 11g R2 64 bit (2 node cluster)
External MQ: IBM Queue
Internal temporary storage MQ: JBoss MQ
Application Server: Jboss 5.1.0.GA (EAP Version)

The order in which we consume and process the XML messages is very important and so we cannot do a parallel processing.

Is there anything else we can do to improve performance?

Solution

WebSphere MQ, even on a small server, can unload messages MUCH faster than the rate you describe. The Windows Performance Report for WMQ V7 tested at more than 2,200 2k persistent round trips (one request and one reply) per second over client channels. That's more than 4,000 messages per second.

The bottleneck in your case would seem to be the latency of processing messages and the dependency on processing the messages in a particular order. The option that could give you the MOST performance boost would be to eliminate the order dependency. When I worked at a bank we had a system that posted transactions in the exact order they arrived and everyone said this requirement was mandatory. However, we eventually revised the system to perform a memo-post during the day and then repost in a later step. The memo-posting occurred in any order and supported parallelism, failover and all the other benefits of multi-instance processing. The final post applied the transactions in logical order (and in fact in an order that was most favorable to the customer) once they were all in the DB. Sequence dependencies lock you into a singleton model and are a worst-case requirement for asynch messaging. Eliminate them if at all possible.

The other area for improvement will be in the parsing and processing of the messages. As long as you are stuck with sequence dependencies, this is the best bet for improving performance.

Finally, you always have the option to throw money at the problem in the form of more memory, CPU, faster disk I/O and so forth. Essentially this is addressing software architecture with horsepower and is never the best solution but often it buys you enough time to address the root cause.