Tags: java, performance, apache-poi, jna

JNA versus POI theoretical performance


I assume few people have benchmarked both JNA and Apache POI. Which of the two would theoretically perform better?

I'm only interested in parsing and/or building MS Excel files. I've forked the JNA project and have been using that, but POI seems to be used with Excel much more often, and I'm curious whether either has performance benefits, theoretically speaking.

If anyone has experience with both, or has benchmarked them, even better. I may try this myself at some point as well.


Solution

  • Theoretically, you could get faster performance directly implementing C API functions using JNA.

    Note that I'm referring to direct C API implementation, not the use of the existing COM-based implementation.

    To get this theoretical improvement you would have to implement the mappings for those C functions yourself. Every call from Java into native code incurs a small processing overhead, but if you design your code to minimize the number of Java/native calls, you may be able to write code that performs better at a particular task.
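    The trade-off above can be illustrated with a rough cost model: total time is approximately (number of Java/native crossings × overhead per crossing) + (number of cells × native work per cell). The sketch below is only that model in plain Java; it does not call JNA or Excel. The class name, method names, and all timing figures are hypothetical placeholders, not measurements.

```java
// A minimal sketch of a cost model for Java/native call overhead.
// All numbers are hypothetical placeholders, not benchmark results.
final class CrossingCostModel {

    /** Estimated total time in nanoseconds: crossings * overhead + cells * work. */
    public static long estimateNanos(long crossings, long overheadPerCrossingNanos,
                                     long cells, long workPerCellNanos) {
        return crossings * overheadPerCrossingNanos + cells * workPerCellNanos;
    }

    /** Number of native calls needed when each call handles up to batchSize cells. */
    public static long crossingsFor(long cells, long batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        return (cells + batchSize - 1) / batchSize; // ceiling division
    }

    public static void main(String[] args) {
        long cells = 1_000_000;
        long overheadNanos = 100; // assumed cost of one Java-to-native crossing
        long workNanos = 10;      // assumed native work per cell

        // One native call per cell versus one native call per 10,000 cells.
        long perCell = estimateNanos(crossingsFor(cells, 1), overheadNanos, cells, workNanos);
        long batched = estimateNanos(crossingsFor(cells, 10_000), overheadNanos, cells, workNanos);

        System.out.println("one call per cell:   " + perCell + " ns");
        System.out.println("10,000 cells per call: " + batched + " ns");
    }
}
```

    Under these assumed figures the per-cell design is dominated by crossing overhead, while the batched design is dominated by the native work itself. Real overheads depend on the JNA mapping style, the platform, and the data being marshalled, so only a benchmark can confirm the effect for a given workload.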

    The existing Excel implementation in JNA is:

    • Windows only
    • A user-contributed implementation, not part of core JNA
    • Reliant on the COM layer, which is slower than the direct C API and probably slower than POI
    • Not recently updated, which suggests it is less optimized than Apache POI, which is under continuous development

    So, theoretically, you will get better performance in most cases with Apache POI than with the existing JNA COM-based code. You will also gain the advantages of a cross-platform library that is much more widely used, tested, and optimized.

    Theoretically, you can get better performance on a specific platform at a specific task by directly implementing C functions via JNA.