Tags: performance, java-native-interface

JNI Performance suffers with simple primitives


I've got a JNI method with the following signature:

JNIEXPORT jboolean JNICALL Java_MovieWriter_addFrameN(JNIEnv *env, jobject obj, jintArray jFrameBuffer, jlong jDuration);

This method belongs to a program that uses native code to export a movie. The program calls it to add a frame to the movie: jFrameBuffer is an int[] containing the pixel data, and jDuration is the duration of the frame. Simple.

On short movies everything appears to work just fine. However, with movies of 5000+ frames, performance suffers: this method takes about 1 second to execute, when it normally executes in a very small fraction of a second for the same number of pixels. Eventually the Java program crashes, leaving me the following hs_err log (I've included just the top; let me know if you'd like to see the whole thing):

# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 691200 bytes for AllocateHeap
# Possible reasons:
#   The system is out of physical RAM or swap space
#   In 32 bit mode, the process size limit was hit
# Possible solutions:
#   Reduce memory load on the system
#   Increase physical memory or swap space
#   Check if swap backing store is full
#   Use 64 bit Java on a 64 bit OS
#   Decrease Java heap size (-Xmx/-Xms)
#   Decrease number of Java threads
#   Decrease Java thread stack sizes (-Xss)
#   Set larger code cache with -XX:ReservedCodeCacheSize=
# This output file may be truncated or incomplete.
#
#  Out of Memory Error (memory/allocation.inline.hpp:61), pid=1880, tid=4008
#
# JRE version: Java(TM) SE Runtime Environment (7.0_45-b18) (build 1.7.0_45-b18)
# Java VM: Java HotSpot(TM) Client VM (24.45-b08 mixed mode windows-x86 )
# Failed to write core dump. Minidumps are not enabled by default on client versions of Windows
#

I have simplified things to narrow down the issue and my method body is now as follows:

JNIEXPORT jboolean JNICALL Java_MovieWriter_addFrameN(JNIEnv *env, jobject obj, jintArray jFrameBuffer, jlong jDuration) {

    jsize jLength = env->GetArrayLength(jFrameBuffer);
    int length = (int)jLength;
    long duration = (long)jDuration;

    //add a dummy frame
    HRESULT hr = S_OK;
    DWORD* pixels = new DWORD[length];
    for (int j = 0; j < length; j++) { // int index avoids a signed/unsigned comparison with length
        pixels[j] = 0x0000FF00;
    }
    hr = writer.addFrame(pixels, duration, true);
    delete[] pixels;
    return SUCCEEDED(hr);
}

I have found that if I hard-code values for length and duration instead of taking them from the JNI values jLength and jDuration respectively, the performance issue does not occur.

int length = 640 * 480; // (int)jLength;
long duration = 10000000 / 25; // (long)jDuration;

This is astounding to me. Can anyone explain what is going on and how I can fix the problem? The fact that I can't even pass primitives from Java to C without a performance problem sorta seems ludicrous. I must be doing something wrong.

Watching the program's memory usage in the Task Manager shows that it is not continually increasing, so I don't believe I have any major leaks anywhere, and it stays well below memory limits that I can successfully hit doing other operations.


Solution

  • Ok, here is what was happening: it was indeed a case of misunderstanding.

    The reason I was experiencing a 'performance delay' was twofold:

    1. When the duration was coming from JNI, it varied: sometimes it was short, for example 1/25 of a second; other times it was long, for example 3 seconds. When it was long, Windows Media Foundation (what I was using to write the frame) would break it up into frame-rate-sized pieces, so at 25 frames per second a frame with a 3 second duration would take 75 times longer to write than a 1/25 second frame. So this was part of my perceived performance delay, but in fact it was simply doing more work. Fine. That made sense.

    2. The second aspect was that, after about 100,000 frames had been sent to my Windows Media Foundation IMFSinkWriter, it would bog down. This was the crux of the mystery. It turns out that IMFSinkWriter::NotifyEndOfSegment needed to be called about once every second to keep this from happening. Nowhere did I ever see this documented, and boy did it cause me a lot of grief.
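
To make the arithmetic in point 1 concrete, here is a minimal sketch of how many frame-rate-sized pieces a single frame duration turns into. It assumes durations are in 100-nanosecond units (consistent with the hard-coded 10000000 / 25 in the question) and that a partial piece is rounded up. The helper name piecesForDuration and the constant kUnitsPerSecond are made up for illustration; they are not part of Media Foundation.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: durations are expressed in 100-nanosecond units,
// so one second is 10,000,000 units (as in 10000000 / 25 in the question).
constexpr std::int64_t kUnitsPerSecond = 10000000;

// Hypothetical helper: how many frame-rate-sized pieces are needed to
// cover `duration` at `fps` frames per second (a partial piece rounds up).
std::int64_t piecesForDuration(std::int64_t duration, std::int64_t fps) {
    assert(duration > 0 && fps > 0);
    const std::int64_t frameDuration = kUnitsPerSecond / fps;
    return (duration + frameDuration - 1) / frameDuration;
}
```

At 25 frames per second, piecesForDuration(kUnitsPerSecond / 25, 25) gives 1 and piecesForDuration(3 * kUnitsPerSecond, 25) gives 75, which matches the "75 times longer" observation.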
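
For point 2, one way to get the "about once every second" cadence is to keep a running total of the durations written and fire a notification whenever a second's worth has accumulated. The sketch below is only that bookkeeping, and it assumes "once every second" means media time (the sum of frame durations in 100-nanosecond units) rather than wall-clock time. SegmentPacer and kSecond are hypothetical names, and the callback stands in for the real call so the example compiles without the Windows headers.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>

// Assumption: media time in 100-nanosecond units, 10,000,000 per second.
constexpr std::int64_t kSecond = 10000000;

// Hypothetical helper, not part of Media Foundation: accumulates the media
// time written so far and invokes `notify` once at least `interval` has
// accumulated since the previous notification.
class SegmentPacer {
public:
    SegmentPacer(std::int64_t interval, std::function<void()> notify)
        : interval_(interval), notify_(std::move(notify)) {
        assert(interval_ > 0);
    }

    // Call after each frame has been handed to the writer.
    void frameWritten(std::int64_t duration) {
        sinceLastNotify_ += duration;
        if (sinceLastNotify_ >= interval_) {
            notify_();            // in real code: the end-of-segment call
            sinceLastNotify_ = 0; // start counting the next segment
        }
    }

private:
    std::int64_t interval_;
    std::function<void()> notify_;
    std::int64_t sinceLastNotify_ = 0;
};
```

In the actual writer, the callback would wrap the sink writer call, something like sinkWriter->NotifyEndOfSegment(streamIndex), with sinkWriter and streamIndex being whatever the program already uses. With 25 fps frames and an interval of kSecond, the callback fires after every 25th frame.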