Search code examples
androidmultithreadingperformancepixel-shadernexus-4

What causes the periodic performance spikes that are seen when doing computationally expensive array processing on the Nexus 4?


I'm new to threads (don't kill me for my implementation below :) and I need to do multiple blurring passes of pixels on a separate thread (see below). It's not the most efficient implementation of box blur (it's from Gaussian Filter without using ConvolveOp) but the performance spikes don't occur on the Nexus 7 tablet but they do occur on the Nexus 4 phone.

I've posted my testing sample (running on Android 4.2 - see below).

I don't think it is caused by the GC thrashing the memory (it doesn't coincide with the spikes).

I think it might be something to do with cache locality or hardware memory thrashing - but I'm not sure.

What would cause the spikes? Sometimes they are sudden onset - e.g. spike of 50%. Sometimes they are slow onset - e.g. spikes increasing/decreasing monotonically, with spikes as follows -> 5%, 10%, 20%, 10%, 5%.

How could I stop them from occurring when doing heavy array processing?

This doesn't occur on the Nexus 7 tablet which I have also tested (see results below)

Side question: What is the best way to sleep and restart my thread correctly (new to threads)?


MainActivity.java


package com.example.test;

import android.os.Bundle;
import android.app.Activity;

public class MainActivity extends Activity {

    private MainThread thread;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        thread = new MainThread();

        thread.setRunning(true);
        thread.start();

        setContentView(R.layout.activity_main);
    }

    @Override
    protected void onResume() {

        super.onResume();
        thread.setRunning(true);

    }

    @Override
    protected void onPause() {

        super.onPause();
        thread.setRunning(false);

    }

}

MainThread.java


package com.example.test;

import android.util.Log;

public class MainThread extends Thread {

    int[] pixels;
    int kernel_rows = 2;
    int kernel_cols = 2;
    int width = 512;
    int height = 512;

    @Override
    public void run() {

        while (running) {

            long start = System.currentTimeMillis();

            for (int row = kernel_rows / 2; row < height - kernel_rows / 2; row++) {
                for (int col = kernel_cols / 2; col < width - kernel_cols / 2; col++) {

                    float pixel = 0;

                    // iterate over each pixel in the kernel
                    for (int row_offset = 0; row_offset < kernel_rows; row_offset++) {
                        for (int col_offset = 0; col_offset < kernel_cols; col_offset++) {

                            // subtract by half the kernel size to center the
                            // kernel
                            // on the pixel in question
                            final int row_index = row + row_offset
                                    - kernel_rows / 2;
                            final int col_index = col + col_offset
                                    - kernel_cols / 2;

                            pixel += pixels[row_index * width + col_index] * 1.0f / 4.0f;

                        }
                    }

                    pixels[row * width + col] = (int) pixel;

                }

            }

            long stop = System.currentTimeMillis();

            long delta = stop - start;

            Log.d("DELTA", Long.toString(delta));

        }

    }

    private boolean running;

    public void setRunning(boolean running) {

        this.pixels = new int[512 * 512];
        this.running = running;

    }

}

Logs


Nexus 4 phone (ms):

01-13 10:56:05.663: D/DELTA(13507): 76
01-13 10:56:05.773: D/DELTA(13507): 107
01-13 10:56:05.843: D/DELTA(13507): 77
01-13 10:56:05.923: D/DELTA(13507): 75
01-13 10:56:06.053: D/DELTA(13507): 127
01-13 10:56:06.133: D/DELTA(13507): 78
01-13 10:56:06.213: D/DELTA(13507): 81
01-13 10:56:06.293: D/DELTA(13507): 80
01-13 10:56:06.353: D/DELTA(13507): 77
01-13 10:56:06.433: D/DELTA(13507): 79
01-13 10:56:06.513: D/DELTA(13507): 79
01-13 10:56:06.624: D/DELTA(13507): 106
01-13 10:56:06.694: D/DELTA(13507): 76

Nexus 7 tablet (ms):

01-13 11:01:03.283: D/DELTA(3909): 84
01-13 11:01:03.373: D/DELTA(3909): 85
01-13 11:01:03.453: D/DELTA(3909): 85
01-13 11:01:03.543: D/DELTA(3909): 84
01-13 11:01:03.623: D/DELTA(3909): 85
01-13 11:01:03.703: D/DELTA(3909): 84
01-13 11:01:03.793: D/DELTA(3909): 85
01-13 11:01:03.873: D/DELTA(3909): 84
01-13 11:01:03.963: D/DELTA(3909): 85
01-13 11:01:04.043: D/DELTA(3909): 84

Solution

  • I think I may have somewhat mitigated this effect on the Nexus 4. There is still some variability in calculation consistency but it is bearable - I think - can't see too many huge spikes - outside of thread startup/shutdown. I've done this using the Android NDK and c p_threads to spawn a native thread that is mostly left alone by Java (or so I'm told) until the foreground application is changed or closed.

    Here is the code:


    MainActivity.java


    package com.example.test;
    
    import android.os.Bundle;
    import android.app.Activity;
    
    public class MainActivity extends Activity {
    
        static {
    
            System.loadLibrary("native");
    
        }
    
        private native void init();
    
        @Override
        protected void onCreate(Bundle savedInstanceState) {
    
            super.onCreate(savedInstanceState);
    
            // Initializes and spawns native thread
            init();
    
            setContentView(R.layout.activity_main);
    
        }
    
    }
    

    native.c


    (which should be put into a jni folder at the root of the Android project)

    #include <time.h>
    #include <pthread.h>
    #include <jni.h>
    #include <android/log.h>
    
    #define APPNAME "DELTA"
    
    int* pixels;
    int kernel_rows = 2;
    int kernel_cols = 2;
    int width = 60;
    int height = 39;
    
    int running = 1;
    
    // from android samples
    /* return current time in milliseconds */
    static double now_ms(void) {
    
        struct timespec res;
        clock_gettime(CLOCK_REALTIME, &res);
        return 1000.0 * res.tv_sec + (double) res.tv_nsec / 1e6;
    
    }
    
    // initialize thread/begin it
    jint Java_com_example_testa_MainActivity_init(JNIEnv* env, jobject javaThis) {
    
        int i1 = 1;
    
        pthread_t thread;
        void *run();
    
        pthread_create(&thread, NULL, run, &i1);
        pthread_join(thread, NULL);
    
        return 0;
    
    }
    
    // thread function
    void *run(int *x) {
    
        // init pixels within thread
        pixels = (int*) malloc(sizeof(int) * width * height);
    
        // loop until stopped - java won't interfere
        // unless closed/switch application (or so I'm told)
        while (running) {
    
            double start = now_ms();
    
            int row, col, row_offset, col_offset;
    
            for (row = kernel_rows / 2; row < height - kernel_rows / 2; row++) {
    
                for (col = kernel_cols / 2; col < width - kernel_cols / 2; col++) {
    
                    float pixel = 0;
    
                    // iterate over each pixel in the kernel
                    for (row_offset = 0; row_offset < kernel_rows; row_offset++) {
                        for (col_offset = 0; col_offset < kernel_cols;
                                col_offset++) {
    
                            // subtract by half the kernel size to center the
                            // kernel
                            // on the pixel in question
    
                            int row_index = row + row_offset - kernel_rows / 2;
                            int col_index = col + col_offset - kernel_cols / 2;
    
                            pixel += pixels[row_index * width + col_index] * 1.0f
                                    / 4.0f;
    
                        }
    
                    }
    
                    pixels[row * width + col] = (int) pixel;
    
                }
    
            }
    
            double end = now_ms();
    
            double delta = end - start;
    
            __android_log_print(ANDROID_LOG_VERBOSE, APPNAME, "%f", delta);
    
        }
    
        pthread_exit(0);
    
    }
    

    Android.mk


    (which should be put into a jni folder at the root of the Android project)

    LOCAL_PATH := $(call my-dir)
    MY_PATH := $(LOCAL_PATH)
    include $(call all-subdir-makefiles)
    
    include $(CLEAR_VARS)
    
    LOCAL_PATH := $(MY_PATH)
    
    LOCAL_MODULE    := native
    LOCAL_LDLIBS := -llog
    LOCAL_SRC_FILES := native.c
    
    include $(BUILD_SHARED_LIBRARY)
    

    Summary


    Reduced cost of code by ~20-30% and decreased variability by an order of magnitude.

    The code is compiled by executing the ndk-build command from the NDK library provided by Android at the root folder (found here: http://developer.android.com/tools/sdk/ndk/index.html).


    Results


    Nexus 4 (ms):

    01-14 13:41:21.132: V/DELTA(23679): 56.554199
    01-14 13:41:21.192: V/DELTA(23679): 58.568604
    01-14 13:41:21.252: V/DELTA(23679): 59.484131
    01-14 13:41:21.302: V/DELTA(23679): 56.768066
    01-14 13:41:21.362: V/DELTA(23679): 54.692383
    01-14 13:41:21.412: V/DELTA(23679): 51.823730
    01-14 13:41:21.472: V/DELTA(23679): 55.668945
    01-14 13:41:21.522: V/DELTA(23679): 56.920654
    01-14 13:41:21.582: V/DELTA(23679): 56.371094
    01-14 13:41:21.642: V/DELTA(23679): 58.507568
    01-14 13:41:21.702: V/DELTA(23679): 59.697754
    01-14 13:41:21.752: V/DELTA(23679): 53.990723
    01-14 13:41:21.812: V/DELTA(23679): 55.669189
    

    Nexus 7 (ms):

    01-14 13:41:25.685: V/DELTA(2916): 65.867920
    01-14 13:41:25.745: V/DELTA(2916): 65.986816
    01-14 13:41:25.815: V/DELTA(2916): 66.685059
    01-14 13:41:25.885: V/DELTA(2916): 67.033936
    01-14 13:41:25.945: V/DELTA(2916): 65.703857
    01-14 13:41:26.015: V/DELTA(2916): 66.653076
    01-14 13:41:26.085: V/DELTA(2916): 66.922119
    01-14 13:41:26.145: V/DELTA(2916): 67.030029
    01-14 13:41:26.215: V/DELTA(2916): 67.014893
    01-14 13:41:26.285: V/DELTA(2916): 67.034912
    01-14 13:41:26.345: V/DELTA(2916): 67.089844
    01-14 13:41:26.415: V/DELTA(2916): 65.860107
    01-14 13:41:26.485: V/DELTA(2916): 65.642090
    01-14 13:41:26.545: V/DELTA(2916): 65.574951
    01-14 13:41:26.615: V/DELTA(2916): 65.991943