Passing a byte array from Java to Python on Android using Chaquopy

I'm running an Android camera app and I would like to do the image processing in Python. To test this, I want to pass a single image frame to a Python function, divide all values by 2 using integer division, and return the result.

To that end, I have the following code:

In Java:

public void onCapturedImage(Image image)
    {

        Image.Plane[] tmp = image.getPlanes();
        byte[] bytes = null;
        ByteBuffer buffer = tmp[0].getBuffer();
        buffer.rewind();
        bytes = new byte[buffer.remaining()];
        buffer.get(bytes, 0, buffer.remaining());
        buffer.rewind();

        Log.d(TAG, "start python section");

        // assume Python.start() is called elsewhere

        Python py = Python.getInstance();
        PyObject array1 = PyObject.fromJava(bytes);
        Log.d(TAG, "get python module");
        PyObject py_module = py.getModule("mymod");
        Log.d(TAG, "call pic func");

        byte [] result  = py_module.callAttr("pic_func", array1).toJava(byte[].class);
        // compare the values at an arbitrary location to make sure the result is as expected
        Log.d(TAG, "Compare: "+Byte.toString(bytes[33]) + " and " + Byte.toString(result[33]));
        Log.d(TAG,"DONE");

    }

In Python, I have the following:

import numpy as np

def pic_func(o):
    a = np.array(o)
    b = a//2
    return b.tobytes()

I have several issues with this code.

1. It does not behave as expected: the value at location 33 is not half of the original. I probably have a mix-up with the byte values, but I'm not sure what's going on exactly. The same code without "tobytes" and using a Python list rather than a NumPy array does work as expected.
2. Passing parameters: I'm not sure what happens under the hood. Is it pass by value or by reference? Is the array being copied, or is just a pointer being passed around?
3. It is SLOW: it takes about 90 seconds to compute this operation over 12 million values. Any pointers on speeding this up?

Thanks!

Solution

Your last two questions are related, so I'll answer them together.

PyObject array1 = PyObject.fromJava(bytes)
py_module.callAttr("pic_func", array1)

This passes by reference: the Python code receives a jarray object which accesses the original array.

np.array(o)

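As of Chaquopy 8.x, this is a direct memory copy when o is a Java primitive array, so performance shouldn't be a problem. On older versions of Chaquopy, you can avoid a slow element-by-element copy by converting to a Python bytes object first, which can be done in either language:

In Java: PyObject array1 = py.getBuiltins().callAttr("bytes", bytes)
Or in Python: np.frombuffer(bytes(o), dtype=np.uint8)

Note that once the data is a Python bytes object, calling np.array on it would produce a single byte-string value rather than one element per byte, so build the array with np.frombuffer and an explicit dtype (np.int8 or np.uint8). With the Java variant, pic_func should likewise call np.frombuffer on the bytes object it receives.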
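
As a rough sketch of the bytes route in plain Python, the following uses a bytes literal as a stand-in for the Java array (an assumption, since it runs outside Chaquopy; the names raw, pixels and halved are illustrative only). It also shows why np.array applied directly to a bytes object is not what you want: that call yields a zero-dimensional byte-string array, whereas np.frombuffer with an explicit dtype gives one element per byte.

```python
import numpy as np

# Stand-in for bytes(o); in the app, o would be the Java byte[] (assumption).
raw = bytes([0, 2, 100, 200, 255])

# np.array on a bytes object gives a 0-d byte-string array, not five numbers.
as_string = np.array(raw)

# np.frombuffer reinterprets the buffer as one uint8 element per byte,
# without copying the data.
pixels = np.frombuffer(raw, dtype=np.uint8)

# Integer division creates a new array; pixels itself is a read-only view.
halved = pixels // 2

# One output byte per element, because the dtype is one byte wide.
result = halved.tobytes()

print(as_string.shape, pixels.shape)
print(list(result))  # [0, 1, 50, 100, 127]
```

np.frombuffer is used here because it makes the element width explicit; whether np.int8 or np.uint8 is the right choice depends on how the app interprets the pixel values.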

b.tobytes()
toJava(byte[].class)

Both of these expressions also make a copy, but each is a direct memory copy rather than an element-by-element conversion, so performance shouldn't be a problem.

As for it returning the wrong answer, I think that's probably because NumPy is not using a one-byte data type: if np.array picks a wider default type (such as a 64-bit integer or float64), tobytes() returns several bytes per element, so index 33 of the result no longer corresponds to element 33 of the input. When calling np.array, you should specify the data type explicitly by passing dtype=np.int8 or dtype=np.uint8. (If you search for byte[] in the Chaquopy documentation you'll find the exact details of how signed/unsigned conversion works, but it's probably easier just to try both and see which one gives the answer you expect.)
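
To make the data type point concrete, here is a small sketch in plain Python. It assumes a Python list and a bytes literal as stand-ins for the Java array (this runs outside Chaquopy, and the variable names are illustrative). The first part shows how an inferred dtype changes the length of the tobytes() output; the second shows how the same byte halves differently when read as signed or unsigned.

```python
import numpy as np

# Stand-in for the values of the Java byte[] (assumption).
values = [10, 20, 30, 40]

inferred = np.array(values)                   # dtype chosen by NumPy
explicit = np.array(values, dtype=np.uint8)   # exactly one byte per element

inferred_bytes = (inferred // 2).tobytes()
explicit_bytes = (explicit // 2).tobytes()

# The inferred dtype is wider than one byte, so the output is longer
# than the input and byte index i is no longer element i.
print(inferred.dtype.itemsize, len(inferred_bytes))
print(list(explicit_bytes))  # [5, 10, 15, 20]

# The same raw byte (0xC8) read as signed versus unsigned.
raw = bytes([200])
signed = np.frombuffer(raw, dtype=np.int8)
unsigned = np.frombuffer(raw, dtype=np.uint8)

print(int(signed[0]), int(unsigned[0]))                # -56 200
print(int((signed // 2)[0]), int((unsigned // 2)[0]))  # -28 100
```

For image data, where values are usually meant as 0 to 255, the unsigned reading is likely the one that matches the expected result, but that depends on how the Java side interprets the returned bytes.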