Tags: java, java-native-interface, protocol-buffers, outputstream, bytebuffer

Efficiently passing GPB serialized data from Java to C++ using JNI


I'm currently looking for a way to pass complex data structures from Java to a native (C++) DLL and vice versa. After reading many articles about JNI and its overhead, I started looking for an efficient way to serialize the data on the Java/native side and pass it to the other side as a single chunk, which saves many JNI calls. Currently I use Google Protocol Buffers (GPB) to serialize many complex classes and pass the data to the native layer in a single JNI call. It looks good: for example, serializing 4000 small classes on the Java side, making the JNI call, and deserializing on the native side takes ~1.6 ms on an i7.

Serialization on the Java side is done by calling GPB_GENERATED_CLASS.build().toByteArray(), which actually creates a new byte array on every call. The array data is then copied into a direct ByteBuffer using the put() function.

The Google Protocol Buffers serializer provides another function that can write the serialized data to an OutputStream.

My questions are:

  1. Is there a way to pass the serialized data to JNI (native) without copying and creating new objects on every call?
  2. Is there a way to allocate byte arrays or direct ByteBuffers once and reuse them to pass the data from Java to JNI and vice versa (using an OutputStream?)?
  3. Any other tips that might improve performance and reduce garbage collection would be very welcome.

Thanks


Solution

  • The Google Protocol Buffers (GPB) Message implementation also offers a writeTo method.

    ByteArrayOutputStream is very close to what you are looking for, but its toByteArray() method returns a new copy of the internal byte array on every call, which is exactly what you are trying to avoid. If you instead write your own OutputStream implementation, similar to ByteArrayOutputStream, you control the lifecycle of the internal byte array or direct ByteBuffer. That should let you improve performance and reduce the number of short-lived objects.
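
As a rough illustration of the custom OutputStream idea, here is a minimal sketch. The class name DirectBufferOutputStream and its reset(), size() and buffer() methods are hypothetical names made up for this example, and a fixed capacity chosen up front is an assumption; a real implementation would need a policy for messages that do not fit.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;

/**
 * Hypothetical helper (not part of protobuf or the JDK): an OutputStream
 * that writes into one direct ByteBuffer allocated once and reused.
 */
final class DirectBufferOutputStream extends OutputStream {
    private final ByteBuffer buffer;

    DirectBufferOutputStream(int capacity) {
        // Allocated once; native code can reach this memory through JNI
        // without the JVM copying it.
        this.buffer = ByteBuffer.allocateDirect(capacity);
    }

    @Override
    public void write(int b) {
        // Throws BufferOverflowException (unchecked) if the buffer is full.
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] src, int off, int len) {
        // Bulk copy; also throws BufferOverflowException if it does not fit.
        buffer.put(src, off, len);
    }

    /** Rewinds the buffer so the next message overwrites the previous one. */
    void reset() {
        buffer.clear();
    }

    /** Number of bytes written since the last reset(). */
    int size() {
        return buffer.position();
    }

    /** The underlying direct buffer, to be handed to the native method. */
    ByteBuffer buffer() {
        return buffer;
    }
}
```

Assuming a generated message instance named message, usage would look roughly like out.reset(); message.writeTo(out); followed by a native call that receives out.buffer() and out.size(). Note that writeTo(OutputStream) may still use a temporary internal buffer of its own, so this removes the toByteArray() allocation and the extra put() copy rather than every intermediate copy.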

    writeTo
    
    void writeTo(OutputStream output)
    

    Serializes the message and writes it to output. This is just a trivial wrapper around writeTo(CodedOutputStream). This does not flush or close the stream.

    NOTE: Protocol Buffers are not self-delimiting. Therefore, if you write any more data to the stream after the message, you must somehow ensure that the parser on the receiving end does not interpret this as being part of the protocol message. This can be done e.g. by writing the size of the message before the data, then making sure to limit the input to that size on the receiving end (e.g. by wrapping the InputStream in one which limits the input). Alternatively, just use writeDelimitedTo(OutputStream).

    Throws: IOException
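
The size-prefix approach mentioned in the note can be sketched as follows when several messages share one reusable buffer. This is only an illustration under stated assumptions: LengthPrefixedFrames, putFrame and nextFrame are hypothetical names, the payload bytes stand in for an already serialized message, and the fixed 4-byte big-endian prefix is a choice made for this example (writeDelimitedTo uses a varint-encoded size instead). The native side would have to read the prefix with the same byte order.

```java
import java.nio.ByteBuffer;

/** Hypothetical helper: fixed 4-byte length prefix in front of each payload. */
final class LengthPrefixedFrames {

    /** Appends the payload length (big-endian int) followed by the payload bytes. */
    static void putFrame(ByteBuffer dst, byte[] payload, int off, int len) {
        dst.putInt(len);
        dst.put(payload, off, len);
    }

    /**
     * Reads the next frame and returns a view limited to exactly that
     * frame's bytes, so a parser cannot run into the following message.
     */
    static ByteBuffer nextFrame(ByteBuffer src) {
        int len = src.getInt();
        ByteBuffer frame = src.slice(); // shares memory, no copy
        frame.limit(len);
        src.position(src.position() + len); // skip past this frame
        return frame;
    }
}
```

The reader side is shown in Java only to keep the example self-contained; in the scenario from the question, the equivalent of nextFrame would live in the C++ code that receives the buffer address and total size.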