
Socket communication between server app and Matlab client using Java


I have written a C++ server app that I would like to control from Matlab. I have used a mex function for socket communication so far, but I would like to ditch the mex function and use inline Java directly in the m-files, which should be a more streamlined solution.

My C++ based standalone app expects a message with the following data in the following order . . .

This part of the protocol is fixed and cannot be changed:

  • uint32 magic_number - this is a magic number (445566) that must be at the start of the message or the rest of the message will be ignored.

  • uint32 num_bytes - this is the number of bytes used for the rest of the message block (excluding these initial 8 bytes)

This part of the protocol was designed by me and can be changed:

  • Next comes a header made of 4 uint8 values (like an ipv4 address) signalling to the app what the following data represents (if any data follows)

  • After this, the remaining bytes can represent many different things. Most commonly this would be a string (key value) followed by a long array of floating point values (audio data). However, there may just be a string, or they may just be an array of floating point values. The 4 uint8 values let the server know what to expect here.

As you can see below, I am currently squeezing everything into an array of uint8 (a colossal kludge). This is because the Java "write" function expects a byte array, and a Matlab uint8 array is a compatible data type, as I found from the table on the MathWorks site, Passing Data to a Java Method.

I'm not a Java programmer, but I have managed to get a very simple bit of communication code up and running this afternoon. Can anyone help me make this better?

import java.net.Socket
import java.io.*

mySocket = Socket('localhost', 12345);
output_stream   = mySocket.getOutputStream;
d_output_stream = DataOutputStream(output_stream);


data = zeros(12,1,'uint8');

%Magic key: use this combination of uint8s to make
% a uint32 value of = 445566 -> massive code-smell
data(1) = 126;
data(2) = 204;
data(3) = 6;

%Size of message block:
%total number of bytes in following message including header
%This is another uint32 i.e. (data(5:8))

data(5) = 4;

%header B: a group of 4 uint8s
data(9) = 1;
data(10) = 2;
data(11) = 3;
data(12) = 4;

%Main block of floats
%????


d_output_stream.write(data,0,numel(data));


pause(0.2);
mySocket.close;

I have experimented with sending a java object composed of the different parts of the data that I would like to send, but I am not sure how they end up ordered in memory. In C/C++ it is very easy to append different data types in a contiguous block of memory and then send it. Is there a simple way for me to do this here in Java? I would eventually like to make the communications 2-way also, but this can wait for now. Thanks for reading.


Solution

  • There are at least two separate issues here. One is how to structure Matlab code that speaks a protocol like this. The other is how to represent possibly complex data in this wire protocol you have.

    As for organizing the Matlab code, you could use a class to hold the message in a more structured manner, and use typecast to convert the numbers down to bytes. Maybe something like this. This assumes your client and server have the same native representation of primitive types, and ignores network byte ordering (htonl/ntohl).

    classdef learnvst_message
        %//LEARNVST_MESSAGE Message for learnvst's example problem
        %
        % Examples:
        % msg = learnvst_message;
        % msg.payload = { 'Hello world', 1:100 }
        % msg.payloadType = uint8([ 5 12 0 0 ]);  % guessing on this
    
        properties
            magicNumber = uint32(445566);
            payloadType = zeros(4, 1, 'uint8');  %// header B
            payload = {};
        end
    
        methods
            function out = convertPayload(obj)
            %//CONVERTPAYLOAD Converts payload to a single array of bytes
            byteChunks = cellfun(@convertPayloadElement, obj.payload, 'UniformOutput',false);
            out = cat(2, byteChunks{:});
            end
    
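            function out = marshall(obj)
            %//MARSHALL Converts the message to header and payload byte arrays
            payloadBytes = convertPayload(obj);
            %// Bytes after the first 8: 4-byte payload type plus the payload
            messageSize = uint32(4 + numel(payloadBytes));
            %// Wire order: magic number, message size, then payload type
            out.headerBytes = [
                typecast(obj.magicNumber, 'uint8') ...
                typecast(messageSize, 'uint8') ...
                reshape(obj.payloadType, 1, [])];  %// force a row vector
            out.payloadBytes = payloadBytes;
            end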
    
            function sendTo(obj, host, port)
            %//SENDTO Opens a socket to host:port and writes the message
            m = marshall(obj);
            mySocket = java.net.Socket(host, port);
            d_output = mySocket.getOutputStream();
            d_output.write(m.headerBytes, 0, numel(m.headerBytes));
            d_output.write(m.payloadBytes, 0, numel(m.payloadBytes));
            mySocket.close();
            end
    
        end
    end
    
    function out = convertPayloadElement(x)
    %//CONVERTPAYLOADELEMENT Converts one payload element to a row of bytes
    if isnumeric(x)
        % Flatten to a row first so the chunks concatenate along dim 2
        out = typecast(x(:).', 'uint8');
    elseif ischar(x)
        % Assumes receiver likes 16-bit Unicode chars
        out = typecast(uint16(x(:).'), 'uint8');
    else
        % ... fill in other types here ...
        % or define a payload_element class that marshalls itself and call
        % it polymorphically
        error('Unsupported payload element type: %s', class(x));
    end
    end
    

    More readable, I think, and a bit less code smell. As a caller, you can work with the data in a more structured form, and the conversion to wire-protocol bytes is encapsulated inside the class's marshalling method. That "convertPayload" is what "stitches together a generic block of memory made of many different data types". In Matlab, a uint8 array is a way to append representations of different data types together in a contiguous block of memory. It's basically a wrapper around an unsigned char [], with automatic reallocation. And typecast(...,'uint8') is sort of the equivalent of doing a reinterpret_cast to char * in C/C++. See the help for both of them.
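
If the two ends might ever disagree on endianness, one option is to do the packing on the Java side, where ByteBuffer lets you choose the byte order explicitly instead of inheriting the native one. Here is a minimal sketch, assuming the server reads little-endian uint32 values (which matches the bytes hand-built in the question). The HeaderBytes class and its build method are made-up names for illustration, not part of any library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical helper (not from the original answer): packs the fixed 8-byte
// prefix and the 4-byte type header using an explicitly chosen byte order.
final class HeaderBytes {
    // magic is passed as a long so the full uint32 range fits; only the low
    // 32 bits are written.
    static byte[] build(long magic, byte[] payloadType, int payloadLength, ByteOrder order) {
        if (payloadType.length != 4) {
            throw new IllegalArgumentException("payloadType must be 4 bytes");
        }
        ByteBuffer buf = ByteBuffer.allocate(12).order(order);
        buf.putInt((int) magic);        // uint32 magic_number
        buf.putInt(4 + payloadLength);  // uint32 num_bytes: type header + payload
        buf.put(payloadType);           // 4 x uint8 type header
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] h = build(445566L, new byte[] {1, 2, 3, 4}, 0, ByteOrder.LITTLE_ENDIAN);
        StringBuilder sb = new StringBuilder();
        for (byte b : h) {
            sb.append(b & 0xFF).append(' ');  // show bytes as unsigned values
        }
        System.out.println(sb.toString().trim());
    }
}
```

With ByteOrder.LITTLE_ENDIAN this prints 126 204 6 0 4 0 0 0 1 2 3 4, the same twelve bytes as the hand-built array in the question. Switching to ByteOrder.BIG_ENDIAN gives network byte order without touching the rest of the code. To call something like this from Matlab, the class would need to be declared public and be on Matlab's Java class path.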

    But this brings up more questions. How does the server know how long each component of the payload is, what its shape is if multidimensional, and what its type is? Or what if they're complex data types - could they nest? You might need to embed little headers inside each of the payload elements. The code above assumes the 4-byte payload type header fully describes the payload contents.
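
To make the "little headers" idea concrete, here is one possible sketch in which every payload element is preceded by a one-byte type tag and a uint32 byte count, so the receiver can walk the payload without guessing. Everything about this framing is an assumption for illustration: the tag values, the little-endian byte order, UTF-8 for strings, and the PayloadElements class name are invented here, not part of the asker's protocol.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical element framing (not the asker's protocol):
// [uint8 tag][uint32 byte count][data], all little-endian.
final class PayloadElements {
    static final byte TAG_STRING = 1;   // assumed tag value: UTF-8 text
    static final byte TAG_FLOAT32 = 2;  // assumed tag value: 32-bit floats

    // Prefixes raw data with its tag and length in bytes.
    static byte[] frame(byte tag, byte[] data) {
        ByteBuffer buf = ByteBuffer.allocate(5 + data.length).order(ByteOrder.LITTLE_ENDIAN);
        buf.put(tag).putInt(data.length).put(data);
        return buf.array();
    }

    static byte[] encodeString(String s) {
        return frame(TAG_STRING, s.getBytes(StandardCharsets.UTF_8));
    }

    static byte[] encodeFloats(float[] values) {
        ByteBuffer buf = ByteBuffer.allocate(4 * values.length).order(ByteOrder.LITTLE_ENDIAN);
        for (float v : values) {
            buf.putFloat(v);
        }
        return frame(TAG_FLOAT32, buf.array());
    }

    // Joins framed elements into one contiguous payload.
    static byte[] concat(byte[]... parts) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (byte[] p : parts) {
            out.write(p, 0, p.length);
        }
        return out.toByteArray();
    }
}
```

A payload of a key string followed by audio samples would then be concat(encodeString("key"), encodeFloats(samples)). Shape information for multidimensional arrays, or nesting, would need more fields in the element header, which is about the point where an existing self-describing format starts to look attractive.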

    It sounds like what you're looking for may be a sort of self-describing format for heterogeneous array-based data. There are existing formats for that, including NetCDF, HDF5, and Matlab's own MAT files. Matlab has built-in support for them, or you could pull in third-party Java libraries for them.

    As for speed, you're going to pay each time you pass data across the Matlab/Java boundary. Large primitive arrays are relatively cheap to convert, so you probably want to pack most of the message up in a byte array in Matlab before passing it to Java, instead of making lots of separate write() calls. In practice it'll depend on how big and complex your data is. See Is MATLAB OOP slow or am I doing something wrong? for a rough idea of the cost of some Matlab operations, including Java calls. (Full disclosure: that's a self-plug.)
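
One way to keep the number of boundary crossings down is a tiny Java helper that takes the whole, already-packed message and does the connect, write, and close in a single call. This is only a sketch under assumptions: MessageSender, writeMessage, and send are hypothetical names, and error handling is left to the caller.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.Socket;

// Hypothetical helper: one call per message instead of several calls for
// opening the socket, fetching the stream, writing chunks, and closing.
final class MessageSender {
    // Writes the complete message with a single write() and flushes it.
    static void writeMessage(OutputStream out, byte[] message) throws IOException {
        out.write(message, 0, message.length);
        out.flush();
    }

    // Opens a connection, sends the message, and always closes the socket.
    static void send(String host, int port, byte[] message) throws IOException {
        try (Socket socket = new Socket(host, port)) {
            writeMessage(socket.getOutputStream(), message);
        }
    }
}
```

The caller would concatenate the header and payload bytes into one array first and pass that in. Whether this is measurably faster than two or three write() calls depends on the message sizes, so it is worth timing with your real data.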