
How to decode JeroMQ byte array in PyZMQ?


I am trying to hook up a JeroMQ publisher to a PyZMQ subscriber. It works well but I don't know how to decode/deserialize the data I am getting on the Python side.

For example, here is a byte array that I am sending from Java: [10, 10, 7, 55, 79]. My goal is to recover the same array of integers in Python. In practice, I am getting b"\n\n\xf97O" on the Python end. I was hoping that bytes.decode("utf-8") would give me something like 101075579, but apparently utf-8 is the wrong codec. Do you know what kind of object b"\n\n\xf97O" is?

Here is the code I am using:

Java side

ZContext context = new ZContext();
ZMQ.Socket broadcastSocket = context.createSocket(ZMQ.PUB);
broadcastSocket.bind("tcp://*:55555");

byte[] payload = new byte[] {10, 10, 7, 55, 79};
broadcastSocket.send(payload);

Python side

import zmq

context = zmq.Context()
socket = context.socket(zmq.SUB)
socket.setsockopt_string(zmq.SUBSCRIBE, "")  # empty prefix: subscribe to every message
socket.connect("tcp://127.0.0.1:55555")

while True:
    message = socket.recv()  # returns a bytes object
    print(message) # outputs b"\n\n\xf97O"

Do you have an idea of how to solve this problem? Note that ZMQ.Socket.sendMore(String) sends objects that do get recognized by Python as strings of bytes but I am not sure how to properly parse them.

Thanks in advance.


Solution

  • It's a bytes object, also called a byte string. If you convert it to a list, you get a list of integers, one per byte:

    >>> list(b"\n\n\xf97O")
    [10, 10, 249, 55, 79]
    

    and you can also subscript and iterate over it directly without even using list, e.g. message[4] will be 79.

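    (I'm not sure about the discrepancy between 7 and 249. One possibility is that the Java array actually contained -7: Java bytes are signed, and -7 is stored as the byte 0xF9, which Python shows as 249. Otherwise, I'm guessing you miscopied something on your end, or used data from two different runs.)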
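
As a minimal sketch of the conversion, the snippet below takes the exact payload printed in the question and turns it into integers in two ways. The unsigned view is what list() gives you. The signed view uses the standard struct module and assumes you want Java's byte semantics (-128 to 127) on the Python side; whether the Java array really held -7 is a guess, not something the question confirms.

```python
import struct

# The payload printed by the subscriber in the question.
message = b"\n\n\xf97O"

# Unsigned view: each byte becomes an int in the range 0..255.
unsigned = list(message)
print(unsigned)  # [10, 10, 249, 55, 79]

# Signed view, matching Java's byte type (-128..127).
# The format string "5b" means five signed 8-bit integers.
signed = list(struct.unpack(f"{len(message)}b", message))
print(signed)  # [10, 10, -7, 55, 79]

# Indexing and iteration work directly on the bytes object.
print(message[4])  # 79
```

If the values on the Java side are always non-negative and below 128, the two views are identical and list(message) is enough.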