
python - How to read data fully from serial port?


I'm trying to read a JSON string written to a serial port, using the following code based on the PySerial library:

while True:
    if serial_port.in_waiting > 0:
        buffer = serial_port.readline()
        print('buffer=', buffer)
        ascii = buffer.decode('ascii')
        print('ascii=', ascii)

I sniffed the data on the port and confirmed that it is written in full, without any fragmentation:

jpnevulator --ascii --tty "/dev/ttyACM1" --read
7B 22 30 22 3A 31 7D                            {"0":1}
7B 22 30 22 3A 32 7D                            {"0":2}
7B 22 30 22 3A 33 7D                            {"0":3}
7B 22 30 22 3A 34 7D                            {"0":4}
7B 22 30 22 3A 35 7D                            {"0":5}
7B 22 30 22 3A 36 7D                            {"0":6}

However, the code above reads the data in fragments and therefore shows the following results:

buffer= b'{"0'
ascii= {"0
buffer= b'":1}'
ascii= ":1}
buffer= b'{"'
ascii= {"
buffer= b'0":2'
ascii= 0":2
buffer= b'}'
ascii= }

Also, when I use read() instead of readline(), I get the same behavior:

buffer= b'{'
data_str= {
buffer= b'"'
data_str= "
buffer= b'3'
data_str= 3
buffer= b'"'
data_str= "
buffer= b':'
data_str= :
buffer= b'1'
data_str= 1
buffer= b'}'
data_str= }

I even tried other code that uses the same library, but got the same issue.

I'm not sure why I'm encountering such behavior.


Solution

  • I'll attempt to tackle this. :) Your loop waits only until some input is available (serial_port.in_waiting > 0), hence the behavior you are seeing: the read starts as soon as anything can be acquired, even if the message is still incomplete. PySerial does not appear to offer any additional abstraction that would tell you, for example, that the last byte read was an ASCII closing curly brace (I've really just scanned through the docs). You can always apply the generic solution: read the data as it arrives and make sense of it inside your Python script.

    One question first, though. Your input example suggests you are dealing with equally sized strings / JSON messages. Should we really be that lucky? If so, you could wait until at least that many bytes are available and read just the desired size into your buffer.
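If the messages really are fixed-size, a minimal sketch could look like the following. The 7-byte size is an assumption taken from the sniffed samples, and read_fixed_message is a hypothetical helper name; port is assumed to be an open serial.Serial (anything with a read(n) method works). PySerial's read(size) may return fewer bytes than requested when a timeout is set, so the helper keeps reading until the message is complete.

```python
MESSAGE_SIZE = 7  # assumption: every message is exactly 7 bytes, like {"0":1}


def read_fixed_message(port, size=MESSAGE_SIZE):
    """Read exactly `size` bytes from a port-like object with a read(n) method."""
    data = b''
    while len(data) < size:
        # Ask only for the bytes still missing; a read with a timeout
        # may hand back fewer bytes (or none), so loop until complete.
        chunk = port.read(size - len(data))
        data += chunk
    return data
```

In your loop this would replace the readline() call, for example print(read_fixed_message(serial_port).decode('ascii')). It stops working as soon as a value needs more digits, for example {"0":10}.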

    Otherwise a variation on your code:

    buffer = b''  # read() returns bytes
    while True:
        if serial_port.in_waiting > 0:
            buffer += serial_port.read(serial_port.in_waiting)
            try:
                end = buffer.index(b'}') + 1  # position just past the first '}'
            except ValueError:
                continue  # no complete message yet; go back and keep reading
            complete = buffer[:end]  # one message, up to and including '}'
            buffer = buffer[end:]  # leave the rest in the buffer
            print('buffer=', complete)
            text = complete.decode('ascii')  # decode the message, not the leftover
            print('ascii=', text)
    

    NOTE1: I presume serial_port.in_waiting could in theory change between the if and the read, but I also presume unread bytes just stay in the port's buffer, so we're fine.

    NOTE2: This approach is a bit naive and does not take into account that a single read could return two (or more) chunks of JSON; only the first one is extracted per pass.

    NOTE3: It also does not account for nested mappings in your JSON, if you have any.
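To address NOTE2 and NOTE3, one option is to track brace depth instead of searching for the first '}'. The following is only a sketch: split_json_objects is a hypothetical helper, and it assumes the stream consists of concatenated JSON objects (top-level values that start with '{'). It ignores braces inside string literals and returns every complete object plus the unfinished tail.

```python
def split_json_objects(buffer):
    """Split bytes into complete top-level JSON objects and a remainder."""
    objects = []
    depth = 0          # current nesting level of '{'
    in_string = False  # True while inside a "..." literal
    escaped = False    # True right after a backslash inside a string
    start = 0          # index where the current (or next) object begins
    for i, byte in enumerate(buffer):
        ch = chr(byte)
        if in_string:
            if escaped:
                escaped = False
            elif ch == '\\':
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == '{':
            if depth == 0:
                start = i
            depth += 1
        elif ch == '}' and depth > 0:
            depth -= 1
            if depth == 0:
                # A top-level object just closed: cut it out.
                objects.append(buffer[start:i + 1])
                start = i + 1
    return objects, buffer[start:]
```

In the loop above you would call objects, buffer = split_json_objects(buffer) after each read and then decode every entry in objects.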

    Hopefully it's still helpful. Bottom line: unless you are handling fixed-size inputs or have some other way to get pySerial to feed you content chunked as desired, you have to read the data in and process it in your script.


    UPDATE: To reflect the discussion in the comments.

    Your problem really is that you are just looking at a stream of bytes on the serial port. At that level there isn't any useful understanding of the data being passed through. You need a higher level (the application or a layer in between) to make sense of what is coming in; in other words, to parse the protocol that encapsulates the transferred data.

    As a matter of fact, if we know that what is passed through is a string (a bunch of bytes) representing JSON, which serves as the protocol (the way to encapsulate and represent the data structures), that can work, but the reassembly needs to happen above the raw serial communication. Our application (or a library / module) can read the raw serial data, make sense of it, and provide the result to the higher levels.
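As an illustration of that layering, here is a rough sketch of a small parsing layer that sits between the raw port and the application. JsonStreamParser and read_messages are hypothetical names, not part of PySerial. The sketch assumes ASCII-encoded JSON objects sent back to back and relies on json.JSONDecoder.raw_decode from the standard library to detect where one value ends. Malformed input is treated the same as incomplete input, so a real implementation would want a size limit or resynchronisation.

```python
import json


class JsonStreamParser:
    """Reassembles JSON values from arbitrary byte chunks (hypothetical helper)."""

    def __init__(self, encoding='ascii'):
        self._decoder = json.JSONDecoder()
        self._encoding = encoding
        self._pending = ''  # text received but not yet parsed

    def feed(self, chunk):
        """Add raw bytes; return the list of values completed by this chunk."""
        self._pending += chunk.decode(self._encoding)
        values = []
        while True:
            text = self._pending.lstrip()  # raw_decode rejects leading whitespace
            if not text:
                self._pending = ''
                break
            try:
                value, end = self._decoder.raw_decode(text)
            except json.JSONDecodeError:
                self._pending = text
                break  # incomplete so far; wait for more bytes
            values.append(value)
            self._pending = text[end:]
        return values


def read_messages(port, parser):
    """Yield parsed values from a port-like object with in_waiting and read()."""
    while True:
        # Read whatever is waiting, or block for at least one byte.
        chunk = port.read(port.in_waiting or 1)
        if chunk:
            yield from parser.feed(chunk)
```

The application then only sees whole messages, for example: for message in read_messages(serial_port, JsonStreamParser()): print(message).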