python sockets network-programming tcp

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x95 in position 3: invalid start byte (Python) socket programming


I am getting the following error for my code:

 bytes_length = int(len_message.decode())
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x95 in position 3: invalid start byte

client.py

import socket
import threading
import os

def Main():
    host = '127.0.0.1' #IP Address of the system which has the software sending TCP data
    port = 2055 #TCP port number used by the software
    s = socket.socket()
    s.connect((host,port))

    i = 0
    #for i < 2:
    len_message = s.recv(4)
    print(len_message)
    while len_message:
        bytes_length = int(len_message.decode())
        #data_length = (bytes_length + 3)
        print(bytes_length)
        #print(data_length)
        data = s.recv(bytes_length)
        print(data)
        write_file(data)
        len_message = s.recv(4)
    #i+=1
    s.close()

def write_file(data):
        with open("Output.txt", "ab") as text_file:
            text_file.write(data)
            text_file.write('\n'.encode())


if __name__ == '__main__':
    Main()

len_message is b'\x00\x00\x06\x95', which is what I am trying to decode. I don't understand why I am getting this error. Please help. Thanks in advance.


Solution

  • len_message.decode() tries to interpret the 4 bytes as UTF-8 text, and 0x95 is not a valid first byte of a UTF-8 character, hence the error. From your context, the 4 bytes are not text at all: they appear to be the raw binary representation of an integer.

    You can get the bytes length as follows:

    bytes_length = int.from_bytes(len_message, 'big')  # assuming big-endian byte order

    For the bytes you show, bytes_length would be 1685 (0x0695 = 6 * 256 + 149).
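    One further point worth checking: s.recv(n) may return fewer than n bytes, so a single recv(4) or recv(bytes_length) is not guaranteed to deliver a whole header or message. Below is a minimal sketch of a receive loop that handles this. It assumes the prefix is a 4-byte big-endian unsigned integer that counts only the payload bytes, which the question does not confirm. The names recv_exact and recv_message are hypothetical helpers, not part of the socket module.

```python
import socket
import struct


def recv_exact(sock, n):
    """Read exactly n bytes from sock, or return None if the peer closes first."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)  # may return fewer bytes than requested
        if not chunk:  # empty bytes object: the connection was closed
            return None
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_message(sock):
    """Read one message framed by a 4-byte big-endian length prefix (assumed format)."""
    header = recv_exact(sock, 4)
    if header is None:
        return None
    # ">I" is a big-endian unsigned 32-bit integer; this yields the same
    # value as int.from_bytes(header, "big").
    (length,) = struct.unpack(">I", header)
    return recv_exact(sock, length)


if __name__ == "__main__":
    # Local demonstration over a connected socket pair instead of a real server.
    sender, receiver = socket.socketpair()
    payload = b"x" * 1685
    sender.sendall(struct.pack(">I", len(payload)) + payload)
    sender.close()
    message = recv_message(receiver)
    print(len(message))
    receiver.close()
```

    In the client from the question, the loop body would then call recv_message(s) repeatedly and stop when it returns None, instead of calling s.recv directly.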