Python ord() not working as expected


I am working on a move-to-front encoding/decoding assignment for a software engineering course, and the built-in ord() function in Python 3.3 seems to return the wrong value at one point in my code.

To encode a code number from 1 to 120, we simply add 128 to it, which gives a single byte from 0x81 to 0xF8. For code numbers from 121 to 375 we use two bytes: the first is F9, which signals that the following byte is part of the code number, and the second is the code number minus 121. So, for example, 121 would be F9 00.
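For reference, the encoding side of this scheme can be sketched as below. This is only an illustration: `encode_num` and `as_hex` are hypothetical names, and the second byte is assumed to be the code number minus 121, which is what the F9 00 example for 121 implies.

```python
def encode_num(code):
    """Return the encoded bytes for one code number (hypothetical helper)."""
    if 1 <= code <= 120:
        # One-byte form: 1..120 maps to 0x81..0xF8.
        return bytes([code + 128])
    if 121 <= code <= 375:
        # Two-byte form: the F9 marker, then the offset from 121 (assumed).
        return bytes([0xF9, code - 121])
    raise ValueError("code number out of range: %d" % code)


def as_hex(data):
    """Format bytes the way a hexdump shows them, e.g. 'F9 0D'."""
    return " ".join("%02X" % b for b in data)


print(as_hex(encode_num(120)))  # F8
print(as_hex(encode_num(121)))  # F9 00
print(as_hex(encode_num(134)))  # F9 0D
```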

When decoding, after reading F9 and moving on to the second byte, I run into a problem with the ord function.

My code is:

def decode_num(base_num, input_file):
    if base_num <= 248:
        # One-byte code: code numbers 1-120 are stored as 129-248 (code + 128).
        return base_num - 128
    elif base_num == 249:
        # Two-byte code: the first byte is the F9 (249) marker and the
        # second byte holds the code number minus 121.
        second_byte = ord(input_file.read(1))
        return second_byte + 121

It works fine until it reaches the encoding for 134, which should be F9 0D. The ord(input_file.read(1)) call returns 10 instead of the expected 13. I have confirmed with a hexdump that the mtf file I am trying to decode does contain F9 0D at the point where the problem occurs. In my current test case it only happens when 0D is the second byte of a two-byte code; 0C and below work fine, and so do 0E and above.
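The symptom can be reproduced in isolation with a sketch like the following. The scratch file and the exact open() call are assumptions made for illustration (a text-mode latin-1 reader with default settings), not the code from the assignment.

```python
import os
import tempfile

# Write the two bytes F9 0D to a scratch file (hypothetical test data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xf9\x0d")
    path = tmp.name

# Read it back one character at a time in text mode (assumed open() call).
with open(path, encoding="latin-1", mode="r") as input_file:
    first = ord(input_file.read(1))
    second = ord(input_file.read(1))

os.remove(path)
print(first, second)  # 249 10
```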

Any ideas about what could be causing this? Or alternative approaches to decoding the two-byte code number?

Edit: I forgot to mention that the mtf files are encoded in latin-1, if that makes a difference.


Solution

  • I found the cause of my issue. It comes from how Python handles line endings in text mode: by default it translates '\r' to '\n' when reading, so '\r' and '\n' come out identical. That is why reading 0x0d gave me the same result as reading 0x0a.

    I was able to resolve the issue by passing newline="" when opening the input file:

    input_file = open(input_name, encoding="latin-1", mode="r", newline="")
    

    Thanks for the help with the issue. That was the only problem; my code is behaving as expected now.
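As an alternative to the fix above, the file could be opened in binary mode, where no newline translation or character decoding happens at all. The sketch below is one possible approach, not the assignment's required one: `decode_num_binary` is a hypothetical name, and an `io.BytesIO` stream stands in for a file opened with mode "rb".

```python
import io


def decode_num_binary(base_num, input_file):
    """Decode one code number from a stream opened in binary mode."""
    if 129 <= base_num <= 248:
        # One-byte code: stored as code number + 128.
        return base_num - 128
    if base_num == 249:
        # read(1) returns a bytes object of length 1; indexing it gives an
        # int directly, so ord() is not needed.
        return input_file.read(1)[0] + 121
    raise ValueError("unexpected first byte: %d" % base_num)


stream = io.BytesIO(b"\xf9\x0d")  # stands in for open(input_name, "rb")
base_num = stream.read(1)[0]
print(decode_num_binary(base_num, stream))  # 134
```

Because every byte arrives unchanged in binary mode, 0x0D stays 13 regardless of platform or newline settings.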