Tags: python, encoding, bcd, ebcdic

Data encoding and decoding using Python


This is less a programming question than a conceptual one. I am not a CS major, and I am trying to understand the basic differences between these three formats:

1) EBCDIC 2) Unsigned binary number 3) Binary coded decimal

If this is not a real question, I apologize, but Google was not very useful in explaining this to me.

Say I have a string of digits like "12890". What would its representation be in EBCDIC, unsigned binary, and BCD format?

Is there a Python 2.6 library I can use to convert any string of digits to each of these formats?

For example, to convert a string to EBCDIC, I am doing:

def encodeEbcdic(text):
    return text.decode('latin1').encode('cp037')

print encodeEbcdic('AGNS')

But I get this: ┴╟╒Γ


Solution

  • saulspatz, thanks for your explanation. I was able to work out which methods are needed to convert a string of digits into each encoding. I had to refer to Effective Python, Chapter 1, Item 3: "Know the Differences Between bytes, str, and unicode".

    And from there on, I read more about data types and such.

    Anyway, to answer my own questions:

    1) String to EBCDIC:

    def encode_ebcdic(text):
        return text.decode('latin1').encode('cp037')
    

    The encoding here is cp037, the EBCDIC code page for the USA. You can use cp500 for the international variant. Here is a list of them: https://en.wikipedia.org/wiki/List_of_EBCDIC_code_pages_with_Latin-1_character_set
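
To make the result concrete, here is a minimal sketch of what the EBCDIC bytes look like for the strings in the question. It assumes the input is already a unicode string, so the `decode('latin1')` step is skipped; `to_hex` is a hypothetical helper added only for display. The last line hints at why the question's output looked garbled: the bytes are probably correct EBCDIC, but a console using code page 437 (an assumption about the asker's terminal) would render them as box-drawing characters.

```python
import binascii

def encode_ebcdic(text, codepage='cp037'):
    # Encode a unicode string into EBCDIC bytes (cp037 = US code page).
    return text.encode(codepage)

def to_hex(data):
    # Hypothetical display helper: show raw bytes as a hex string.
    return binascii.hexlify(data).decode('ascii')

digits_hex = to_hex(encode_ebcdic(u'12890'))   # digits map to 0xF0..0xF9
letters_hex = to_hex(encode_ebcdic(u'AGNS'))
print(digits_hex)    # f1f2f8f9f0
print(letters_hex)   # c1c7d5e2

# Assumption: a cp437 console would display the same four bytes as the
# box-drawing characters reported in the question.
as_cp437 = encode_ebcdic(u'AGNS').decode('cp437')
```

Viewing the bytes as hex rather than printing them directly avoids depending on how the terminal interprets them.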

    2) Hexadecimal String to unsigned binary number :

    def str_to_binary(text):
        # Parse the hexadecimal string as an integer
        return int(text, 16)
    

    This is pretty basic: it just converts the hexadecimal string to an integer.
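
Note that the question's "12890" is a decimal string, so base 10 is probably what is wanted there; reading the same digits as base 16 gives a different number. The sketch below shows both, plus one way to get the actual bytes with `struct`. The function name `decimal_str_to_uint16` and the choice of a 2-byte big-endian field are assumptions for illustration, not something the answer specifies.

```python
import binascii
import struct

def decimal_str_to_uint16(text):
    # Parse a decimal digit string and pack it as a 2-byte, big-endian
    # unsigned integer (assumed field size; valid for 0..65535).
    return struct.pack('>H', int(text, 10))

value = int('12890', 10)            # 12890
as_hex_input = int('12890', 16)     # 75920: same digits read as hexadecimal
bits = bin(value)                   # '0b11001001011010'
packed = binascii.hexlify(decimal_str_to_uint16('12890')).decode('ascii')
print(bits)
print(packed)                       # 325a
```

A different field width or byte order would need a different `struct` format string, for example `'<I'` for a 4-byte little-endian value.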

    3) Hexadecimal string to Binary coded decimal:

    def str_to_bcd(text):
        # Python 2 only: pack each pair of digits into one byte
        return bytes(text).decode('hex')
    

    Yes, you need to convert it to a byte string so that the BCD conversion can take place. Please read saulspatz's answer for an explanation of BCD encoding.
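
As a small illustration of packed BCD, where each decimal digit occupies one 4-bit nibble, the sketch below encodes "12890". Since `decode('hex')` exists only in Python 2 and rejects odd-length input, this version uses `binascii.unhexlify` and pads with a leading zero. The padding convention and the name `str_to_packed_bcd` are assumptions; real record layouts may pad differently or add a sign nibble.

```python
import binascii

def str_to_packed_bcd(digits):
    # Reject anything that is not a plain string of decimal digits.
    if not digits.isdigit():
        raise ValueError('expected decimal digits only')
    # Assumed convention: left-pad with '0' so the digits fill whole bytes.
    if len(digits) % 2:
        digits = '0' + digits
    # Each pair of digits becomes one byte, e.g. '28' -> 0x28.
    return binascii.unhexlify(digits)

bcd = str_to_packed_bcd('12890')
bcd_hex = binascii.hexlify(bcd).decode('ascii')
print(bcd_hex)   # 012890
```

The three bytes 0x01 0x28 0x90 read as the original digits when viewed in hex, which is the defining property of BCD.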