python string encoding

Represent a string as an integer in Python


I would like to be able to represent any string as a unique integer: every integer should correspond to at most one string, and a given string should always produce the same integer.

Obviously, that is how the computer already works: it represents the string 'Hello' (for example) as one number per character, specifically one byte each (assuming ASCII encoding).

But I would like to perform arithmetic on that number (specifically, encrypt it using RSA).

This gets messy because a slightly longer string, such as 'I am an average length string', has more characters (29 in this case), and a 29-byte integer could be HUGE, maybe too large for the computer to handle (especially with even longer strings?).

Basically, my question is: how can I do this? I would rather not use any module for RSA; it's a task I would like to implement myself.


Solution

  • Here's how to turn a string into a single number. As you suspected, the number will get very large, but Python can handle integers of arbitrary size. The usual way of working with encryption is to do individual bytes all at once, but I'm assuming this is only for a learning experience. This assumes a byte string; if you have a Unicode string, you can encode it to UTF-8 first.

    num = 0
    for ch in my_string:
        # Parentheses are required: + binds tighter than <<.
        num = (num << 8) + ord(ch)
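
For Python 3, the same idea can be sketched with the built-in `int.from_bytes` and `int.to_bytes`, which also makes it easy to get the string back. The helper names `string_to_int` and `int_to_string` are made up for this illustration, and UTF-8 is an assumed encoding choice:

```python
def string_to_int(text, encoding="utf-8"):
    """Pack the encoded bytes of `text` into one big-endian integer."""
    data = text.encode(encoding)
    return int.from_bytes(data, byteorder="big")


def int_to_string(num, encoding="utf-8"):
    """Inverse of string_to_int: unpack the integer back into text."""
    length = (num.bit_length() + 7) // 8  # number of bytes needed to hold num
    return num.to_bytes(length, byteorder="big").decode(encoding)


message = "I am an average length string"
number = string_to_int(message)
print(number.bit_length())               # size of the integer in bits
print(int_to_string(number) == message)  # round trip check
```

One caveat for the uniqueness requirement: leading NUL characters are lost, because '\x00a' and 'a' both pack to the same integer. If such strings can occur, one possible workaround is to prepend a fixed non-zero byte before converting and strip it after decoding.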