python base64 ucs2

I need to get the euro sign (€) converted to IKw=, which is supposed to be its base64 encoding


I have to send the euro sign (€) in an SMS. I was given these steps to do so:

  • Convert the € sign to hexadecimal, which is: 20AC
  • Encode the 20AC into base64, which should be: IKw=

But when I do so with any online tool I can find, I always get MjBBQw==, which is also what Python returns.

So I suppose I am missing some kind of character encoding step between the hexadecimal and the base64.

The Python code I have is as follows:

def encodeGSM7Message( text ):
     text = unicode( text, 'UTF-8' )
     hex_text = ''.join( [ hex( ord( c ) ).rstrip('L').lstrip('0x').upper() for c in text ] )
     return  base64.b64encode( hex_text )

print encodeGSM7Message( '€' ), 'IKw='

This should print IKw= IKw=, but it prints MjBBQw== IKw= instead.

As another example, they added Ñ to the string, so I also have an extra code line as follows:

print encodeGSM7Message( '€ÑÑ' ), 'IKwA0QDR'

But instead of printing IKwA0QDR IKwA0QDR, which is the expected behavior, it ends up printing MjBBQ0QxRDE= IKwA0QDR

Any idea about what I am missing, or what kind of unicode conversion should be done to get the expected result?


Solution

  • Try this:

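    # -*- coding: utf-8 -*-
    import base64
    
    def encodeGSM7Message(s):
      # s is a UTF-8 byte string (Python 2 str): decode it to unicode,
      # then re-encode as big-endian UTF-16 before base64-encoding
      return base64.b64encode( s.decode('utf8').encode('utf-16-be') )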
    
    euro = '€'
    
    print encodeGSM7Message(euro)
    

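    Note that the coding: utf-8 declaration makes the euro literal a UTF-8 encoded byte string, which is why we have to .decode('utf8') in the encodeGSM7Message routine before encoding to UTF-16 big-endian. The code in the question base64-encodes the ASCII characters of the hex string "20AC" instead of the two bytes 0x20 0xAC, which is where MjBBQw== comes from.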
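
For anyone on Python 3, here is a rough sketch of the same idea, with the two readings of "20AC" side by side. The function name encode_ucs2_message is made up for this illustration (it is not from the question), and it assumes the text only contains characters from the Basic Multilingual Plane, where UTF-16 big-endian and UCS-2 give the same bytes.

```python
import base64
import binascii


def encode_ucs2_message(text):
    """Base64-encode text as big-endian UTF-16 (hypothetical helper name).

    Assumes BMP-only characters, so each one becomes exactly two bytes.
    """
    return base64.b64encode(text.encode("utf-16-be")).decode("ascii")


# What the question's code effectively does: base64 of the four ASCII
# characters "2", "0", "A", "C".
ascii_hex = base64.b64encode(b"20AC").decode("ascii")

# What the steps intend: base64 of the two raw bytes 0x20 0xAC.
raw_bytes = base64.b64encode(binascii.unhexlify("20AC")).decode("ascii")

print(ascii_hex)                    # MjBBQw==
print(raw_bytes)                    # IKw=
print(encode_ucs2_message("€"))     # IKw=
print(encode_ucs2_message("€ÑÑ"))   # IKwA0QDR
```

Encoding straight to UTF-16 big-endian also keeps the leading zero byte of characters such as Ñ (00D1), which the hex-string approach in the question drops.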