bytes.decode() in Python2 and Python3

In the SQLAlchemy source code I see the following:

    val = cursor.fetchone()[0]
    if util.py3k and isinstance(val, bytes):
        val = val.decode()

Why is decode() called only on Python 3 and not on Python 2?


  • In Python 3, "normal" strings are Unicode, whereas in Python 2 they are byte strings (ASCII or an extended 8-bit encoding, often loosely called ANSI). According to [Python 3.Docs]: Unicode HOWTO - The String Type:

    Since Python 3.0, the language’s str type contains Unicode characters, meaning any string created using "unicode rocks!", 'unicode rocks!', or the triple-quoted string syntax is stored as Unicode.


    • Python 3:

      >>> import sys
      >>> sys.version
      '3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)]'
      >>> b = b"abcd"
      >>> s = "abcd"
      >>> u = u"abcd"
      >>> type(b), type(s), type(u)
      (<class 'bytes'>, <class 'str'>, <class 'str'>)
      >>> b.decode()
      'abcd'
      >>> s.decode()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      AttributeError: 'str' object has no attribute 'decode'
      >>> u.decode()
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
      AttributeError: 'str' object has no attribute 'decode'
    • Python 2:

      >>> import sys
      >>> sys.version
      '2.7.10 (default, Mar  8 2016, 15:02:46) [MSC v.1600 64 bit (AMD64)]'
      >>> b = b"abcd"
      >>> s = "abcd"
      >>> u = u"abcd"
      >>> type(b), type(s), type(u)
      (<type 'str'>, <type 'str'>, <type 'unicode'>)
      >>> b.decode()
      u'abcd'
      >>> s.decode()
      u'abcd'
      >>> u.decode()
      u'abcd'

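    val is then passed on (to _parse_server_version) as a str. In Python 3, bytes and str are distinct types, so the value has to be decoded first. In Python 2, bytes is just an alias for str (see the type() output above), so val is already a str and no conversion is needed.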

    You could also check [SO]: Passing utf-16 string to a Windows function (@CristiFati's answer).
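
To make the reasoning above concrete, here is a minimal sketch of the same guard outside SQLAlchemy. The names PY3 and to_native_str are hypothetical (PY3 only stands in for SQLAlchemy's util.py3k flag), the explicit "utf-8" argument is an assumption (the original calls decode() with no argument), and the byte string is a made-up sample value rather than real driver output.

```python
import sys

# Hypothetical stand-in for SQLAlchemy's util.py3k flag.
PY3 = sys.version_info[0] >= 3


def to_native_str(val, encoding="utf-8"):
    """Return val as the interpreter's native str type (illustrative helper)."""
    # Python 3: bytes and str are distinct types, so bytes must be decoded.
    # Python 2: bytes is an alias for str, so val is returned unchanged.
    if PY3 and isinstance(val, bytes):
        return val.decode(encoding)
    return val


# Made-up sample value standing in for cursor.fetchone()[0].
version = to_native_str(b"8.0.31")
assert isinstance(version, str)
```

Values that are already str pass through untouched, so the helper is safe to call on either kind of input.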