Tags: python, unicode, diacritics, lowercase, case-folding

Python: lower() and German umlauts


I have a problem with converting uppercase letters with umlauts to lowercase ones.

print("ÄÖÜAOU".lower())

The A, O and U get converted properly, but the Ä, Ö and Ü stay uppercase. Any ideas?

The first problem is fixed with .decode('utf-8'), but I still have a second one:

# -*- coding: utf-8 -*-
original_message="ÄÜ".decode('utf-8')
original_message=original_message.lower()
original_message=original_message.replace("ä", "x")
print(original_message)

Traceback (most recent call last):
  File "Untitled.py", line 4, in <module>
    original_message=original_message.replace("ä", "x")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 0: ordinal not in range(128)


Solution

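  • In Python 2, a plain "..." literal is a byte string, and lower() on it by default only converts ASCII letters. You'll need to mark it as a unicode string with the u prefix unless you're working with plain ASCII: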

    > print(u"ÄÖÜAOU".lower())
    
    äöüaou
    

    It works the same with variables. What matters is the type of the value the variable holds: a byte string or a unicode string.

    > olle = "ÅÄÖABC"
    > print(olle.lower())
    ÅÄÖabc
    
    > olle = u"ÅÄÖABC"
    > print(olle.lower())
    åäöabc
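
The same reasoning applies to the second problem in the question. After .decode('utf-8') the message is a unicode string, but "ä" in the replace() call is still a byte string, so Python 2 tries to decode it as ASCII and fails on byte 0xc3. A minimal sketch of one way around it, assuming the input arrives as UTF-8 bytes (the variable names raw, lowered and replaced are illustrative, not from the original post):

```python
# -*- coding: utf-8 -*-
# Sketch only: names below are illustrative, not from the original post.

# UTF-8 bytes for "ÄÜ", as they might arrive from a file or a socket.
raw = b"\xc3\x84\xc3\x9c"

# Decode once at the boundary so the rest of the code handles text, not bytes.
original_message = raw.decode("utf-8")

# lower() on a unicode string handles non-ASCII letters such as Ä and Ü.
lowered = original_message.lower()

# Both arguments are unicode literals, so no implicit ASCII decoding happens.
replaced = lowered.replace(u"ä", u"x")

# Compare against the expected text using escapes to stay terminal-independent.
print(replaced == u"x\u00fc")  # prints True
```

The u prefix is also accepted by Python 3.3 and later, where every string literal is already unicode, so this sketch should behave the same on both versions.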