
When is `string.swapcase().swapcase()` not equal to `string`?


The documentation for the str.swapcase() method says:

Return a copy of the string with uppercase characters converted to lowercase and vice versa. Note that it is not necessarily true that s.swapcase().swapcase() == s.

I can't think of an example where s.swapcase().swapcase() != s. Can anyone think of one?


Solution

  • A simple example would be:

    s = "ß"
    
    print(s.swapcase().swapcase())
    

    Output:

    ss
    

    ß is the German lowercase sharp s, or Eszett (the "correct" uppercase version would be ẞ). The round trip fails because the Unicode standard defines the uppercase mapping of ß as SS, so the first swapcase() produces SS and the second lowercases each S to s:

    # The data in this file, combined with
    # the simple case mappings in UnicodeData.txt, defines the full case mappings
    # Lowercase_Mapping (lc), Titlecase_Mapping (tc), and Uppercase_Mapping (uc).
    
    ...
    
    # The entries in this file are in the following machine-readable format:
    #
    # <code>; <lower>; <title>; <upper>; (<condition_list>;)? # <comment>
    
    ...
    
    # The German es-zed is special--the normal mapping is to SS.
    # Note: the titlecase should never occur in practice. It is equal to titlecase(uppercase(<es-zed>))
    
    00DF; 00DF; 0053 0073; 0053 0053; # LATIN SMALL LETTER SHARP S
    

    (00DF is ß, 0053 is S, and 0073 is s)
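To make the two steps visible, here is a minimal sketch that prints the intermediate string and the code points at each stage. It assumes Python 3.3 or later, where the str case methods apply the full case mappings; the variable names and the two extra characters tried at the end (the ﬁ ligature and the dotless ı) are my own choices for illustration, not part of the quoted file excerpt.

```python
import unicodedata

s = "ß"

# First swap: ß is lowercase, so it is uppercased with the full mapping "SS".
once = s.swapcase()
# Second swap: each "S" is lowercased on its own, giving "ss" rather than "ß".
twice = once.swapcase()

print(repr(once), repr(twice), twice == s)

# Show the code points involved at each stage.
for label, text in (("original", s), ("swapped once", once), ("swapped twice", twice)):
    points = ", ".join(f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text)
    print(f"{label}: {points}")

# Other characters (chosen for illustration) that do not survive the round trip.
others = {ch: ch.swapcase().swapcase() for ch in ("ﬁ", "ı")}
print(others)
```

The same thing happens whenever the uppercase mapping is not reversible: the ligature ﬁ becomes FI and then fi, and the dotless ı becomes I and then the ordinary dotted i.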