Case-folding user input so it matches a secret word in any capitalization


I am creating a guess-the-word game. The player may type secret_word in any mix of upper- and lowercase letters, so how do I write the program so that every variation of secret_word is recognized?

In this case the secret word is "Korea". Is there a way to unify all the variations, or do I have to list every possible one?

secret_word = {"korea", "kOrea", "KoRea", "KoReA", "KOrea", "KORea", "Korea", "KOREA"}
guess = ""
guess_count = 0
guess_limit = 3
out_of_guesses = False

while guess != secret_word and not (out_of_guesses):
    if guess_count < guess_limit:
        guess = input("Guess a word: ")
        guess_count += 1
    else:
        out_of_guesses = True

if out_of_guesses:
    print("Maybe Next time! You are out of guesses")
else:
    print("You win!")

Solution

  • In short: case-insensitive comparison is a much harder problem than it appears at first sight. The str.casefold() method [python-doc] is designed to produce a string suitable for such comparisons.

    You check whether the .casefold() of the entered string equals the .casefold() of the word to guess, like:

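    secret_word = 'korea'
    guess_count = 0
    guess_limit = 3
    
    # Keep asking until the player runs out of guesses.
    while guess_count < guess_limit:
        guess = input('Guess a word: ')
        # Compare the case-folded forms, so 'KOREA', 'kOrea', ... all match.
        if guess.casefold() == secret_word.casefold():
            break
        else:
            guess_count += 1
    
    # The loop was left early (via break) only if a guess matched.
    if guess_count < guess_limit:
        print('You win')
    else:
        print('You lose')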
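
The same logic can be sketched as two small functions, which makes it easier to try out without typing at a prompt. This is only an illustration: the names is_match and play are hypothetical, and play takes a list of guesses instead of calling input().

```python
def is_match(guess: str, secret_word: str) -> bool:
    """Return True when both strings are equal after case folding."""
    return guess.casefold() == secret_word.casefold()


def play(guesses, secret_word: str = "korea", guess_limit: int = 3) -> bool:
    """Return True if one of the first `guess_limit` guesses matches."""
    for attempt, guess in enumerate(guesses):
        if attempt >= guess_limit:
            # Out of guesses: later entries are ignored.
            break
        if is_match(guess, secret_word):
            return True
    return False


print(play(["Japan", "KOREA"]))        # True
print(play(["a", "b", "c", "Korea"]))  # False (the match comes too late)
```

Keeping the comparison in its own function means the folding rule lives in one place if it ever needs to change.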

    The Unicode standard defines case folding so that the resulting strings can be used for case-insensitive comparison. For example, the German eszett ß [wiki] has the uppercase form SS, so comparing the lowercased strings fails:

    >>> 'ß'.lower()
    'ß'
    >>> 'ß'.upper()
    'SS'
    >>> 'SS'.lower()
    'ss'
    >>> 'ß'.lower() == 'SS'.lower()
    False
    

    whereas .casefold() maps both of them to ss:

    >>> 'ß'.casefold()
    'ss'
    >>> 'ss'.casefold()
    'ss'
    >>> 'ß'.casefold() == 'SS'.casefold()
    True
    

    Case-insensitive comparison turns out to be a hard problem, since certain characters have no single upper- or lowercase equivalent, and, as ß shows, converting to uppercase and back does not always return the original string.
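
As a quick check, applied to the set from the question, a short sketch like the following shows that every listed spelling folds to the same string, so the set itself is not needed. The Greek sigma lines are an extra illustration that is not part of the original question; the variable names are arbitrary.

```python
variations = {"korea", "kOrea", "KoRea", "KoReA", "KOrea", "KORea", "Korea", "KOREA"}

# Every spelling from the question folds to the same string.
folded = {word.casefold() for word in variations}
print(folded)  # {'korea'}

# Greek has two lowercase sigmas: 'ς' (word-final) and 'σ'.
# Both are already lowercase, so .lower() leaves them different,
# while .casefold() maps the final form to the regular one.
final_sigma, sigma = "ς", "σ"
print(final_sigma.lower() == sigma.lower())        # False
print(final_sigma.casefold() == sigma.casefold())  # True

# Characters without any case (digits, punctuation) are left unchanged.
print("123!".casefold())  # 123!
```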