Replace Enclosed Alphanumerics in Python

How can I replace characters such as 🅰, ①,Ⓐ to their 'normal' forms in python?

Note, however, I would still like accent character to remain the same (ç,á etc)

Hoping I don't need to make a dictionary.

More examples:

https://en.wikipedia.org/wiki/Enclosed_Alphanumerics

https://en.wikipedia.org/wiki/Enclosed_Alphanumeric_Supplement

http://xahlee.info/comp/unicode_circled_numbers.html

-=-=EDIT=--=-

For future people I used this, thanks AKX

(
         "".join(
             re.sub(r'([^\w])', '', unidecode.unidecode(ch)) #return only the word characters from the decode
                 if 
                         ord(ch) > 65535  #Emoticons
                     or (ord(ch) >= 9312 and ord(ch) <= 9472)       #circled
                     or (ord(ch) >= 65296 and ord(ch) <= 65305)     #latin numbers
                     or (ord(ch) >= 65313 and ord(ch) <= 65338)     #latin upper
                     or (ord(ch) >= 65345 and ord(ch) <= 9472)      #latin lower
                 else 
                     ch 
                 for ch in criteria
                   )
              )

Solution

There's no universal way, but happily someone else has made the dictionary for you: the unidecode library works for your particular characters.

$ pip install unidecode
Successfully installed unidecode-1.3.4
$ python3
>>> import unidecode
>>> unidecode.unidecode('🅰, ①,Ⓐ')
'[A], 1,A'
>>>

Unidecode will, however, also deburr accented characters, so you'll need a bit more to keep them.

For example, assuming all the strange characters are not in the Unicode basic plane, you can filter the string character by character:

>>> "".join(unidecode.unidecode(ch) if ord(ch) > 65535 else ch for ch in '🅰 Oh lá lá')
'[A] Oh lá lá'