
Django accepting Chinese pseudo letters in email fields


There is a set of letters (Ａ-Ｚａ-ｚ) that Chinese users sometimes type where we expect ASCII letters, but they are actually distinct characters defined in Unicode. Take a look at this sample email address:

from django.core.exceptions import ValidationError
from django.core.validators import validate_email

# The domain part below is typed with fullwidth letters, not ASCII.
email = u'dummy@ｒａｙｓｆｉｒｓｔ.com'

try:
    validate_email(email)
except ValidationError as e:
    print("oops! wrong email")
else:
    print("hooray! email is valid")

Sure, we can read the address. However, such an email address causes a lot of trouble in various scenarios: typical email servers do not seem able to handle such characters. Is this a Django bug? What's the best way to detect such letters in Python? Or, even better, is there a flag in Django to forbid such letters in validate_email?

Update: In the meantime, I found out that such characters are probably allowed in email addresses; however, support for them is patchy and they cause a lot of trouble. Even real Chinese/Japanese/Korean characters, as well as umlauts, are allowed by definition. So, technically, it doesn't look like a Django bug, although it is very inconvenient at the moment.


Solution

  • In my experience, the IME used to enter Chinese characters is easily switched into "full width" mode, which causes fullwidth Latin characters to be entered. You could use str.translate to convert them back to their regular-width equivalents, but as you pointed out, the fullwidth characters could be valid:

    #coding:utf8
    import unicodedata as ud
    
    # Build a translation table of fullwidth to non-fullwidth characters.
    table = {}
    for i in range(65536):
        try:
            name = ud.name(chr(i))
            if name.startswith('FULLWIDTH '):
                other = ud.lookup(name[10:])
                table[i] = ord(other)
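        except (ValueError, KeyError):
            # ud.name raises ValueError for unnamed code points;
            # ud.lookup raises KeyError if the stripped name does not exist.
            pass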
    
    email = u'dummy@ｒａｙｓｆｉｒｓｔ.com'
    print(email)
    print(email.translate(table))
    

    Output:

    dummy@ｒａｙｓｆｉｒｓｔ.com
    dummy@raysfirst.com
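
The question also asks how to detect such letters. Here is a minimal sketch, assuming Python 3 and only the standard library. The function names (find_fullwidth, has_non_ascii, fold_fullwidth) are hypothetical helpers, not part of Django, and the sample address, with fullwidth letters in the domain written as escapes, is an assumption for illustration. It relies on unicodedata.east_asian_width, which reports 'F' for fullwidth characters, and on NFKC normalization, which maps fullwidth forms to their ordinary counterparts. Be aware that NFKC also rewrites other compatibility characters, so it is broader than the translation table above.

```python
import unicodedata


def find_fullwidth(text):
    """Return the characters in text that Unicode classifies as fullwidth ('F')."""
    return [ch for ch in text if unicodedata.east_asian_width(ch) == 'F']


def has_non_ascii(text):
    """Return True if text contains any character outside the ASCII range."""
    return any(ord(ch) > 127 for ch in text)


def fold_fullwidth(text):
    """Normalize with NFKC, which folds fullwidth forms to regular-width ones.

    Caveat: NFKC also changes other compatibility characters (e.g. ligatures).
    """
    return unicodedata.normalize('NFKC', text)


# Hypothetical sample: 'raysfirst' in the domain is written in fullwidth letters.
email = 'dummy@\uff52\uff41\uff59\uff53\uff46\uff49\uff52\uff53\uff54.com'

print(find_fullwidth(email))   # the nine fullwidth letters
print(has_non_ascii(email))    # True
print(fold_fullwidth(email))   # dummy@raysfirst.com
```

If you would rather reject such input than silently rewrite it, a check like has_non_ascii could be called from your own form or field validation and raise ValidationError there. Whether to reject or normalize depends on whether you need to accept internationalized addresses at all.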