I'm trying to replace Django's built-in common-passwords.txt.gz file, which supposedly contains the top 1,000 common passwords, with my own version in the same format containing the top 10,000 common passwords for my country, but I've run into some rather strange behaviour.
Firstly, I directly substituted Django's common-passwords.txt.gz file (4KB) with my own (34KB), a gzipped .txt file with the same UTF-8 encoding as Django's, then restarted the test server. When I change a user's password to "password", it does not raise the expected error, as it does with Django's own common passwords file.
The first line of both the built-in password list and my new one begins 123456password12345678qwerty123456789... so it clearly should.
When I append a few extra passwords to Django's common-passwords file, it works as it should and raises an error if I try to use them as passwords, so I don't think the list is cached somewhere or anything like that.
Is there some kind of built-in file size limit for the common password list, or for the gzip.open(password_list_path).read().decode('utf-8').splitlines() call?
Secondly, trying to figure out the above led me to a strange bug. Using Django's built-in common-passwords.txt.gz (of which the first line starts 123456password12345678qwerty123456789...) successfully raises a validation error for "password" and "password1", but not for "password12" or "password123"!
As I read it, the Django validation code basically checks whether the submitted password appears among the lines of the common passwords file, and I cannot find any code that exempts passwords above a certain length from the validation. Am I missing something, or is this a bug?
The common password validator in Django 1.9 is found in \venv\Lib\site-packages\django\contrib\auth\password_validation.py; the relevant class is below:

    class CommonPasswordValidator(object):
        """
        Validate whether the password is a common password.

        The password is rejected if it occurs in a provided list, which may be gzipped.
        The list Django ships with contains 1000 common passwords, created by Mark Burnett:
        https://xato.net/passwords/more-top-worst-passwords/
        """
        DEFAULT_PASSWORD_LIST_PATH = os.path.join(
            os.path.dirname(os.path.realpath(upath(__file__))), 'common-passwords.txt.gz'
        )

        def __init__(self, password_list_path=DEFAULT_PASSWORD_LIST_PATH):
            try:
                common_passwords_lines = gzip.open(password_list_path).read().decode('utf-8').splitlines()
            except IOError:
                with open(password_list_path) as f:
                    common_passwords_lines = f.readlines()

            self.passwords = {p.strip() for p in common_passwords_lines}

        def validate(self, password, user=None):
            if password.lower().strip() in self.passwords:
                raise ValidationError(
                    _("This password is too common (it would be trivial to crack!)"),
                    code='password_too_common',
                )

        def get_help_text(self):
            return _("Your password can't be a commonly used password.")

Finally got to the bottom of this!
There is some kind of invisible, unrendered character between the passwords in Django's built-in common passwords file, rather than ordinary newlines. This explains both issues I encountered.
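A quick way to check a list for this kind of problem is to load it the same way the validator does and look at the result with repr(), which escapes non-printable characters. This is only a minimal sketch: inspect_password_list is a made-up helper name, not part of Django, and which separator a given file actually uses is something you would have to find out by running it.

```python
import gzip


def inspect_password_list(path, preview_chars=40):
    """Load a gzipped list like the validator does and summarise what came out.

    Returns (number_of_lines, repr_of_first_line_prefix). inspect_password_list
    is a hypothetical helper for debugging, not a Django function.
    """
    with gzip.open(path) as f:
        lines = f.read().decode('utf-8').splitlines()
    # repr() shows escapes such as '\x00' or '\t' instead of rendering them,
    # so a separator that is invisible on screen becomes visible here.
    first = lines[0][:preview_chars] if lines else ''
    return len(lines), repr(first)
```

If the line count is far lower than the number of passwords you expect, or the preview shows several passwords joined by an escape sequence, the file is not being split into one password per line.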
I changed my top 10k common passwords file to use the usual newline characters between passwords instead, and now it all works great! Even though there are now 10 times as many passwords to compare against, validation is still pretty much instantaneous!
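For anyone building their own list, something along these lines should produce a file in the newline-separated layout described above. It is a rough sketch, not Django code: write_password_list and load_password_set are hypothetical helper names, and load_password_set only mirrors the loading step from the validator's __init__ quoted in the question.

```python
import gzip


def write_password_list(path, passwords):
    """Write passwords to a gzipped UTF-8 file, one per line (hypothetical helper)."""
    with gzip.open(path, 'wb') as f:
        f.write('\n'.join(passwords).encode('utf-8'))


def load_password_set(path):
    """Mirror the validator's loading step: split into lines, strip, build a set."""
    with gzip.open(path) as f:
        lines = f.read().decode('utf-8').splitlines()
    return {p.strip() for p in lines}
```

Because the passwords end up in a Python set, the `in` check is a hash lookup rather than a scan of the whole list, which is why going from 1,000 to 10,000 entries makes no noticeable difference to validation time.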
I've uploaded my 10,000 most common passwords file to GitHub for anyone who encounters this issue in future, or who just wants to improve Django's built-in common password validation: https://github.com/timboss/Django-Common-Password-Validation/
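As an alternative to overwriting the file inside site-packages, the validator's password_list_path argument (visible in the __init__ quoted in the question) can be supplied from settings through the validator's OPTIONS. The fragment below is a sketch for settings.py; the path and file name are placeholders, not real locations.

```python
import os

# Hypothetical location of the custom list; replace with a real path in your project.
CUSTOM_PASSWORD_LIST = os.path.join('path', 'to', 'my-common-passwords.txt.gz')

AUTH_PASSWORD_VALIDATORS = [
    {
        'NAME': 'django.contrib.auth.password_validation.CommonPasswordValidator',
        # OPTIONS are passed to the validator's __init__ as keyword arguments.
        'OPTIONS': {'password_list_path': CUSTOM_PASSWORD_LIST},
    },
]
```

This keeps the custom list in your own project, so it is not lost when Django is upgraded or the virtualenv is rebuilt.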