Tags: python, mysql, phpmyadmin, large-data, large-files

How do I split a combo list in a large text file?


My problem is that I have a very large list of emails and passwords that I need to load into a MySQL database.

The .txt file format is something like this:

emailnumberone@gmail.com:password1
emailnumbertwo@gmail.com:password2
emailnumberthree@gmail.com:password3
emailnumberfour@gmail.com:password4
emailnumberfive@gmail.com:password5

My idea is to write a loop that reads each line into a variable, finds the ":", sends the text before it to the database, and then does the same with the text after it. How do I do this?


Solution

  • Short program with some error handling:

    Create demo data file:

    t = """
    emailnumberone@gmail.com:password1
    emailnumbertwo@gmail.com:password2
    emailnumberthree@gmail.com:password3
    emailnumberfour@gmail.com:password4
    emailnumberfive@gmail.com:password5
    k
    : """
    
    with open("f.txt","w") as f: f.write(t)
    

    Parse data / store:

    def store_in_db(email,pw):
        # replace with db access code 
        # see    http://bobby-tables.com/python
        # for parametrized db code in python (or the API of your choice)
        print("stored: ", email, pw)
    
    
    with open("f.txt") as r:
        for line in r:
            line = line.rstrip("\n")  # drop the newline so it is not stored with the password
            if line.strip():  # weed out empty lines
                try:
                    email, pw = line.split(":", 1)  # even if : in pw: only split at 1st :
                    if email.strip() and pw.strip():  # only if both filled
                        store_in_db(email, pw)
                    else:
                        raise ValueError("Something is empty: '" + line + "'")
                except ValueError as ex:  # also raised when a line has no ':'
                    print("Error: ", line, ex)
    

    Output:

    stored:  emailnumberone@gmail.com password1
    stored:  emailnumbertwo@gmail.com password2
    stored:  emailnumberthree@gmail.com password3
    stored:  emailnumberfour@gmail.com password4
    stored:  emailnumberfive@gmail.com password5
    Error:  k not enough values to unpack (expected 2, got 1)
    Error:  :  Something is empty: ': '
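
The `store_in_db` placeholder above only prints. As a rough sketch of what a parametrized version could look like, the snippet below uses Python's built-in `sqlite3` as a stand-in so that it runs without a server. The table name `accounts`, its columns, and the extra `conn` argument are assumptions made for illustration. A MySQL driver would need its own connection setup, and many of them use `%s` rather than `?` as the placeholder.

```python
import sqlite3


def store_in_db(conn, email, pw):
    # The values are passed separately from the SQL text, so quotes or
    # SQL fragments inside a password cannot change the statement.
    conn.execute(
        "INSERT INTO accounts (email, password) VALUES (?, ?)",
        (email, pw),
    )


# In-memory database and a hypothetical table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (email TEXT, password TEXT)")

store_in_db(conn, "emailnumberone@gmail.com", "password1")
store_in_db(conn, "x@example.com", "'); DROP TABLE accounts; --")
conn.commit()

rows = conn.execute("SELECT email, password FROM accounts").fetchall()
```

For a very large file it is probably worth committing once per batch of lines rather than once per line.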
    

    Edit: According to "What characters are allowed in an email address?", a ':' may appear in the local part (the part before the '@') of an email address if that part is quoted.

    This would theoretically allow inputs such as

    `"Cool:Emailadress@google.com:coolish_password"` 
    

    which this code splits at the wrong ':', because it always splits at the first one. See Talip Tolga Sans answer for how to break down the splitting differently to avoid this problem.
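
One possible way to handle a ':' inside a quoted local part, not necessarily the approach taken in the other answer, is to look for the first '@' and split at the first ':' that comes after it. This is only a sketch: the helper name `split_combo` is made up here, and it assumes that the local part contains no '@' and that the domain contains no ':'.

```python
def split_combo(line):
    """Split 'email:password' at the first ':' that follows the first '@'."""
    at = line.find("@")
    if at == -1:
        raise ValueError("no '@' found: %r" % line)
    colon = line.find(":", at)
    if colon == -1:
        raise ValueError("no ':' after the '@': %r" % line)
    email, pw = line[:colon], line[colon + 1:]
    if not pw.strip():
        raise ValueError("empty password: %r" % line)
    return email, pw


plain = split_combo("emailnumberone@gmail.com:password1")
quoted = split_combo('"cool:name"@example.com:pa:ss')
```

Here `plain` is the same pair that `line.split(":", 1)` would give, while `quoted` keeps the ':' inside the quoted local part and leaves the ':' in the password untouched.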