Search code examples
pythonpython-3.xcsvexport-to-csv

How can I prevent users from entering the same input twice when writing directly to .csv in python


I'm Trying to create a data-set from user input. I am trying to prevent duplicates for one of the fields. I ask the user to pick a letter that matches their name, their age, their gpa and major. I want to make sure the letters entered are unique, but I am not sure how to do this when writing directly to a .csv file.

here is what I have so far.

import csv
from colorama import Fore, Back, Style


with open('students2.csv', 'w+', newline='') as csvfile:
        columnheaders = ['NAME','AGE','GPA','MAJOR']
        writer = csv.DictWriter(csvfile, fieldnames=columnheaders)

        writer.writeheader()

        for i in range(0,10):
            askname=input('Please select the letter that matches your name from the following:  (A, B, C, D, E, F, G, H, I, J),  ')
            askage=input('Please enter your Age: ')
            askgpa=input('Please enter your GPA: ')
            askmajor=input('Please select your major from the following (CS, CET, CIS, CE) ')
            writer.writerow({'NAME': askname,'AGE': askage,'GPA': askgpa,'MAJOR': askmajor}) 
            print(Back.BLACK +'My name starts with the letter:', askname ,' and I am ', askage, 'years old. I study ', askmajor, 'and my GPA is: ', askgpa)
            print(Style.RESET_ALL)

I know how to do this with a list,

namelist = []

while True:

    #Input name
    while True:
        name = str(input('What is your name? '))
        if name.upper() not in ('A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'):
            print("Please use (A, B, C, D, E, F, G, H, I, J).")
            continue
        if name in namelist:
            print("This name has already been used.")
            continue
        else:
            namelist.append(name)
            break 

But is it possible to do this without having to do it via a list and then convert it to .csv?

Any help will be appreciated it. Thanks in advance.


Solution

  • If you can use pandas, you can do it like this:

    import pandas as pd
    df = pd.read_csv('kd.csv', index_col=0)
    df.to_csv()
    # 'NAME,AGE,GPA,MAJOR\nBill,18,4.0,CS\nMike,20,2.9,BS\nWill,20,2.4,CS\nBill,18,4.0,CS\n'
    
    df.drop_duplicates(subset=None, inplace=True)
    df.to_csv()
    # 'NAME,AGE,GPA,MAJOR\nBill,18,4.0,CS\nMike,20,2.9,BS\nWill,20,2.4,CS\n'
    

    UPDATE

    I changed it to update with your comments. Some updates, creates the file if it doesn't exist and still working on improving it with your comments. You can hit CTRL-D if you get stuck in an infinite loop.

    $ cat kd2.csv
    NAME AGE GPA MAJOR
    A 20 3.2 CIS
    B 31 4.0 CS
    C 34 3.5 CE
    D 18 2.0 CS
    E 4.0 3.2 CE
    
    import io
    
    def new_student_add():
      only_allowed = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
      stub = io.StringIO('NAME AGE GPA MAJOR\n')
      while True:
         try:
            df = pd.read_csv('kd4.csv', delim_whitespace=True, index_col=0) 
         except:
            stub.seek(0)
            df = pd.read_csv(stub, delim_whitespace=True, index_col=0)
         new_csv_dict = {}
         try:
            new_name =input('Please select the letter that matches your name from the following:  (A, B, C, D, E, F, G, H, I, J):  ')
         except:
            break
         if new_name not in only_allowed:
            print("Only letters in {} are allowed".format(only_allowed))
            continue
         if  new_name in df.index: 
            print("This name has already been used.")
            continue
         new_csv_dict['AGE'] =input('Please enter your Age: ')
         new_csv_dict['GPA'] =input('Please enter your GPA: ')
         new_csv_dict['MAJOR'] =input('Please select your major from the following (CS, CET, CIS, CE) ')
         df.loc[new_name] = new_csv_dict   
         break 
      df.to_csv(r'kd4.csv', sep=' ')
      return df 
    
    for x in range (0,9):
       df = new_student_add()
    
    
    for name, row in  df.iterrows(): 
      print("My name starts with the letter {} and I am {} years old. I study {} and my GPA is: {}".format(name, int(row['AGE']), row['MAJOR'], row['GPA']))
    
    # This may be much faster, so adding it in in case the author needs a faster algorithm. Thanks AlexanderCécile
    # for item in df.itertuples(): 
    # print(f"My name starts with the letter {item[0]} and I am {item[1]} years old. I study {item[3]} and my GPA is: {item[2]}")