
How to create a list in Python with the unique values of a CSV file?


I have a CSV file that looks like the following:

1994, Category1, Something Happened 1
1994, Category2, Something Happened 2
1995, Category1, Something Happened 3
1996, Category3, Something Happened 4
1998, Category2, Something Happened 5

I want to create two lists:

Category = [Category1, Category2, Category3]

and

Year = [1994, 1995, 1996, 1998]

I want to omit the duplicates in each column. I am reading the file as follows:

DataCaptured = csv.reader(DataFile, delimiter=',')
next(DataCaptured)  # skip the first row

and looping through the rows:

for Column in DataCaptured:

Solution

  • You can do:

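    import csv

    # DataFile is the already-open file object from the question.
    # skipinitialspace=True drops the space that follows each comma.
    DataCaptured = csv.reader(DataFile, delimiter=',', skipinitialspace=True)

    Category, Year = [], []
    for row in DataCaptured:
        # Append a value only the first time it is seen, so order is preserved.
        if row[0] not in Year:
            Year.append(row[0])
        if row[1] not in Category:
            Category.append(row[1])

    print(Category, Year)
    # ['Category1', 'Category2', 'Category3'] ['1994', '1995', '1996', '1998']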
    

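    As stated in the comments, if order does not matter, using a set is easier and faster, because a set discards duplicates on its own and does not need to scan every stored element on each check: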

    Category, Year = set(), set()
    for row in DataCaptured:
        Year.add(row[0])
        Category.add(row[1])
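
For anyone who wants something to paste and run, here is a minimal self-contained sketch of the same idea. It assumes Python 3.7+, where dict keys keep insertion order, so it removes duplicates like a set while keeping first-seen order like the list version. The function name `unique_columns` and the `SAMPLE` string are made up for illustration, and `io.StringIO` only stands in for the real file. Converting the year to `int` is also an assumption, made because the question shows the years as numbers.

```python
import csv
import io

# Sample rows copied from the question; io.StringIO stands in for the real file.
SAMPLE = """\
1994, Category1, Something Happened 1
1994, Category2, Something Happened 2
1995, Category1, Something Happened 3
1996, Category3, Something Happened 4
1998, Category2, Something Happened 5
"""


def unique_columns(data_file):
    """Return (categories, years) without duplicates, in first-seen order.

    Hypothetical helper, not part of the csv module.
    """
    reader = csv.reader(data_file, delimiter=',', skipinitialspace=True)
    # Dict keys are unique and keep insertion order (Python 3.7+).
    years, categories = {}, {}
    for row in reader:
        if len(row) < 2:  # skip blank or short lines
            continue
        years[int(row[0])] = None  # assumption: years are wanted as integers
        categories[row[1]] = None
    return list(categories), list(years)


category, year = unique_columns(io.StringIO(SAMPLE))
print(category)  # ['Category1', 'Category2', 'Category3']
print(year)      # [1994, 1995, 1996, 1998]
```

With a real file, the `io.StringIO(SAMPLE)` argument would be replaced by a file object opened with `newline=''`, as the `csv` module documentation recommends. Note also that a `csv.reader` can only be iterated once, so running a second loop over the same reader yields no rows unless the file is reopened.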