I have a set of data in CSV file like this:
[['1', '1.5', '1', '2', '1.5', '2'],
['2', '2.5', '3', '2.5', '3', '2.5'],
['3', '2.5', '1.5', '1', '1', '3'],
['1.5', '1', '2', '2', '2', '2.5'],
['1.5', '1.5', '1', '2.5', '1', '3']]
I want to find all the unique entries in this data, listed in ascending order. I have tried this code:
import csv
import numpy

dim1 = []
with open('D:/TABLE/unique_values.csv') as f1:
    for rows in f1.readlines():
        dim1.append(rows.strip().split(','))

uniqueValues = numpy.unique(dim1)
print('Unique Values : ', uniqueValues)
and it gives me this output:
Unique Values : ['1' '1.5' '2' '2.5' '3']
I want to list these unique entries in a column in a CSV file and write the running indices of each unique entry in a row against it. The desired sample output is shown below.
Sample Output
I have tried other numpy functions, but they only return the first occurrence of each unique entry. I have also looked at other relevant posts, but they do not populate the running indices of each unique element in a row.
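For example, numpy.unique with return_index=True only gives the first (flattened, 0-based) position of each unique value, not all of them:

import numpy

# return_index reports only where each unique value first appears
values, first_idx = numpy.unique(dim1, return_index=True)
print(values)     # ['1' '1.5' '2' '2.5' '3']
print(first_idx)  # [0 1 3 7 8]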
This would be fairly straightforward with some functions from the standard library: collections.defaultdict, csv.reader, and itertools.count. Something like:
import csv
import collections
import itertools

# Map each unique value to the list of running indices where it occurs
data = collections.defaultdict(list)
# Running index over every cell, counting from 1
index = itertools.count(1)

with open('D:/TABLE/unique_values.csv') as f1:
    reader = csv.reader(f1)
    for row in reader:
        for value in row:
            data[value].append(next(index))

# sorted() lists the unique values in ascending order (lexicographic on
# the strings, which matches numeric order for these particular values)
for unique_value, indices in sorted(data.items()):
    print(f"{unique_value}:", *indices)