I'm trying to write a really simple counting script, I guess using defaultdict (I can't get my head around how to use defaultdict, so if someone could post a commented snippet of code I would greatly appreciate it).
My objective is to take element 0 and element 1 of each row, merge them into a single string, and then count how many unique strings there are.
For example, the data below has 15 lines (ignoring the title row) containing 3 classes and 4 classids, which when merged together give 4 unique strings. The merged data for the first line (ignoring the title row) is: Class01CD2
uniq1,uniq2,three,four,five,six
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
Class02,CD3,data,data,data,data
DClass2,DE2,data,data,data,data
DClass2,DE2,data,data,data,data
Class02,CD1,data,data,data,data
Class02,CD1,data,data,data,data
The idea of it is to simply print out how many unique classes are available. Anyone able to help me work this out?
Regards
- Hyflex
Since you are dealing with CSV data, you can use the csv module along with a dictionary:
import csv

uniq = {}  # Empty dictionary used as a hash map: merged string -> number of occurrences.

# 'data.csv' is whatever your CSV file is named.
with open('data.csv', 'r') as ifile:
    reader = csv.reader(ifile)
    for row in reader:
        joined = row[0] + row[1]  # The joined string is simply the first and second columns of the row.
        # If the key already exists, increment its count by 1.
        if joined in uniq:
            uniq[joined] += 1
        else:
            uniq[joined] = 1  # The key doesn't exist yet, so add it with a count of 1.

print(uniq)  # Now output the results.
This outputs (key order may differ depending on the Python version):
{'Class01CD2': 4, 'Class02CD3': 7, 'DClass2DE2': 2, 'Class02CD1': 2}
NOTE: This assumes that the CSV doesn't have the header row (uniq1,uniq2,three,four,five,six). If it does, the header is counted as one more entry (uniq1uniq2).
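Since the question asked specifically about defaultdict, here is a minimal sketch of the same counting approach using collections.defaultdict, this time assuming the file does start with the header row. The function name count_merged and the shortened in-memory SAMPLE data are made up for illustration; they stand in for your real file.

```python
import csv
import io
from collections import defaultdict

# Hypothetical in-memory sample standing in for data.csv (header row included).
SAMPLE = """uniq1,uniq2,three,four,five,six
Class01,CD2,data,data,data,data
Class01,CD2,data,data,data,data
Class02,CD3,data,data,data,data
DClass2,DE2,data,data,data,data
Class02,CD1,data,data,data,data
"""

def count_merged(lines):
    """Count how often each merged (column 0 + column 1) string occurs."""
    # defaultdict(int) creates missing keys with the value int() == 0,
    # so no "does the key exist?" check is needed before incrementing.
    counts = defaultdict(int)
    reader = csv.reader(lines)
    next(reader, None)  # Skip the header row; remove this line if the file has none.
    for row in reader:
        if len(row) < 2:  # Ignore blank or too-short lines.
            continue
        counts[row[0] + row[1]] += 1
    return counts

counts = count_merged(io.StringIO(SAMPLE))
for key, n in counts.items():
    print(key, n)
# The number of unique merged strings is just the number of keys.
print("unique merged strings:", len(counts))
```

To run it against a real file, pass the open file object instead of the sample, for example count_merged(open('data.csv', newline='')) inside a with block. If only the tally is needed, collections.Counter would probably be an even shorter option.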