I need to find the median of all the integers associated with each key (AA, BB). My code produces data in this basic format:
AA - 21
AA - 52
BB - 3
BB - 2
My code:
def scoreData(filename):
    d = dict()
    fin = open(filename)
    contents = fin.readlines()
    for line in contents:
        parts = line.split()
        parts[1] = int(parts[1])  # store the score as an integer
        if parts[0] not in d:
            d[parts[0]] = [parts[1]]  # start a new list for this name
        else:
            d[parts[0]].append(parts[1])
    names = list(d.keys())
    names.sort()  # alphabetize the names
    print("Name\tMax\tMin\tMedian")
    for name in names:  # makes the table
        # median is not defined yet; this is the part in question
        print(name, max(d[name]), min(d[name]), median(d[name]), sep="\t")
I'm afraid that following the same approach as "names" and "names.sort" will completely restructure the data. I've thought about "from statistics import median", but once again I do not know how to select only the values associated with each key.
Thanks in advance
You can do it easily with pandas and numpy:
import pandas
import numpy as np
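then read the file, group by the first column, and aggregate:
# A multi-character delimiter is handled by the Python parsing engine
score = pandas.read_csv(filename, delimiter=' - ', header=None, engine='python')
print(score.groupby(0).agg([np.median, np.min, np.max]))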
which returns:
        1
   median amin amax
0
AA   36.5   21   52
BB    2.5    2    3
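If you would rather stay with the standard library, here is a minimal sketch of the same idea using statistics.median. It assumes each line looks like "AA - 21" (the format shown in the question); the function name summarize_scores and the sample list are made up for illustration. The key point is that d[name] already holds only the values for that one name, so median can be applied to it directly.

```python
from collections import defaultdict
from statistics import median


def summarize_scores(lines):
    """Group integer scores by name and return {name: (max, min, median)}."""
    scores = defaultdict(list)
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # Assumed line format: "AA - 21"
        name, value = line.split(" - ")
        scores[name.strip()].append(int(value))
    # Sort by name so the table comes out alphabetized
    return {name: (max(v), min(v), median(v)) for name, v in sorted(scores.items())}


# Hypothetical sample data matching the question's format
sample = ["AA - 21", "AA - 52", "BB - 3", "BB - 2"]
summary = summarize_scores(sample)

print("Name\tMax\tMin\tMedian")
for name, (high, low, med) in summary.items():
    print(name, high, low, med, sep="\t")
```

To use it on a file, pass the open file object instead of the sample list, for example summarize_scores(open(filename)), since iterating over a file yields its lines.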