csv python-3.x median

Finding Median in 13 rows in CSV


I have looked at several other questions here but couldn't find anything that solves this problem. I split 303 lines of 13 columns each into healthy patients and sick patients. I was able to get the averages for both groups, but now I need the median of those two averages. To be clear, this is what the output should look like:

Averages of Healthy Patients:
[52.59, 0.56, 2.79, 129.25, 242.64, 0.14, 0.84, 158.38, 0.14, 0.59, 1.41, 0.27, 3.77, 0.00]
Averages of Ill Patients:
[56.63, 0.82, 3.59, 134.57, 251.47, 0.16, 1.17, 139.26, 0.55, 1.57, 1.83, 1.13, 5.80, 2.04]
Seperation Values are:
[54.61, 0.69, 3.19, 131.91, 247.06, 0.15, 1.00, 148.82, 0.34, 1.08, 1.62, 0.70, 4.79, 1.02]

I have tried several ways of getting the median, but every attempt has failed and I have run out of ideas. If you can tell me whether I was on the right track and just missed something small, or whether I am way off, I would appreciate any insight.

ill_avg = [ill / len(iList) for ill in iList_sum]
hlt_avg = [ hlt / len(hList) for hlt in hList_sum]
median = [(b / len(bList) for b in bList_sum) //2 ]


print('Total of lines Processed: ' + str(numline))
print("Total Healthy Count: " + str(HPcounter))
print("Total Ill Count: " + str(IPcounter))
print("Averages of Healthy Patients:")
print(str(hlt_avg))
print("Averages of Ill Patients ")
print('[' + ', '.join(['{:.2f}'.format(number) for number in ill_avg]) + ']')
print("Seperation Values are:")
print(median)

I first tried to get the median by adding both averages, but I couldn't get it to work. My latest attempt builds a single average over all patients (bList) and takes the median from that. If I can make the first approach work without bList, I would prefer that, since it would make the code less redundant and hopefully shorter. I apologize, I forgot to mention that I am not supposed to use numpy or pandas, since we have not covered them in class yet.


Solution

  • Use numpy:

    import numpy
    
    a = numpy.array([[52.59, 0.56, 2.79, 129.25, 242.64, 0.14, 0.84, 158.38, 0.14, 0.59, 1.41, 0.27, 3.77, 0.00],
                     [56.63, 0.82, 3.59, 134.57, 251.47, 0.16, 1.17, 139.26, 0.55, 1.57, 1.83, 1.13, 5.80, 2.04]])
    
    print(numpy.mean(a, axis=0))
    

    or use pure Python if you have to avoid numpy:

    def mean(a):
        # Arithmetic mean of a sequence of numbers.
        return sum(a) / len(a)
    
    a =  [[52.59, 0.56, 2.79, 129.25, 242.64, 0.14, 0.84, 158.38, 0.14, 0.59, 1.41, 0.27, 3.77, 0.00],
          [56.63, 0.82, 3.59, 134.57, 251.47, 0.16, 1.17, 139.26, 0.55, 1.57, 1.83, 1.13, 5.80, 2.04]]
    
    # zip(*a) pairs the values column by column. In Python 3, map() is lazy,
    # so wrap it in list() before printing.
    print(list(map(mean, zip(*a))))
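
    For only two lists, the median of each pair is the same as its mean, so the "separation values" are the element-wise midpoints of the two averages. Below is a minimal sketch of how this might plug into the variables from the question without needing bList. It assumes hlt_avg and ill_avg are plain lists of equal length; the helper names midpoints and format_row are made up for illustration, and the sample numbers are the rounded averages copied from the question, so the last digit may differ slightly from a run on the real data.

```python
def midpoints(first, second):
    """Return the element-wise midpoint of two equal-length lists."""
    if len(first) != len(second):
        raise ValueError("both lists must have the same length")
    return [(a + b) / 2 for a, b in zip(first, second)]


def format_row(values):
    """Format numbers with two decimals, like the output in the question."""
    return '[' + ', '.join('{:.2f}'.format(v) for v in values) + ']'


# Sample values copied from the question (assumption: in the real script
# these lists are already computed as hlt_avg and ill_avg).
hlt_avg = [52.59, 0.56, 2.79, 129.25, 242.64, 0.14, 0.84,
           158.38, 0.14, 0.59, 1.41, 0.27, 3.77, 0.00]
ill_avg = [56.63, 0.82, 3.59, 134.57, 251.47, 0.16, 1.17,
           139.26, 0.55, 1.57, 1.83, 1.13, 5.80, 2.04]

separation = midpoints(hlt_avg, ill_avg)
print("Separation values are:")
print(format_row(separation))
```

    The same format_row helper could also be used for the healthy and ill averages so that all three lists print with two decimals.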