Python Excel number of times a negative and positive number appears (count/frequency)


I would like Python to count the number of times a negative and a positive number appears, flagged in binary [1 for positive, 0 for negative]. Furthermore, I would like Python to compute what percentage of the total count is positive. I am having a very hard time figuring this out when working with Excel data in Python.

This is the code that I have right now:

import csv

with open('Weather30states.csv', 'r') as file1:
    val = list(csv.reader(file1))[2]
    val1 = val[0:4]

with open('FL%.csv', 'r') as file2:
    reader = csv.reader(file2)
    reader.next() # this skips the first row of the file
    # this iteration will start from the second row of file2.csv
    conditionMet = False
    for row in reader:
        if conditionMet == True:
            print "FA,",row[0],',', ','.join(row[1:5])
            conditionMet = False # or break if you know you only need at most one line
        if row[1:5] == val1:
            conditionMet = True

When I run this code, what I get in the output window is this:

FA, -1.97% , 0,0,1,0
FA, -0.07% , 0,0,1,1
FA, 0.45% , 0,1,1,1
FA, -0.07% , 0,0,1,1
FA, -0.28% , 0,0,1,1

What I want to get is this:

1, 0, FA, -1.97% , 0,0,1,0
2, 0, FA, -0.07% , 0,0,1,1
3, 1, FA, 0.45% , 0,1,1,1
4, 0, FA, -0.07% , 0,0,1,1
5, 0, FA, -0.28% , 0,0,1,1

Total Count = 5
Percentage of Positive numbers = .20 %

Solution

  • Use two counter variables to keep track of the total count and the number of positives. Set them to 0 at the beginning, and then inside your loop, increment them with += 1 whenever you want to add 1.

    Then test whether the percentage is greater than 0 by stripping off the percent sign and converting the string to a number: if float(row[0].strip('%')) > 0. You can change this to >= if you want to include 0 in the "positive" category.

    # assumes `import csv` and val1 from the first block of the question
    totalCount = 0
    numberOfPositives = 0
    
    with open('FL%.csv', 'r') as file2:
        reader = csv.reader(file2)
        reader.next() # this skips the first row of the file
        # this iteration will start from the second row of file2.csv
        conditionMet = False
        for row in reader:
            if conditionMet == True:
                totalCount += 1 # add 1 to totalCount regardless of sign
                if float(row[0].strip('%')) > 0: # change > to >= if you want to count 0 as positive
                    isPositive = 1
                    numberOfPositives += 1 # add 1 to numberOfPositives only if positive
                else:
                    isPositive = 0
                # prints e.g.: 1, 0, FA, -1.97% , 0,0,1,0
                print str(totalCount) + ',', str(isPositive) + ',', 'FA,', row[0], ',', ','.join(row[1:5])
                conditionMet = False # or break if you know you only need at most one line
            if row[1:5] == val1:
                conditionMet = True
    

    Then you can print the total count and the percentage of positives from totalCount and numberOfPositives:

    print 'Total Count =', totalCount
    if totalCount > 0: # avoid dividing by zero when no rows matched
        print 'Percentage of Positive numbers =', numberOfPositives * 100./totalCount, '%'
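
The code above is Python 2 (print statement, reader.next()). For anyone on Python 3, here is a minimal sketch of the same counting idea. It is not the original answer's code: parse_percent, tally_rows and include_zero are hypothetical names introduced for illustration, the sample rows are copied from the output in the question, and the row-matching step against val1 is left out, so the function assumes it is handed only the rows that should be counted.

```python
import csv
import io


def parse_percent(text):
    """Convert a string such as '-1.97%' to the float -1.97."""
    return float(text.strip().rstrip('%'))


def tally_rows(rows, include_zero=False):
    """Return (lines, total, positives) for rows whose first cell is a percentage.

    Each output line is: running index, 1/0 flag, 'FA', then the original cells.
    """
    lines = []
    positives = 0
    for index, row in enumerate(rows, start=1):
        value = parse_percent(row[0])
        # 1 if positive (or zero, when include_zero is set), else 0
        flag = int(value >= 0 if include_zero else value > 0)
        positives += flag
        lines.append('{}, {}, FA, {} , {}'.format(
            index, flag, row[0], ','.join(row[1:5])))
    return lines, len(lines), positives


# Sample rows copied from the output shown in the question (assumed input).
sample = io.StringIO(
    "-1.97%,0,0,1,0\n"
    "-0.07%,0,0,1,1\n"
    "0.45%,0,1,1,1\n"
    "-0.07%,0,0,1,1\n"
    "-0.28%,0,0,1,1\n"
)
lines, total, positives = tally_rows(csv.reader(sample))
for line in lines:
    print(line)
print('Total Count =', total)
if total:  # avoid dividing by zero when nothing was counted
    print('Percentage of Positive numbers =', positives * 100.0 / total, '%')
```

With the five sample rows this counts one positive out of five. In a real script you would replace the io.StringIO sample with the opened 'FL%.csv' file and call next(reader) instead of reader.next() to skip the header row.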