python, csv, tabular

Count the frequency of words from a column in Python


I have a CSV file with the following structure:

Name Hour Location
A    4    San Fransisco
B    2    New York
C    4    New York
D    7    Denton
E    8    Boston
F    1    Boston

In the data above, the Location column contains New York twice and Boston twice.

I tried to use the tabular package and spent more than 7 hours working through the tutorials in its documentation, but I couldn't get it to work.

Can anyone help me count how often each word appears in the Location column of that CSV file using Python?

Thank you.


Solution

  • data = """Name\tHour\tLocation
    A\t4\tSan Fransisco
    B\t2\tNew York
    C\t4\tNew York
    D\t7\tDenton
    E\t8\tBoston
    F\t1\tBoston
    """
    
    import csv
    import io
    from collections import Counter
    
    
    # wrap the sample text in a file-like object (Python 3)
    input_stream = io.StringIO(data)
    reader = csv.reader(input_stream, delimiter='\t')
    
    next(reader)  # skip the header row
    # take the Location column, ignoring blank or short lines
    cities = [row[2] for row in reader if len(row) > 2]
    
    for k, v in Counter(cities).items():
        print("%s appears %d times" % (k, v))
    

    Output (Python 3.7+, where Counter keeps first-seen order):

    San Fransisco appears 1 times
    New York appears 2 times
    Denton appears 1 times
    Boston appears 2 times
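
The snippet above reads from an in-memory string. For an actual file on disk, the same idea could be sketched as follows. This is only an illustration: the helper name `count_column` and the file name `data.csv` are made up here, and it assumes the file is tab-delimited with a header row containing `Location`.

```python
import csv
from collections import Counter


def count_column(path, column="Location", delimiter="\t"):
    """Count how often each value occurs in one column of a delimited file.

    `count_column` is a hypothetical helper, not part of any library.
    Assumes the first line of the file is a header that names the columns.
    """
    # newline="" is the setting the csv module documentation recommends
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        # skip rows where the column is missing or empty
        return Counter(row[column] for row in reader if row.get(column))
```

Calling `count_column("data.csv").most_common()` would then return the locations sorted from most to least frequent, so the repeated ones come first. If the file is separated by commas rather than tabs, pass `delimiter=","`.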