python pandas matplotlib seaborn histogram

How to make a histogram from a list of strings


I have a list of strings:

a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']

I want to make a histogram that displays the frequency distribution of the letters. I can build a list containing the count of each letter with the following code:

from itertools import groupby
b = [len(list(group)) for key, group in groupby(a)]

How do I make the histogram? I may have a million such elements in list a.
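
One caveat about the groupby snippet above: itertools.groupby only merges consecutive equal items, so it yields per-letter totals only because a happens to be sorted. A minimal sketch of the difference, using a made-up list named `unsorted` purely for illustration, with collections.Counter as an order-independent alternative:

```python
from collections import Counter
from itertools import groupby

a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']

# groupby merges only adjacent equal items; this works because `a` is sorted.
run_lengths = [len(list(group)) for key, group in groupby(a)]

# Hypothetical unsorted input: the same letter now shows up in several runs.
unsorted = ['a', 'b', 'a', 'b', 'a']
runs_unsorted = [(key, len(list(group))) for key, group in groupby(unsorted)]

# Sorting first, or counting with Counter in a single pass, gives true totals.
sorted_counts = {key: len(list(group)) for key, group in groupby(sorted(unsorted))}
counter_counts = dict(Counter(unsorted))

print(run_lengths)     # [4, 2, 3, 1, 5]
print(runs_unsorted)   # [('a', 1), ('b', 1), ('a', 1), ('b', 1), ('a', 1)]
print(sorted_counts)   # {'a': 3, 'b': 2}
print(counter_counts)  # {'a': 3, 'b': 2}
```

For a list with a million elements, Counter avoids the sort and keeps the letter labels together with their counts.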


Solution

  • Very easy with Pandas.

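    import pandas
    import matplotlib.pyplot as plt
    from collections import Counter

    a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
    # Count how many times each letter occurs.
    letter_counts = Counter(a)
    # One row per letter; the single column holds the counts.
    df = pandas.DataFrame.from_dict(letter_counts, orient='index')
    df.plot(kind='bar')
    plt.show()  # needed when running as a script rather than in a notebook
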

    Note that Counter has already computed the frequencies, so the plot type is 'bar', not 'hist'.

    [Figure: histogram of letter counts]
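
For a list with around a million entries, another option is to let pandas do the counting as well. This is a minimal sketch rather than part of the original answer, assuming pandas (and matplotlib, for the plotting step) are installed; the helper names `letter_frequencies` and `plot_letter_frequencies` are made up for illustration:

```python
import pandas as pd


def letter_frequencies(values):
    """Return a Series of counts per distinct string, sorted by label."""
    return pd.Series(values).value_counts().sort_index()


def plot_letter_frequencies(values):
    """Draw the counts as a bar chart and return the matplotlib Axes."""
    freq = letter_frequencies(values)
    # The counts are already aggregated, so a bar chart is used, not 'hist'.
    return freq.plot(kind='bar')


a = ['a', 'a', 'a', 'a', 'b', 'b', 'c', 'c', 'c', 'd', 'e', 'e', 'e', 'e', 'e']
freq = letter_frequencies(a)
counts = {label: int(count) for label, count in freq.items()}
print(counts)  # {'a': 4, 'b': 2, 'c': 3, 'd': 1, 'e': 5}
```

Calling plot_letter_frequencies(a) followed by matplotlib.pyplot.show() should display the same kind of bar chart as the Counter version; sort_index() only fixes the bar order to be alphabetical.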