python csv matplotlib pie-chart

How to create a pie chart using matplotlib from csv


I tried to follow tutorials and answers on here but couldn't wrap my head around creating a pie chart from the data in my csv. A sample of the csv is below:

   post_id  post_title       subreddit  polarity  subjectivity  sentiment
0  bo7h4z   ['league']       soccer     -0.2      0.4           negative
1  bnvieg   ['césar']        soccer     0         0             neutral
2  bnup5q   ['foul']         soccer     0.1       0.6           positive
3  bnul4u   ['benfica']      soccer     0.45      0.5           positive
4  bnthuf   ['prediction']   soccer     0         0             neutral
5  bnolhc   ['revolution' ]  soccer     0         0             neutral

There are many more rows, but I need to plot the sentiment column: basically, how many rows are positive, neutral, or negative.

outfile = open("clean_soccer.csv","r", encoding='utf-8')
file=csv.reader(outfile)
next(file, None)

post_id = []
post_title = []
subreddit = []
polarity =[]
subjectivity = []
sentiment = []

for row in file:
    post_id.append(row[0])
    post_title.append(row[1])
    subreddit.append(row[2])
    polarity.append(row[3])
    subjectivity.append(row[4])
    sentiment.append(row[5])

plt.pie( , labels=)
plt.axis('equal') 
plt.show()

Would it be something similar to this?


Solution

  • Here is a brief answer that reads in only the sentiment column. The code splits the first field of each row on whitespace with split() and takes the element at index [5], which is the sentiment. A Counter then computes how often each label occurs, and those counts are passed to plt.pie, where autopct prints each wedge's percentage.

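    import csv
    from collections import Counter

    import matplotlib.pyplot as plt

    sentiment = []

    with open("clean_soccer.csv", "r", encoding="utf-8") as infile:
        reader = csv.reader(infile)
        next(reader, None)  # skip the header row
        for row in reader:
            # split the line on whitespace; the sentiment is the field at index 5
            sentiment.append(row[0].split()[5])

    # count how often each label occurs (sentiment[:-1] leaves out the last entry)
    counts = Counter(sentiment[:-1])
    # convert the dict views to lists before handing them to matplotlib
    plt.pie(list(counts.values()), labels=list(counts.keys()), autopct='%1.1f%%')
    plt.axis('equal')  # equal aspect ratio so the pie is drawn as a circle
    plt.show()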
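
If the file is a regular comma-separated csv (an assumption; the split() call above suggests whitespace-separated lines), csv.DictReader can pick the column by its header name instead of by position. A minimal sketch of just the counting step, using an in-memory stand-in for clean_soccer.csv built from the sample rows in the question:

```python
import csv
import io
from collections import Counter

# Stand-in for clean_soccer.csv: the six sample rows from the question,
# written as comma-separated text (an assumption about the real file).
sample = io.StringIO(
    "post_id,post_title,subreddit,polarity,subjectivity,sentiment\n"
    "bo7h4z,['league'],soccer,-0.2,0.4,negative\n"
    "bnvieg,['césar'],soccer,0,0,neutral\n"
    "bnup5q,['foul'],soccer,0.1,0.6,positive\n"
    "bnul4u,['benfica'],soccer,0.45,0.5,positive\n"
    "bnthuf,['prediction'],soccer,0,0,neutral\n"
    "bnolhc,['revolution' ],soccer,0,0,neutral\n"
)

# DictReader uses the first line as column names, so each row is a dict.
reader = csv.DictReader(sample)
counts = Counter(row["sentiment"] for row in reader)

# Plain lists are the safest inputs for plt.pie.
labels = list(counts.keys())
sizes = list(counts.values())
print(labels, sizes)
```

For the real file, replace the StringIO object with open("clean_soccer.csv", encoding="utf-8", newline=""). The two lists can then go to plt.pie(sizes, labels=labels, autopct='%1.1f%%') exactly as in the code above.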
    


    EDIT: Answering your second question in the comments below. With the data loaded into a pandas DataFrame df:

    df['sentiment'].value_counts().plot.pie(autopct='%1.1f%%',)
    plt.axis('equal')
    plt.show()
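
The pandas snippet relies on a DataFrame named df that is not defined anywhere in the post. A minimal end-to-end sketch, assuming the file is comma-separated with the header row shown in the question; the in-memory sample stands in for clean_soccer.csv:

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Stand-in for clean_soccer.csv: the six sample rows from the question,
# written as comma-separated text (an assumption about the real file).
sample = io.StringIO(
    "post_id,post_title,subreddit,polarity,subjectivity,sentiment\n"
    "bo7h4z,['league'],soccer,-0.2,0.4,negative\n"
    "bnvieg,['césar'],soccer,0,0,neutral\n"
    "bnup5q,['foul'],soccer,0.1,0.6,positive\n"
    "bnul4u,['benfica'],soccer,0.45,0.5,positive\n"
    "bnthuf,['prediction'],soccer,0,0,neutral\n"
    "bnolhc,['revolution' ],soccer,0,0,neutral\n"
)

# For the real file this would be pd.read_csv("clean_soccer.csv").
df = pd.read_csv(sample)

# value_counts() returns one count per distinct label, largest first.
counts = df["sentiment"].value_counts()

ax = counts.plot.pie(autopct="%1.1f%%")
ax.set_ylabel("")   # drop the axis label pandas derives from the series name
ax.axis("equal")    # equal aspect ratio so the pie is drawn as a circle
```

Calling plt.show() afterwards opens the chart window, as in the snippet above; plt.savefig with a file name of your choice writes it to disk instead.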