Tags: python, pandas, pie-chart

Plot pie chart of data in a particular column


Suppose I have CSV data like this, with thousands of rows:

     Src_Mac_address    Dst_Mac_address2    Src_Mac_country
1    x1                 y1                  Country1
2    x2                 y2                  Country2
3    x3                 y3                  Country3
4    x4                 y4                  Country4
5    x5                 y5                  Country5
6    x6                 y6                  Country1
7    x7                 y7                  Country1
8    x8                 y8                  Country2

I want to find the frequency of each country in the Src_Mac_country column and plot a pie chart showing each country's percentage share. There are more than 30 countries in the Src_Mac_country column, but I only want to plot the top 10 countries by number of occurrences (in descending order).

I tried

import pandas as pd

df = pd.read_csv('Filepath')
# df['Src_Mac_country'].value_counts()
# df.Src_Mac_country.value_counts(normalize=True).mul(100).round(1).astype(str) + '%'

The first commented line shows how many times each country occurs in Src_Mac_country. The second shows each country's percentage share of that column.

But I want to plot this data as a pie chart, showing only the top 10 countries by percentage of occurrences (in descending order).

How can I do that?


Solution

  • Use head to keep only the top N rows (value_counts already sorts in descending order); for percentage labels, add autopct='%1.1f%%':

    N = 10
    df.Src_Mac_country.value_counts(normalize=True).head(N).plot.pie(autopct='%1.1f%%')
    

    Or, if you also need to plot all the other categories summed together as one slice:

    N = 10
    s = df.Src_Mac_country.value_counts(normalize=True).head(N)
    s.loc['other'] = 1 - s.sum()
    s.plot.pie(autopct='%1.1f%%')
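
The second variant can be tried end to end with a small sketch like the one below. It is not part of the original answer: the helper name top_n_shares, the "other" label, and the eight-row sample frame (copied from the table in the question) are assumptions for illustration only. One reason to prefer the "other" slice: as far as I know, recent Matplotlib versions rescale the wedges so they add up to 100%, so without it the labels show each country's share of the top N rather than of all rows.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import pandas as pd


def top_n_shares(series, n=10, other_label="other"):
    """Return the n largest relative frequencies plus one summed remainder.

    Hypothetical helper, not a pandas function.
    """
    # value_counts sorts by frequency in descending order by default
    shares = series.value_counts(normalize=True)
    top = shares.head(n)
    rest = shares.iloc[n:].sum()
    if rest > 0:
        # append the remainder as a single extra slice
        top = pd.concat([top, pd.Series({other_label: rest})])
    return top


# Sample data mirroring the eight rows shown in the question
df = pd.DataFrame({
    "Src_Mac_country": [
        "Country1", "Country2", "Country3", "Country4",
        "Country5", "Country1", "Country1", "Country2",
    ]
})

# n=3 here only because the sample has five countries; use n=10 on real data
shares = top_n_shares(df["Src_Mac_country"], n=3)

ax = shares.plot.pie(autopct="%1.1f%%")
ax.set_ylabel("")  # drop the default y-axis label next to the pie
```

With real data, replace the sample frame with the result of pd.read_csv and call matplotlib.pyplot.show() (on an interactive backend) or ax.figure.savefig(...) to see the chart.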