Suppose I have CSV data like this, with thousands of rows:
   Src_Mac_address  Dst_Mac_address2  Src_Mac_country
1  x1               y1                Country1
2  x2               y2                Country2
3  x3               y3                Country3
4  x4               y4                Country4
5  x5               y5                Country5
6  x6               y6                Country1
7  x7               y7                Country1
8  x8               y8                Country2
I want to find the frequency of each country in the Src_Mac_country column and plot a pie chart showing each country's percentage share. There are more than 30 countries in the Src_Mac_country column, but I only want to plot the top 10 countries by number of occurrences (in descending order).
I tried:
import pandas as pd

df = pd.read_csv('Filepath')
#df['Src_Mac_country'].value_counts()
#df.Src_Mac_country.value_counts(normalize=True).mul(100).round(1).astype(str)+'%'
The first commented line shows how many times each country occurs in Src_Mac_country. The second shows each country's percentage share of that column (i.e. Src_Mac_country).
But I want to plot this data in a pie chart, showing only the top 10 countries by percentage of occurrence (in descending order).
How can I do that?
Use head to keep only the top N rows (value_counts already sorts in descending order), and add autopct='%1.1f%%' to show percentages:
N = 10
df.Src_Mac_country.value_counts(normalize=True).head(N).plot.pie(autopct='%1.1f%%')
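To check the selection before plotting, here is a minimal sketch on made-up data modelled on the sample rows in the question. The DataFrame contents and the small N are assumptions for illustration only; with the real CSV you would keep N = 10.

```python
import pandas as pd

# Hypothetical sample data standing in for the real CSV
df = pd.DataFrame({
    "Src_Mac_country": ["Country1", "Country2", "Country3", "Country4",
                        "Country5", "Country1", "Country1", "Country2"]
})

N = 3  # small N for this toy data; the question uses 10

# value_counts() sorts by frequency, largest first, so head(N) keeps the top N
top = df["Src_Mac_country"].value_counts(normalize=True).head(N)

# Shares as percentages of the whole column
print(top.mul(100).round(1))
```

Countries with equal counts may appear in either order, so the last slot in the top N can vary when there are ties.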
Or, if you also need to plot all the other categories summed together:
N = 10
s = df.Src_Mac_country.value_counts(normalize=True).head(N)
s.loc['other'] = 1 - s.sum()
s.plot.pie(autopct='%1.1f%%')
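For reference, a self-contained sketch of the "other" variant on made-up data. The sample DataFrame, the small N, the Agg backend line and the output filename top_countries.png are all assumptions for illustration, not part of the original data.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend (assumption); drop this in a notebook
import pandas as pd

# Hypothetical sample data standing in for the real CSV
df = pd.DataFrame({
    "Src_Mac_country": ["Country1", "Country2", "Country3", "Country4",
                        "Country5", "Country1", "Country1", "Country2"]
})

N = 2  # small N for this toy data; the question uses 10

# Top N shares of the whole column, largest first
s = df["Src_Mac_country"].value_counts(normalize=True).head(N)

# Everything outside the top N becomes a single extra slice
s.loc["other"] = 1 - s.sum()

ax = s.plot.pie(autopct="%1.1f%%")
ax.set_ylabel("")  # hide the default y-axis label
ax.figure.savefig("top_countries.png")  # hypothetical output filename
```

Because the slices here sum to 1, the autopct labels are shares of the whole column. Without the "other" slice, recent matplotlib versions rescale the plotted values to fill the pie, so the labels are then shares among the top N only.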