Search code examples
pythonpandaspython-requestspercentage

Calculate Percentage using Pandas DataFrame


Of all the Medals won by these 5 countries across all olympics, what is the percentage medals won by each one of them?

i have combined all excel file in one using panda dataframe but now stuck with finding percentage

    Country      Gold     Silver    Bronze  Total
0   USA          10       13         11      34
1   China        2        2          4       8
2   UK           1        0          1       2
3   Germany      12       16         8       36
4   Australia    2        0          0       2
0   USA          9        9          7       25
1   China        2        4          5       11
2   UK           0        1          0       1
3   Germany      11       12         6       29
4   Australia    1        0          1       2
0   USA          9        15         13      37
1   China        5        2          4       11
2   UK           1        0          0       1
3   Germany      10       13         7       30
4   Australia    2        1          0       3

Combined data sheet

Code that i have tried till now

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
df= pd.DataFrame()
for f in ['E:\\olympics\\Olympics-2002.xlsx','E:\\olympics\\Olympics- 
2006.xlsx','E:\\olympics\\Olympics-2010.xlsx',
      'E:\\olympics\\Olympics-2014.xlsx','E:\\olympics\\Olympics- 
2018.xlsx']:
data = pd.read_excel(f,'Sheet1')
df = df.append(data)

df.to_excel("E:\\olympics\\combineddata.xlsx")
data = pd.read_excel("E:\\olympics\\combineddata.xlsx")
print(data)

final_Data={}
for i in data['Country']:
x=i
t1=(data[(data.Country==x)].Total).tolist()

print("Name of Country=",i, int(sum(t1)))
final_Data.update({i:int(sum(t1))})


t3=data.groupby('Country').Total.sum()
t2= df['Total'].sum()
t4= t3/t2*100
print(t3)
print(t2)
print(t4)

this how is got the answer....Now i need to pull that in plot i want to put it pie


Solution

  • Let's assume you have created the DataFrame as 'df'. Then you can do the following to first group by and then calculate percentages.

    df = df.groupby('Country').sum()
    df['Gold_percent']   = (df['Gold']   / df['Gold'].sum()) * 100
    df['Silver_percent'] = (df['Silver'] / df['Silver'].sum()) * 100
    df['Bronze_percent'] = (df['Bronze'] / df['Bronze'].sum()) * 100
    df['Total_percent']  = (df['Total']  / df['Total'].sum()) * 100
    df.round(2)
    
    print (df)
    

    The output will be as follows:

               Gold  Silver  Bronze  ...  Silver_percent  Bronze_percent  Total_percent
    Country                          ...                                               
    Australia     5       1       1  ...            1.14            1.49           3.02
    China         9       8      13  ...            9.09           19.40          12.93
    Germany      33      41      21  ...           46.59           31.34          40.95
    UK            2       1       1  ...            1.14            1.49           1.72
    USA          28      37      31  ...           42.05           46.27          41.38