How to calculate percentage with Pandas' DataFrame


How can I add another column to a pandas DataFrame that holds each value's percentage? The dict can change in size.

>>> import pandas as pd
>>> a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}
>>> p = pd.DataFrame(a.items())
>>> p
        0  1
0  Test 2  1
1  Test 3  1
2  Test 1  4
3  Test 4  9

[4 rows x 2 columns]

Solution

  • If each score as a fraction of 10 is indeed what you want, the simplest way is to name the columns when you build the DataFrame:

    >>> p = pd.DataFrame(a.items(), columns=['item', 'score'])
    >>> p['perc'] = p['score']/10
    >>> p
         item  score  perc
    0  Test 2      1   0.1
    1  Test 3      1   0.1
    2  Test 1      4   0.4
    3  Test 4      9   0.9
    

    For each score's share of the total, divide by the column sum instead (multiply by 100 if you want values on a 0-100 scale):

    >>> p['perc']= p['score']/p['score'].sum()
    >>> p
         item  score      perc
    0  Test 2      1  0.066667
    1  Test 3      1  0.066667
    2  Test 1      4  0.266667
    3  Test 4      9  0.600000
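
Since the dict can change in size, the same steps can be wrapped in a small function. This is only a sketch: `with_percentages` is a hypothetical helper name, not something from pandas, and it assumes the values are numeric and do not sum to zero. It scales the share to 0-100, and passes `list(scores.items())` so the constructor gets a plain list of pairs.

```python
import pandas as pd


def with_percentages(scores):
    """Hypothetical helper: build an item/score table with a percentage column."""
    # A list of (key, value) pairs gives one row per dict entry,
    # whatever the size of the dict.
    df = pd.DataFrame(list(scores.items()), columns=['item', 'score'])
    # Each score's share of the column total, on a 0-100 scale.
    df['perc'] = df['score'] / df['score'].sum() * 100
    return df


a = {'Test 1': 4, 'Test 2': 1, 'Test 3': 1, 'Test 4': 9}
p = with_percentages(a)
print(p)
```

Nothing depends on the number of entries, so adding or removing keys in `a` only changes the number of rows and the total used for the division.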