python csv file sum calculated-columns

How can I calculate the sum of a column (for specific rows only) in Python using a CSV file?


Level  ColumntoSum
1      4
2      10
1      3
2      23
1      15
2      2

Imagine this is my CSV file. It contains two columns, Level and ColumntoSum: Level holds the values [1, 2, 1, 2, 1, 2], and ColumntoSum holds an arbitrary number next to each level.

I need to calculate the sum of "ColumntoSum" for the rows where Level = 1 and, separately, for the rows where Level = 2. Then I need to save the result to another CSV file in the form below, with the second column holding the sum for each level:

Level  Column
1       Sum1
2       Sum2

Solution

  • First, read your CSV file with pandas:

    import pandas as pd

    # Load the input file; replace the name with the path to your CSV
    df = pd.read_csv('name_of_your_file.csv')
    

    You can then use the pandas groupby() function to group the rows by Level, and the sum() function to calculate the sum of each group, as shown below:

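    # Group rows by Level and sum the remaining numeric column(s)
    df = df.groupby('Level').sum()
    print(df)  # in a Jupyter notebook, display(df) also works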
    

    OUTPUT:

           ColumntoSum
    Level             
    1               22
    2               35
    

    Finally, save the result to a CSV file:

    # Level is the index after groupby; index=True writes it as the first column
    df.to_csv('out.csv', index=True)
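
If pandas is not available, roughly the same result can be sketched with only the standard library's csv module. This is a minimal sketch, not the answer's method: the function name sum_by_level and its parameters are hypothetical, and it assumes a comma-separated file with a header row and integer values in ColumntoSum. The output header "Column" follows the layout asked for in the question.

```python
import csv
from collections import defaultdict


def sum_by_level(in_path, out_path, key_col="Level", value_col="ColumntoSum"):
    """Sum value_col for each distinct key_col value and write the totals to out_path.

    Hypothetical helper: assumes a comma-separated file with a header row
    and integer values in value_col.
    """
    totals = defaultdict(int)

    # Accumulate one running total per level while streaming through the rows
    with open(in_path, newline="") as src:
        for row in csv.DictReader(src):
            totals[row[key_col]] += int(row[value_col])

    # Write one row per level, in the order the levels first appeared
    with open(out_path, "w", newline="") as dst:
        writer = csv.writer(dst)
        writer.writerow([key_col, "Column"])
        for level, total in totals.items():
            writer.writerow([level, total])

    return dict(totals)
```

Calling sum_by_level('name_of_your_file.csv', 'out.csv') on the sample data from the question should give 22 for level 1 and 35 for level 2. Unlike groupby(), this sketch does not sort the levels; it keeps them in first-seen order, and the level keys stay strings as read from the file.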