python-3.x pandas dataframe buffer-overflow

Pandas Dataframe sum result is wrong

I wrote a program that uses pandas and openpyxl to manipulate Excel files. One series of data is:

l = [466629703, NA, 527821349, NA, 734823364, NA, 1667241489, NA, 502673377, NA, 491316417, NA, 505520276, NA, 2840580259, NA, 1399526794, NA, 468709318, NA, 425220764, NA, 409771252, NA, 643692418, NA, 1193809483, NA, 353829950, NA, 424820400, NA, 406999623, NA, 389293014, NA, 1168972722, NA, 420654309, NA, 390431735, NA, 356588382, NA]

deposit_sum = sep_df[sep_kward][deposit].dropna().astype(int).sum()

The result should be 16188926398.

But the code above returns 11200862491. The error occurs with only one of the files. What could be the problem?

Solution

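Don't cast the values to int after dropping the NaNs; cast them to int64 instead. The value 2840580259.0 is larger than 2147483647, the maximum of a 32-bit signed integer, so it does not survive the cast on platforms where int maps to a 32-bit integer (for example, Windows with NumPy versions before 2.0). That also explains why only one file is affected: it is the only one containing a value that large.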

deposit_sum = df[0].dropna().astype('int64').sum()
# With the names from the question:
# deposit_sum = sep_df[sep_kward][deposit].dropna().astype('int64').sum()

output of deposit_sum:

16188926398
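
The numbers in the question are consistent with this diagnosis. The sketch below is plain Python arithmetic, not the pandas cast itself. It assumes that the out-of-range value was turned into the smallest 32-bit integer (-2147483648) during the cast, which is a common outcome on x86 but not guaranteed behavior. The names INT32_MIN, INT32_MAX, too_big and corrupted are illustrative only.

```python
# Bounds of a 32-bit signed integer.
INT32_MIN = -2**31      # -2147483648
INT32_MAX = 2**31 - 1   #  2147483647

# Non-NaN values from the question.
values = [
    466629703, 527821349, 734823364, 1667241489, 502673377, 491316417,
    505520276, 2840580259, 1399526794, 468709318, 425220764, 409771252,
    643692418, 1193809483, 353829950, 424820400, 406999623, 389293014,
    1168972722, 420654309, 390431735, 356588382,
]

# Correct total, computed with Python's arbitrary-precision integers.
expected = sum(values)

# Values that cannot be represented as a 32-bit signed integer.
too_big = [v for v in values if v > INT32_MAX]

# Assumption: each out-of-range value became INT32_MIN in the cast.
corrupted = sum(INT32_MIN if v > INT32_MAX else v for v in values)

print(expected)   # 16188926398
print(too_big)    # [2840580259]
print(corrupted)  # 11200862491
```

Under that assumption, the corrupted total matches the wrong result reported in the question exactly, and 2840580259 is the only value involved.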

Sample dataframe used:

import pandas as pd

NA = float('NaN')
l = [466629703, NA, 527821349, NA, 734823364, NA, 1667241489, NA, 502673377, NA, 491316417, NA, 505520276, NA, 2840580259, NA, 1399526794, NA, 468709318, NA, 425220764, NA, 409771252, NA, 643692418, NA, 1193809483, NA, 353829950, NA, 424820400, NA, 406999623, NA, 389293014, NA, 1168972722, NA, 420654309, NA, 390431735, NA, 356588382, NA]
df = pd.DataFrame(l)
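
To find out whether a column is at risk before casting it, one option is to compare its values against the 32-bit limit. This is a minimal sketch on a shortened version of the sample data. The names s, int32_max, overflowing and total are made up for illustration and do not come from the original program.

```python
import numpy as np
import pandas as pd

NA = float("nan")

# Shortened sample: three values from the data above, separated by NaN.
data = [466629703, NA, 2840580259, NA, 1399526794, NA]
s = pd.Series(data).dropna()

# Largest value a 32-bit signed integer can hold.
int32_max = np.iinfo(np.int32).max

# Rows that would not fit into int32 if the platform default were 32-bit.
overflowing = s[s > int32_max]

# Casting with an explicit 64-bit dtype avoids depending on the platform default.
total = int(s.astype("int64").sum())

print(overflowing.tolist())
print(total)
```

Naming the dtype explicitly ('int64' rather than int) makes the result the same on every platform, which is why it is the safer choice here.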