Tags: python-2.7, pandas, int, jupyter-notebook, valueerror

How do I fix the "invalid literal for int() with base 10" error in pandas?


This error shows up whenever I try to convert the DataFrame to int:

("invalid literal for int() with base 10: '260,327,021'", 'occurred at index Population1'

Everything in the df is a number. I assume the error is due to the extra quote at the end, but how do I fix it?


Solution

  • I run this

    int('260,327,021')
    

    and get this

    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-448-a3ba7c4bd4fe> in <module>()
    ----> 1 int('260,327,021')
    
    ValueError: invalid literal for int() with base 10: '260,327,021'
    

    I assure you that not everything in your dataframe is a number. The value may look like a number, but it is a string with commas in it, and int() cannot parse the commas.

    You'll want to remove the commas and then convert to int:

    pd.Series(['260,327,021']).str.replace(',', '').astype(int)
    
    0    260327021
    dtype: int64
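
The same idea can be applied to a whole DataFrame rather than a single Series. The following is only a sketch: the sample values and the `Population2` column are made up for illustration (only `Population1` appears in the error message), and it assumes every column holds comma-formatted integer strings. The second part assumes the data came from a CSV file, in which case the `thousands` argument of `pd.read_csv` may avoid the problem at load time.

```python
import io

import pandas as pd

# Hypothetical sample data; the real frame in the question is not shown.
df = pd.DataFrame({
    "Population1": ["260,327,021", "1,234", "56"],
    "Population2": ["1,000", "2,000,000", "3"],
})

# Strip the thousands separators from every column, then cast to int.
# This assumes all columns are strings of comma-formatted integers.
cleaned = df.apply(lambda col: col.str.replace(",", "").astype(int))
print(cleaned.dtypes)

# Alternative, assuming the data is read from CSV: let the parser
# treat "," inside quoted fields as a thousands separator.
csv_text = 'Population1,Population2\n"260,327,021","1,000"\n"1,234","3"\n'
from_csv = pd.read_csv(io.StringIO(csv_text), thousands=",")
print(from_csv.dtypes)
```

If only some columns contain such strings, selecting them first (for example `df[["Population1"]]`) and assigning the result back keeps the other columns untouched.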