
How do I change the data type of a specific column in a DataFrame?


The current dtypes of my DataFrame:

TEAM           object
NATIONALITY    object
TYPE           object
PRICE PAID     object
dtype: object

I want to convert the 'PRICE PAID' column to a numeric type. I tried both of these:

top_buys_df['PRICE PAID'] = pd.to_numeric(top_buys_df['PRICE PAID'])
top_buys_df['PRICE PAID'] = pd.to_numeric(top_buys_df['PRICE PAID'].str.replace(',', '.'))

When I run this in a Jupyter Notebook, I get:

ValueError: Unable to parse string " 24,75,00,000" at position 0

I have tried two different conversion methods:

  1. the astype() method
  2. the pd.to_numeric() function

Here is the state of the DataFrame, step by step:

top_buys_df.dtypes
TEAM           object
NATIONALITY    object
TYPE           object
PRICE PAID     object
dtype: object
top_buys_df['PRICE PAID']

PRICE PAID column values:

0     24,75,00,000
1     20,50,00,000
2     14,00,00,000
3     11,75,00,000
4     11,50,00,000
5     10,00,00,000
6      8,40,00,000
7      8,00,00,000
8      7,40,00,000
9      7,40,00,000
Name: PRICE PAID, dtype: object

I tried each of these individually to convert PRICE PAID from object to int:

top_buys_df['PRICE PAID'] = top_buys_df['PRICE PAID'].astype(int)

Error: ValueError: invalid literal for int() with base 10: ' 24,75,00,000'

top_buys_df['PRICE PAID'] = pd.to_numeric(top_buys_df['PRICE PAID'].str.replace(',', '.'))

Error: ValueError: Unable to parse string " 24.75.00.000"

top_buys_df['PRICE PAID'] = pd.to_numeric(top_buys_df['PRICE PAID'])

Error: ValueError: Unable to parse string " 24,75,00,000"
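
For anyone who wants to reproduce the two to_numeric failures without the full dataset, here is a minimal sketch. The one-element Series `raw` and the `failures` list are hypothetical stand-ins, built from the first value shown above.

```python
import pandas as pd

# Hypothetical one-row stand-in for the 'PRICE PAID' column (value copied from the post).
raw = pd.Series([" 24,75,00,000"])

failures = []
# Try the string as-is, then with commas swapped for dots, as in the attempts above.
for candidate in (raw, raw.str.replace(",", ".", regex=False)):
    try:
        pd.to_numeric(candidate)
    except ValueError as exc:
        # Neither form is a valid number literal, so both attempts should land here.
        failures.append(str(exc))

for message in failures:
    print(message)
```

Both attempts fail for the same reason: a string with several commas (or several dots) is not a valid number.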


Solution

  • Replace the commas with empty strings rather than dots (i.e., delete them). The commas here are digit-group (thousands-style) separators, not decimal separators, so they need to be removed, not converted.

    top_buys_df['PRICE PAID'] = pd.to_numeric(top_buys_df['PRICE PAID']
                                              .str.replace(',', ''))
    

    Output:

       PRICE PAID
    0   247500000
    1   205000000
    2   140000000
    3   117500000
    4   115000000
    5   100000000
    6    84000000
    7    80000000
    8    74000000
    9    74000000
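
As a self-contained check of the approach above, here is a minimal sketch that can be run without the original data. The three-row DataFrame is a hypothetical sample using values from the question, and the extra `.str.strip()` is an assumption added to drop the leading space visible in the error message; it may not be strictly needed.

```python
import pandas as pd

# Hypothetical sample mirroring the question's column (values copied from the post).
top_buys_df = pd.DataFrame(
    {"PRICE PAID": [" 24,75,00,000", " 20,50,00,000", "  8,40,00,000"]}
)

cleaned = (
    top_buys_df["PRICE PAID"]
    .str.replace(",", "", regex=False)  # delete the digit-group separators
    .str.strip()                        # assumption: also trim surrounding whitespace
)
top_buys_df["PRICE PAID"] = pd.to_numeric(cleaned)

print(top_buys_df["PRICE PAID"].tolist())
print(top_buys_df["PRICE PAID"].dtype)
```

Passing `regex=False` makes it explicit that the comma is matched literally, regardless of the default in the installed pandas version.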