Converting only specific columns in dataframe to numeric


I have a dataframe with n numeric-valued columns and three columns that hold datetime and string values. I want to convert all but those three columns to numeric values, but I'm not sure what the best method is. Below is a simplified sample dataframe:

import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.array([[1, '5-4-2016', 10], [1, '5-5-2016', 5], [2, '5-4-2016', 10],
                             [2, '5-5-2016', 7], [5, '5-4-2016', 8]]),
                   columns=['ID', 'Date', 'Number'])

I tried something like the following, but it was unsuccessful:

exclude = ['Date']
df = df.drop(exclude, 1).apply(pd.to_numeric, errors='coerce').combine_first(df)

Expected output (essentially, the dtype of 'ID' and 'Number' changes to float while 'Date' stays the same):

    ID      Date  Number
0  1.0  5-4-2016    10.0
1  1.0  5-5-2016     5.0
2  2.0  5-4-2016    10.0
3  2.0  5-5-2016     7.0
4  5.0  5-4-2016     8.0

Solution

  • Have you tried Series.astype()?

    df['ID'] = df['ID'].astype(float)
    df['Number'] = df['Number'].astype(float)
    

    or for all columns besides date:

    for col in [x for x in df.columns if x != 'Date']:
        df[col] = df[col].astype(float)
    

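    or, with DataFrame.transform (assign the result back, since it does not modify df in place):

    cols = [x for x in df.columns if x != 'Date']
    df[cols] = df[cols].transform(lambda x: x.astype(float))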
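
For completeness, here is a minimal end-to-end sketch that combines the question's pd.to_numeric(errors='coerce') idea with the column filter from the answer above. The names exclude and numeric_cols are illustrative, not part of the original post, and the final astype(float) is only there to match the float output the question asks for.

```python
import numpy as np
import pandas as pd

# Sample data from the question. np.array turns every value into a string,
# so none of the columns starts out numeric.
df = pd.DataFrame(
    np.array([[1, '5-4-2016', 10], [1, '5-5-2016', 5], [2, '5-4-2016', 10],
              [2, '5-5-2016', 7], [5, '5-4-2016', 8]]),
    columns=['ID', 'Date', 'Number'])

exclude = ['Date']  # hypothetical list of columns to leave untouched
numeric_cols = [c for c in df.columns if c not in exclude]

# errors='coerce' replaces values that cannot be parsed with NaN instead of raising.
converted = df[numeric_cols].apply(pd.to_numeric, errors='coerce')

# Assign back to the same columns so the original column order is preserved.
df[numeric_cols] = converted.astype(float)

print(df.dtypes)
```

Unlike astype(float), which raises on a value it cannot parse, this variant silently produces NaN for bad values, so it is worth checking for unexpected NaNs afterwards (for example with df[numeric_cols].isna().sum()).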