Tags: python, pandas, dataframe, data-analysis

Pandas: Dealing with a column with multiple data types


I have a column in my dataframe df, which contains values of type float and str:

df['ESTACD'].unique()

Output:
array([11.0, 32.0, 31.0, 35.0, 37.0, 84.0, 83.0, 81.0, 97.0, 39.0,
   38.0, 40.0, 34.0, 7.0, 17.0, 16.0, 14.0, 82.0, 8.0, '11', '40',
   '31', '39', '68', '97', '32', '33', '37', '38', '83', '84', '93',
   '35', '81', '67', '07', '80', '71', 'A3', '14', '17', '22', '34',
   '36', '82', '08'], dtype=object)

I wish to convert all values of this column to type string. Using astype(str) is not enough here since we end up with values like '11.0', '32.0', etc.
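For instance, a minimal reproduction on a small made-up Series (the values below are illustrative, not my real data) shows the problem:

```python
import pandas as pd

# Hypothetical sample mixing floats and strings, like the ESTACD column
s = pd.Series([11.0, 32.0, '11', '07'], dtype=object)

# astype(str) calls str() on each element, so floats keep their '.0' suffix
as_text = s.astype(str).tolist()
print(as_text)
```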

The only other way I could think of is using a for loop:

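for i in range(len(df)):
    value = df.at[i, 'ESTACD']  # assumes the default RangeIndex (0, 1, 2, ...)
    if isinstance(value, float) or value.startswith('0'):
        # df.at avoids chained assignment, which may not write back to df
        df.at[i, 'ESTACD'] = str(int(value))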

However, this is very time-consuming on large datasets. Is there a way to do this without a loop?


Solution

  • I think a plain Python list comprehension should be very fast here:

    df['ESTACD'] = [str(int(x)) if isinstance(x, float) else x for x in df['ESTACD']]
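
Note that the one-liner above only converts floats. It leaves strings such as '07' unchanged and would raise on NaN. If leading zeros should also be stripped, as the loop in the question does, a slightly longer variant might look like the sketch below. The helper name normalize_code and the sample frame are made up for illustration, and the sketch assumes that only digit-only strings should be rewritten, so values like 'A3' pass through untouched.

```python
import pandas as pd

def normalize_code(x):
    """Hypothetical helper: return a canonical string for one ESTACD value."""
    # Floats such as 11.0 become '11'; NaN is returned unchanged
    if isinstance(x, float):
        return x if pd.isna(x) else str(int(x))
    # Digit-only strings such as '07' lose their leading zeros
    if isinstance(x, str) and x.isdecimal():
        return str(int(x))
    # Anything else (e.g. 'A3') is left as-is
    return x

# Small made-up sample standing in for the real dataframe
df = pd.DataFrame({'ESTACD': [11.0, 7.0, '11', '07', 'A3']})
df['ESTACD'] = [normalize_code(x) for x in df['ESTACD']]
print(df['ESTACD'].tolist())
```

This is still a single pass over the column in plain Python, so it should perform about as well as the one-liner.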