python pandas dtype

Get list of pandas dataframe columns based on data type


If I have a dataframe with the following columns:

    NAME                           object
    On_Time                        object
    On_Budget                      object
    %actual_hr                    float64
    Baseline Start Date    datetime64[ns]
    Forecast Start Date    datetime64[ns]

I would like to be able to say: for this dataframe, give me a list of the columns that are of type 'object' or of type 'datetime'.

I have a function which converts numbers ('float64') to two decimal places, and I would like to use this list of dataframe columns, of a particular type, and run it through this function to convert them all to 2dp.

Maybe something like:

    col_list = []
    for c in df.columns:
        if df[c].dtype == "float64":  # or whichever dtype is wanted
            col_list.append(c)

Solution

  • If you want a list of columns of a certain type, you can use groupby:

    >>> import pandas as pd
    >>> df = pd.DataFrame([[1, 2.3456, 'c', 'd', 78]], columns=list("ABCDE"))
    >>> df
       A       B  C  D   E
    0  1  2.3456  c  d  78
    
    [1 rows x 5 columns]
    >>> df.dtypes
    A      int64
    B    float64
    C     object
    D     object
    E      int64
    dtype: object
    >>> g = df.columns.to_series().groupby(df.dtypes).groups
    >>> g
    {dtype('int64'): ['A', 'E'], dtype('float64'): ['B'], dtype('O'): ['C', 'D']}
    >>> {k.name: v for k, v in g.items()}
    {'object': ['C', 'D'], 'int64': ['A', 'E'], 'float64': ['B']}
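
As an alternative to the groupby approach, pandas also provides `DataFrame.select_dtypes`, which filters columns by dtype directly. The following is only a sketch under assumptions: the frame and its values are made up to mirror a few of the column types in the question, and `DataFrame.round(2)` stands in for the asker's own two-decimal-place function, which is not shown.

```python
import pandas as pd

# Hypothetical data that mirrors some of the column types from the question.
df = pd.DataFrame(
    {
        "NAME": pd.Series(["a", "b"], dtype="object"),
        "%actual_hr": [1.23456, 7.891],
        "Baseline Start Date": pd.to_datetime(["2014-01-01", "2014-02-01"]),
    }
)

# Names of the columns whose dtype is object or datetime.
obj_or_date_cols = df.select_dtypes(include=["object", "datetime"]).columns.tolist()

# Names of the float64 columns, then round just those columns to 2 decimals.
float_cols = df.select_dtypes(include=["float64"]).columns.tolist()
df[float_cols] = df[float_cols].round(2)
```

Here `obj_or_date_cols` and `float_cols` are plain Python lists of column names, so they can be passed to any function that expects a list of columns.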