python pandas dataframe categories numeric

Is there a Python function for finding the numeric and categorical columns?


What is an efficient way to split the columns of a pandas DataFrame into categorical columns and numeric columns in Python?

So far I'm using the function below to find the categorical and numeric columns.

def returnCatNumList(df):
    
    object_cols = list(df.select_dtypes(exclude=['int', 'float', 'int64', 'float64', 
                                                 'int32', 'float32', 'int16', 'float16']).columns)
    numeric_cols = list(df.select_dtypes(include=['int', 'float', 'int64', 'float64', 
                                                  'int32', 'float32', 'int16', 'float16']).columns)

    return object_cols, numeric_cols

I'm looking for a better, more efficient approach. Any suggestions or references would be highly appreciated.


Solution

  • You can simplify your function by using np.number instead of a list of numeric dtypes:

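    import numpy as np

    def returnCatNumList(df):
        # Every column whose dtype is not numeric (strings, booleans, datetimes, ...)
        object_cols = list(df.select_dtypes(exclude=np.number).columns)
        # np.number matches all integer and float dtypes
        numeric_cols = list(df.select_dtypes(include=np.number).columns)

        return object_cols, numeric_cols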
    

    Another idea is to get numeric_cols with Index.difference:

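    import numpy as np

    def returnCatNumList(df):
        # Every column whose dtype is not numeric
        object_cols = list(df.select_dtypes(exclude=np.number).columns)
        # The remaining columns are numeric; sort=False keeps the original column order
        numeric_cols = list(df.columns.difference(object_cols, sort=False))

        return object_cols, numeric_cols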
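
To see what the np.number approach actually returns, here is a minimal sketch on a made-up DataFrame (the column names and values are hypothetical, not from the question). It illustrates two points: np.number also catches widths that a hand-written dtype list can miss, such as uint8, and exclude=np.number returns every non-numeric column, so booleans and datetimes land there too. The last selection assumes you do not want datetimes treated as categorical; whether that fits depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, made up for illustration.
df = pd.DataFrame(
    {
        "age": np.array([25, 32, 47], dtype="uint8"),
        "income": [40000.0, 52000.5, 61000.0],
        "city": ["Oslo", "Lima", "Pune"],
        "active": [True, False, True],
        "joined": pd.to_datetime(["2020-01-01", "2021-06-15", "2022-03-30"]),
    }
)

# np.number covers every integer and float width, including unsigned ones.
numeric_cols = df.select_dtypes(include=np.number).columns.tolist()

# exclude=np.number returns everything else, not only text-like columns.
non_numeric_cols = df.select_dtypes(exclude=np.number).columns.tolist()

# Assumption: datetimes should not count as categorical, so exclude them as well.
categorical_cols = df.select_dtypes(exclude=["number", "datetime"]).columns.tolist()

print(numeric_cols)       # ['age', 'income']
print(non_numeric_cols)   # ['city', 'active', 'joined']
print(categorical_cols)   # ['city', 'active']
```

If boolean flags should be handled as numeric in your case, one option is to cast them first (for example with astype(int)) before splitting.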