
How to reorder columns on a subclassed pandas DataFrame


I want to reorder the columns of a subclassed pandas DataFrame.

I understand from this question that there may be better approaches than subclassing a DataFrame, but I would still like to know how to do it.

Without subclassing, I would do it the usual way:

import pandas as pd

data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
df = pd.DataFrame(data)

df = df[['Symbol', 'Name', 'Description']]

With subclassing, the same approach inside __init__ doesn't reorder the columns:

import pandas as pd

class SubDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self = self._reorder_columns()
    
    def _reorder_columns(self):
        first_columns = ['Symbol', 'Name', 'Description']
        return self[first_columns + [c for c in self.columns if c not in first_columns]]
    
data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
df = SubDataFrame(data)

I believe my mistake is reassigning self, which has no effect.

How can I achieve column reordering on the subclassed dataframe?
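
The suspicion about self can be checked without pandas. The sketch below uses a hypothetical Box class (not part of the question's code) to show that rebinding self inside a method only changes a local name, while mutating the object that self refers to is visible to the caller:

```python
class Box:
    """Hypothetical container used only to illustrate name rebinding."""

    def __init__(self, items):
        self.items = list(items)

    def rebind(self):
        # `self` is an ordinary local name here; assigning to it creates a
        # new Box that is discarded when the method returns.
        self = Box(sorted(self.items))

    def mutate(self):
        # Changing the object's own state is visible to the caller.
        self.items.sort()


box = Box([3, 1, 2])

box.rebind()
after_rebind = list(box.items)   # unchanged: [3, 1, 2]

box.mutate()
after_mutate = list(box.items)   # sorted in place: [1, 2, 3]

print(after_rebind, after_mutate)
```

The same applies to __init__ in the subclass: the reordered frame is bound to the local name self and then thrown away.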


Solution

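  • You are right that reassigning self has no effect: it only rebinds the local name inside __init__ and leaves the object being constructed untouched. Many pandas methods that accept an inplace parameter update the object through the private method _update_inplace, and you can do the same. Because the method is private, it may change or disappear in a future pandas release, so check it when you upgrade: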

    import pandas as pd
    
    class SubDataFrame(pd.DataFrame):
        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)
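            # Swap this object's internal data for the reordered frame's data
            # (private pandas API, may change between releases).
            self._update_inplace(self._reorder_columns())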
        
        def _reorder_columns(self):
            first_columns = ['Symbol', 'Name', 'Description']
            return self[first_columns + [c for c in self.columns if c not in first_columns]]
        
    data = {'Description':['mydesc'], 'Name':['myname'], 'Symbol':['mysymbol']}
    df = SubDataFrame(data)
    print(df)
    

    Output:

         Symbol    Name Description
    0  mysymbol  myname      mydesc
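
If relying on a private method is a concern, one possible alternative is to reorder the data before it reaches the parent constructor. This is a sketch under stated assumptions rather than an official pandas recipe: the module-level FIRST_COLUMNS constant is a name introduced here for illustration, and the class deliberately does not override _constructor, so pandas never calls this __init__ internally. As in the original code, a missing 'Symbol', 'Name' or 'Description' column raises a KeyError.

```python
import pandas as pd

# Columns that should come first (taken from the question); the constant
# name itself is an illustrative assumption.
FIRST_COLUMNS = ['Symbol', 'Name', 'Description']


class SubDataFrame(pd.DataFrame):
    def __init__(self, *args, **kwargs):
        # Build a plain DataFrame first so its columns can be inspected.
        plain = pd.DataFrame(*args, **kwargs)
        rest = [c for c in plain.columns if c not in FIRST_COLUMNS]
        # Hand the already reordered frame to the parent constructor,
        # so no private API is needed.
        super().__init__(plain[FIRST_COLUMNS + rest])


data = {'Description': ['mydesc'], 'Name': ['myname'], 'Symbol': ['mysymbol']}
df = SubDataFrame(data)
print(list(df.columns))
```

The trade-off is that the input is turned into a DataFrame before the parent constructor runs, which only uses public API.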