Tags: python, python-3.x, abstract-class, abc

TypeError: __init_subclass__() takes no keyword arguments when subclassing an abstract class


I implemented the following design using an abstract class and a subclass:

from abc import ABC, abstractmethod

import pandas as pd

class Pipeline(ABC):  

    @abstractmethod
    def read_data(self):
        pass
    
    def __init__(self, **kwargs):        
        self.raw_data = self.read_data()        
        self.process_data = self.raw_data[self.used_cols]

   
class case1(Pipeline):
    def read_data(self):
        return pd.read_csv("file location") # just hard coding for the file location
       
    @property
    def used_cols(self):
        return ['col_1', 'col_2','col_3','col_4']

I can instantiate case1 as follows. Doing so reads the CSV file into a pandas DataFrame.

data = case1()

This design returns four hard-coded columns ('col_1', 'col_2', 'col_3' and 'col_4') and works fine. Now I would like to control which columns are returned by modifying the subclass, specifically the used_cols property. I modified class case1 as follows, but it raises an error.

class case1(Pipeline):
    def read_data(self):
        return pd.read_csv("file location") # just hard coding for the file location

   
    @property
    def used_cols(self, selected_cols):
        return selected_cols

It was called as follows:

selected_cols = ['col_2','col_3']
data = case1(selected_cols)

This modification is not right: it fails with an error such as TypeError: __init_subclass__() takes no keyword arguments. So my question is how to modify the subclass to get the desired control.


Solution

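    I think the problem comes from a misunderstanding of what properties are for.

    If you define used_cols as a property, you access it as obj.used_cols, not obj.used_cols(). The getter receives only self, so it cannot take an extra argument such as selected_cols, and once the property exists it is not easy to call the underlying function directly. To make the columns configurable, give the property a setter and assign the value in __init__.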
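
To see the difference in isolation, here is a minimal sketch that does not need pandas. The class names Demo and Broken are hypothetical and exist only for this illustration; Broken repeats the getter signature from the question.

```python
class Demo:  # hypothetical class, for illustration only
    def __init__(self, cols):
        self._used_cols = list(cols)

    @property
    def used_cols(self):
        # A property getter receives only `self`.
        return self._used_cols

    @used_cols.setter
    def used_cols(self, cols):
        # Plain assignment (obj.used_cols = ...) is routed here.
        self._used_cols = list(cols)


class Broken:  # hypothetical class mirroring the getter from the question
    @property
    def used_cols(self, selected_cols):
        # Attribute access never supplies `selected_cols`.
        return selected_cols


demo = Demo(["col_1", "col_2"])
print(demo.used_cols)                # ['col_1', 'col_2'] (no parentheses)
demo.used_cols = ["col_2", "col_3"]  # goes through the setter
print(demo.used_cols)                # ['col_2', 'col_3']

try:
    Broken().used_cols               # the getter is called with `self` only
except TypeError as exc:
    print("TypeError:", exc)
```

Reading Broken().used_cols fails because the getter is invoked without the extra argument, which is why the value has to be passed in through a setter or through __init__ instead.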

    csv file (file_location.csv):

    col_0,col_1,col_2,col_3
    1,1,1,2
    2,3,3,4
    3,3,3,6
    

    code:

    from abc import ABC, abstractmethod
    import pandas as pd
    class Pipeline(ABC):  
    
        @abstractmethod
        def read_data(self):
            pass
        
        def __init__(self, **kwargs):
            self.raw_data = self.read_data()
            # This assignment goes through the used_cols setter defined in the subclass.
            self.used_cols = kwargs["selected_cols"]
            self.process_data = self.raw_data[self.used_cols]
    
    class case1(Pipeline):
        def read_data(self):
            return pd.read_csv("file_location.csv") # just hard coding for the file location
    
        @property
        def used_cols(self):
            return self._used_cols
    
        @used_cols.setter
        def used_cols(self,selected_cols):
            self._used_cols = selected_cols
    
    selected_cols = ['col_2','col_3']
    data = case1(selected_cols = selected_cols)
    print(data.process_data)
    

    result:

       col_2  col_3
    0      1      2
    1      3      4
    2      3      6
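
Note that the call also changed from case1(selected_cols) to case1(selected_cols = selected_cols). Because __init__ is declared with **kwargs only, it accepts keyword arguments but no positional ones. The sketch below shows this with the same Pipeline shape but without pandas: InMemoryCase is a hypothetical stand-in for case1, and it returns a plain dict of lists instead of a DataFrame, so the column selection is written as a dict comprehension.

```python
from abc import ABC, abstractmethod


class Pipeline(ABC):
    @abstractmethod
    def read_data(self):
        ...

    def __init__(self, **kwargs):
        self.raw_data = self.read_data()
        self.used_cols = kwargs["selected_cols"]
        # Keep only the requested columns (dict stand-in for DataFrame indexing).
        self.process_data = {col: self.raw_data[col] for col in self.used_cols}


class InMemoryCase(Pipeline):  # hypothetical stand-in for case1
    def read_data(self):
        # Same values as the CSV file above, hard-coded for the example.
        return {
            "col_0": [1, 2, 3],
            "col_1": [1, 3, 3],
            "col_2": [1, 3, 3],
            "col_3": [2, 4, 6],
        }


data = InMemoryCase(selected_cols=["col_2", "col_3"])
print(data.process_data)  # {'col_2': [1, 3, 3], 'col_3': [2, 4, 6]}

try:
    InMemoryCase(["col_2", "col_3"])  # positional argument, as in the question
except TypeError as exc:
    print("TypeError:", exc)
```

If you would rather allow the positional call, one option is to declare the parameter explicitly, for example def __init__(self, selected_cols), instead of reading it out of kwargs.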