python python-3.x oop python-importlib

How to find and instantiate the only class defined in a Python module dynamically


Suppose in "./data_writers/excel_data_writer.py", I have:

from generic_data_writer import GenericDataWriter

class ExcelDataWriter(GenericDataWriter):
   def __init__(self, config):
       super().__init__(config)
       self.sheet_name = config.get('sheetname')

   def write_data(self, pandas_dataframe):
       pandas_dataframe.to_excel(
           self.get_output_file_path_and_name(), # implemented in GenericDataWriter
           sheet_name=self.sheet_name,
           index=self.index)

In "./data_writers/csv_data_writer.py", I have:

from generic_data_writer import GenericDataWriter

class CSVDataWriter(GenericDataWriter):
   def __init__(self, config):
       super().__init__(config)
       self.delimiter = config.get('delimiter')
       self.encoding = config.get('encoding')

   def write_data(self, pandas_dataframe):
       pandas_dataframe.to_csv(
           self.get_output_file_path_and_name(), # implemented in GenericDataWriter
           sep=self.delimiter,
           encoding=self.encoding,
           index=self.index)

In "./data_writers/generic_data_writer.py", I have:

import os

class GenericDataWriter:
   def __init__(self, config):
       self.output_folder = config.get('output_folder')
       self.output_file_name = config.get('output_file')
       self.output_file_path_and_name = os.path.join(self.output_folder, self.output_file_name)
       # Whether to include the index column from the pandas DataFrame in the output file
       self.index = config.get('include_index')

   def get_output_file_path_and_name(self):
       # Called by the write_data() methods of the subclasses
       return self.output_file_path_and_name

Suppose I have a JSON config file that has a key-value pair like this:

{
"__comment__": "Here, the user can provide the path and Python file name of the custom data writer module she wants to use.",
"custom_data_writer_module": "./data_writers/excel_data_writer.py",

"there_are_more_key_value_pairs_in_this_JSON_config_file": "for other input parameters"
}

In "main.py", I want to import the data writer module based on the custom_data_writer_module provided in the JSON config file above. So I wrote this:

import os
import importlib

def main():
    # Do other things to read and process data

    data_writer_class_file = config.get('custom_data_writer_module')
    data_writer_module = importlib.import_module\
            (os.path.splitext(os.path.split(data_writer_class_file)[1])[0])

    dw = data_writer_module.what_should_this_be?   # <=== Here, what should I do to instantiate the right specific data writer (Excel or CSV) class instance?
    for df in dataframes_to_write_to_output_file:
        dw.write_data(df)

if __name__ == "__main__":
    main()

As asked in the code above, I want to know whether there is a way to retrieve and instantiate the class defined in a Python module, assuming that ONLY ONE class is defined in the module. Alternatively, if there is a better way to refactor my code (using some sort of pattern) without changing the structure of the JSON config file described above, I'd like to learn from the Python experts on Stack Overflow. Thank you in advance for your suggestions!


Solution

  • You can do this easily with vars:

    cls1, = [v for v in vars(data_writer_module).values()
             if isinstance(v, type)]
    dw = cls1(config)
    

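    The trailing comma unpacks a one-element list, so the assignment raises ValueError unless exactly one class is found. Note that vars() also lists every name the module imported: if the module does anything like from collections import deque (or even foo=str), those classes are picked up too. The writer modules in the question import GenericDataWriter, so for them you need to filter on v.__module__ == data_writer_module.__name__.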
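
The v.__module__ filter mentioned above could look roughly like the sketch below. The helper name find_only_class, the in-memory module demo_writer, and its class DemoWriter are all hypothetical names made up for this illustration; they stand in for a real writer module such as excel_data_writer.py.

```python
import inspect
import types


def find_only_class(module):
    """Return the one class defined in `module` itself (imported classes are ignored)."""
    own_classes = [
        obj
        for obj in vars(module).values()
        if inspect.isclass(obj) and obj.__module__ == module.__name__
    ]
    # One-element unpacking: raises ValueError unless exactly one class matched.
    (only_class,) = own_classes
    return only_class


# Hypothetical stand-in for a writer module: it imports one class, aliases a
# builtin, and defines exactly one class of its own.
demo_module = types.ModuleType("demo_writer")
exec(
    "from collections import deque\n"
    "foo = str\n"
    "class DemoWriter:\n"
    "    def __init__(self, config):\n"
    "        self.config = config\n",
    demo_module.__dict__,
)

writer_class = find_only_class(demo_module)
dw = writer_class({"output_file": "out.csv"})
print(writer_class.__name__)
```

In main() from the question, the equivalent call would be dw = find_only_class(data_writer_module)(config). deque and str are skipped because their __module__ is collections and builtins respectively, not the name of the module being inspected.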