Suppose in "./data_writers/excel_data_writer.py", I have:
from generic_data_writer import GenericDataWriter
class ExcelDataWriter(GenericDataWriter):
def __init__(self, config):
super().__init__(config)
self.sheet_name = config.get('sheetname')
def write_data(self, pandas_dataframe):
pandas_dataframe.to_excel(
self.get_output_file_path_and_name(), # implemented in GenericDataWriter
sheet_name=self.sheet_name,
index=self.index)
In "./data_writers/csv_data_writer.py", I have:
from generic_data_writer import GenericDataWriter
class CSVDataWriter(GenericDataWriter):
def __init__(self, config):
super().__init__(config)
self.delimiter = config.get('delimiter')
self.encoding = config.get('encoding')
def write_data(self, pandas_dataframe):
pandas_dataframe.to_csv(
self.get_output_file_path_and_name(), # implemented in GenericDataWriter
sep=self.delimiter,
encoding=self.encoding,
index=self.index)
In "./datawriters/generic_data_writer.py", I have:
import os
class GenericDataWriter:
def __init__(self, config):
self.output_folder = config.get('output_folder')
self.output_file_name = config.get('output_file')
self.output_file_path_and_name = os.path.join(self.output_folder, self.output_file_name)
self.index = config.get('include_index') # whether to include index column from Pandas' dataframe in the output file
Suppose I have a JSON config file that has a key-value pair like this:
{
"__comment__": "Here, user can provide the path and python file name of the custom data writer module she wants to use."
"custom_data_writer_module": "./data_writers/excel_data_writer.py"
"there_are_more_key_value_pairs_in_this_JSON_config_file": "for other input parameters"
}
In "main.py", I want to import the data writer module based on the custom_data_writer_module
provided in the JSON config file above. So I wrote this:
import os
import importlib
def main():
# Do other things to read and process data
data_writer_class_file = config.get('custom_data_writer_module')
data_writer_module = importlib.import_module\
(os.path.splitext(os.path.split(data_writer_class_file)[1])[0])
dw = data_writer_module.what_should_this_be? # <=== Here, what should I do to instantiate the right specific data writer (Excel or CSV) class instance?
for df in dataframes_to_write_to_output_file:
dw.write_data(df)
if __name__ == "__main__":
main()
As I asked in the code above, I want to know if there's a way to retrieve and instantiate the class defined in a Python module assuming that there is ONLY ONE class defined in the module. Or if there is a better way to refactor my code (using some sort of pattern) without changing the structure of JSON config file described above, I'd like to learn from Python experts on StackOverflow. Thank you in advance for your suggestions!
You can do this easily with vars
:
cls1,=[v for k,v in vars(data_writer_module).items()
if isinstance(v,type)]
dw=cls1(config)
The comma enforces that exactly one class is found. If the module is allowed to do anything like from collections import deque
(or even foo=str
), you might need to filter based on v.__module__
.