I have some functions in python that share the same structure:
A couple examples:
def generate_report_1(eval_path, output_path):
df = pd.read_csv(eval_path)
missclassified_samples = df[df["miss"] == True]
missclassified_samples.to_csv(output_path)
def generate_report_2(eval_path, output_path):
df = pd.read_csv(eval_path)
dict_df = df.to_dict()
final_results = {}
for name, metric in dict_df.items():
# ... do some processing
pd.DataFrame(final_results).to_csv(output_path)
In ruby, we can use blocks to pause and return to the execution of a function using yield
. I would like to know a good practice to accomplish this in python, since this is a case of undesired repeated code.
Thanks.
No special construct is needed, just plain Python functions.
The only trick is passing a processing function as a parameter to your report function, thusly:
def generate_report(eval_path, processfunc, output_path):
df = pd.read_csv(eval_path)
result = processfunc(df)
result.to_csv(output_path)
def process_1(df):
return df[df["miss"] == True]
def process_2(df):
dict_df = df.to_dict()
final_results = {}
for name, metric in dict_df.items():
# ... do some processing
return pd.DataFrame(final_results)
# and then:
# generate_report(my_eval_path, process_1, my_output_path)