I have trained an object detection network. I have a ground truth annotation CSV in the format: filename, height, width, xmin, ymin, xmax, ymax, class. When I score an input image, I get the output CSV in this format: filename, xmin, ymin, xmax, ymax, class, confidence.
I need to merge these data frames based on the IoU. So basically the output dataframe should contain:
1. the prediction together with the ground truth and the IoU value, if a match is found
2. the prediction only, if an IoU match was not found
3. the ground truth only, if an IoU match was not found

This will be an intermediate step for calculating precision and recall values.
I am just adding a very small sample dataframe as an example here, which tests for these conditions.
sample prediction:
        filename  xmin  ymin  xmax  ymax class  confidence
0  dummyfile.jpg  4060  2060  4214  2242    DR    0.999985
1  dummyfile.jpg  3599  1282  3732  1456    DR    0.999900
sample ground truth:
        filename  width  height class  xmin  xmax  ymin  ymax
0  dummyfile.jpg   7201    5400    DR  3598  3728  1279  1451
1  dummyfile.jpg   7201    5400    DR  3916  4038  2186  2274
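For anyone who wants to reproduce this, the two sample frames above can be rebuilt roughly like this. It is only a minimal sketch: the values are copied from the tables, and the variable names `prediction` and `ground_truth` are just the ones I use further down.

```python
import pandas as pd

# Sample predictions, values copied from the table above
prediction = pd.DataFrame({
    "filename": ["dummyfile.jpg", "dummyfile.jpg"],
    "xmin": [4060, 3599],
    "ymin": [2060, 1282],
    "xmax": [4214, 3732],
    "ymax": [2242, 1456],
    "class": ["DR", "DR"],
    "confidence": [0.999985, 0.999900],
})

# Sample ground truth, same column order as the table above
ground_truth = pd.DataFrame({
    "filename": ["dummyfile.jpg", "dummyfile.jpg"],
    "width": [7201, 7201],
    "height": [5400, 5400],
    "class": ["DR", "DR"],
    "xmin": [3598, 3916],
    "xmax": [3728, 4038],
    "ymin": [1279, 2186],
    "ymax": [1451, 2274],
})
```

Here prediction 1 overlaps ground truth 0, while prediction 0 and ground truth 1 do not overlap anything, so all three conditions are covered.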
Expected final output:
I am adding my current approach as an answer. Is there any better way to achieve this? The data can be pretty large.
This is one approach I found,
import pandas as pd
import numpy as np
import os
def IOU(df):
    '''Function to calculate the IoU between the ground truth box and the predicted box in one row of the dataframe.'''
    # determine the minimum and maximum coordinates of the intersection rectangle
    xmin_inter = max(df.xmin, df.xmin_pred)
    ymin_inter = max(df.ymin, df.ymin_pred)
    xmax_inter = min(df.xmax, df.xmax_pred)
    ymax_inter = min(df.ymax, df.ymax_pred)
    # calculate the area of the intersection rectangle (0 if the boxes do not overlap)
    inter_area = max(0, xmax_inter - xmin_inter + 1) * max(0, ymax_inter - ymin_inter + 1)
    # calculate the areas of the actual and predicted boxes
    actual_area = (df.xmax - df.xmin + 1) * (df.ymax - df.ymin + 1)
    pred_area = (df.xmax_pred - df.xmin_pred + 1) * (df.ymax_pred - df.ymin_pred + 1)
    # compute intersection over union
    iou = inter_area / float(actual_area + pred_area - inter_area)
    return iou

ground_truth = pd.read_csv("sample_gt.csv")
prediction = pd.read_csv("sample_preds.csv")

# rename the prediction df columns with a _pred suffix (all except filename)
pred_cols = prediction.columns.tolist()
pred_cols.remove('filename')
new_cols = [col + '_pred' for col in pred_cols]
new_col_dict = dict(zip(pred_cols, new_cols))
prediction.rename(columns=new_col_dict, inplace=True)

# outer join the prediction and ground truth dfs: every prediction is paired
# with every ground truth box of the same file
newdf = pd.merge(prediction, ground_truth, how='outer', on='filename')

# apply the IoU calculation to each pair
newdf['iou'] = newdf.apply(IOU, axis=1)

# keep only the pairs that overlap (iou > 0)
newdf = newdf[newdf['iou'] > 0]

# left join: predictions without a match are kept, with NaN in the ground truth columns
final_df = pd.merge(prediction, newdf, on=prediction.columns.tolist(), how='left')
# outer join: ground truth boxes without a match are kept, with NaN in the prediction columns
final_df = pd.merge(final_df, ground_truth, on=ground_truth.columns.tolist(), how='outer')
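
Since the data can be pretty large, the row-wise apply is probably the slowest part of the above. One possible alternative is to compute the IoU for all rows at once with column operations. This is only a sketch: `iou_vectorized` is a name I made up, and it assumes the merged frame has the same columns as above (xmin, ymin, xmax, ymax and their _pred counterparts) and the same +1 pixel convention.

```python
import numpy as np
import pandas as pd


def iou_vectorized(df: pd.DataFrame) -> pd.Series:
    """IoU for every row at once (hypothetical helper, same +1 convention as IOU)."""
    # coordinates of the intersection rectangle, computed column-wise
    xmin_inter = np.maximum(df["xmin"], df["xmin_pred"])
    ymin_inter = np.maximum(df["ymin"], df["ymin_pred"])
    xmax_inter = np.minimum(df["xmax"], df["xmax_pred"])
    ymax_inter = np.minimum(df["ymax"], df["ymax_pred"])

    # negative widths/heights mean no overlap, so clip them to zero
    inter_w = (xmax_inter - xmin_inter + 1).clip(lower=0)
    inter_h = (ymax_inter - ymin_inter + 1).clip(lower=0)
    inter_area = inter_w * inter_h

    # areas of the actual and predicted boxes
    actual_area = (df["xmax"] - df["xmin"] + 1) * (df["ymax"] - df["ymin"] + 1)
    pred_area = (df["xmax_pred"] - df["xmin_pred"] + 1) * (df["ymax_pred"] - df["ymin_pred"] + 1)

    return inter_area / (actual_area + pred_area - inter_area)


# The two pairs from the sample data: one overlapping, one disjoint
pairs = pd.DataFrame({
    "xmin": [3598, 3916], "ymin": [1279, 2186],
    "xmax": [3728, 4038], "ymax": [1451, 2274],
    "xmin_pred": [3599, 4060], "ymin_pred": [1282, 2060],
    "xmax_pred": [3732, 4214], "ymax_pred": [1456, 2242],
})
pairs["iou"] = iou_vectorized(pairs)  # first pair is about 0.92, second is 0
```

In the code above this would replace the apply line, i.e. newdf['iou'] = iou_vectorized(newdf). Rows with missing coordinates come out as NaN and are dropped by the iou > 0 filter. I have not benchmarked it on large data.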